Calculate field based on substrings within a string contained in another field

NeilWrobel · ‎05-02-2014

Hi

I am in need of a python code block that I can place into the calculate field tool in model builder.

I have a field called REF_ID.

REF_ID contains a string which is essentially just a list of semi-colon separated codes eg 59333; 10203; 61297; 12345 etc.

From REF_ID I want another field called DATA_SOURCE to be calculated.

DATA_SOURCE needs to be populated with text descriptions of particular codes, not all of the codes, within the REF_ID field.

So, I think I need to set up a library of the specific codes and their corresponding descriptions.

59333 Gravity
59334 Magnetics
61297 Radar
61299 Landsat
62227 Fieldwork

The python code then needs to search the string in the REF_ID field and look for the occurrence of any of the codes listed above.
Not all will be present and of those that are present, they should only occur once.

The DATA_SOURCE field should then be populated with a semi-colon separated list of the corresponding descriptions.
eg. Gravity; Radar; etc.

Does anyone have a good solution for this?

Much obliged.

Neil Wrobel

JakeSkinner · ‎05-02-2014

Hi Neil,

You could use the update cursor to do this. Ex:

import arcpy, sys
from arcpy import env
env.workspace = r"C:\temp\python\test.gdb"

fc = 'gages'
fields = ['REF_ID', 'DATA_SOURCE']

with arcpy.da.UpdateCursor(fc, ["REF_ID", "DATA_SOURCE"]) as cursor:
    for row in cursor:
        if row[0] == 59333:
            row[1] = 'Gravity'
        elif row[0] == 59334:
            row[1] = 'Magnetics'
        elif row[0] == 61297:
            row[1] = 'Radar'
        ....
            .....
        cursor.updateRow(row)

markdenil · ‎05-02-2014

I would recomend an update cursor as well,
but to use a dictionary instead of all the elifs.

Too, you mention that the field value for 'REF_ID' is a semi-colon delimited list
and you want 'DATA_SOURCE' to be a similar list

You could incorporate something like this quick sketch:

aDict = {'59333': "Gravity",
        '59334': "Magnetics",
        '61297': "Radar",
        '61299': "Landsat",
        '62227': "Fieldwork"
        }

idVal = "59333;59334;61297;9999;61299;62227"
idList = idVal.split(';')
outStr = ''
for val in idList:
    if val in aDict.keys():
        word = aDict[val]
    else:
        word = "Unknown"
    if len(outStr) > 0:
        word = "; %s" % (word)
    outStr = "%s%s" % (outStr, word)

    print '\n', outStr

idVal is the 'REF_ID' value you obtain from the cursor for that row
The outStr (the final one, this code reports it for every itteration....)
is what you use to update 'DATA_SOURCE'

markdenil · ‎05-02-2014

I would recomend an update cursor as well,
but to use a dictionary instead of all the elifs.

Too, you mention that the field value for 'REF_ID' is a semi-colon delimited list
and you want 'DATA_SOURCE' to be a similar list

You could incorporate something like this quick sketch:

aDict = {'59333': "Gravity",
        '59334': "Magnetics",
        '61297': "Radar",
        '61299': "Landsat",
        '62227': "Fieldwork"
        }

idVal = "59333;59334;61297;9999;61299;62227"
idList = idVal.split(';')
outStr = ''
for val in idList:
    if val in aDict.keys():
        word = aDict[val]
    else:
        word = "Unknown"
    if len(outStr) > 0:
        word = ";%s" % (word)
    outStr = "%s%s" % (outStr, word)

    print '\n', outStr

idVal is the 'REF_ID' value you obtain from the cursor for that row
The outStr (the final one, this code reports it for every itteration....)
is what you use to update 'DATA_SOURCE'