Split text field into two new fields?

DavidMedeiros · ‎11-24-2010

Hi,

I'm a GIS MA student looking for some help on creating a Python script in ArcGIS for a class project. I have a text table with several fields (Tenant Name; Address; Sq Ft; Destination). The destination column contains information that I want to separate into two new fields - one for the first word in the original column and one for the second word in the original column. I then need to write this new info back into the original table.

What ArcPy commands should I be looking at to accomplish this? Can anyone point me to a similar finished script?

Thanks,

David

JasonScheirer · ‎11-24-2010

You'll want to use AddField to add the new field for splitting, then use two CalculateValue calls -- one to populate the new field and another to truncate down the old field into the first word.

ChrisSnyder · ‎11-24-2010

If you want:
ORIG_FIELD = "My House"
SPLIT1 = "My"
SPLIT2 = "House"

The Python syntax for the Calculate Field tool would be:
SPLIT1 = !ORIG_FIELD!.split(" ")[0] #0 being the 1st word
SPLIT1 = !ORIG_FIELD!.split(" ")[-1] #-1 being the last word

JasonScheirer · ‎11-24-2010

If you have records with multiple spaces this will fall down. There's an optional second argument to string.split to limit the number of splits performed:

SPLIT1 = !ORIG_FIELD!.split(" ", 1)[0] #0 being the 1st word
SPLIT2 = !ORIG_FIELD!.split(" ", 1)[-1] #-1 being the last word

And then you may only have one token, in which case you can use a conditional variable to rightly leave the second field as blank (otherwise, [0] and [-1] will be pointing to the same index in the list, meaning both fields are populated with the same value). So a more refined version would be:

SPLIT1 = !ORIG_FIELD!.split(" ", 1)[0]
SPLIT2 = !ORIG_FIELD!.split(" ", 1)[1] if " " in !ORIG_FIELD! else ""

DavidMedeiros · ‎11-24-2010

I should probably have waited to post this question until after the holiday so I could be trying this out in the lab, but this looks like it will be a great help, thank you!

All records in that field have single space, but some do only have one entry instead of two.

DavidMedeiros · ‎12-01-2010

Alright here's where I've gotten to on this:

[INDENT]# split table script

#add field step (keep)
import arcpy

from arcpy import env
inputTable = arcpy.GetParameterAsText(0)
inputField = arcpy.GetParameterAsText(1) #"Dest" column, set in toolbox properties

#list fields (stops duplicate add fields)
theFieldList = arcpy.ListFields (inputTable, "Type")
arcpy.AddMessage ("the count" + str(len(theFieldList)))

#following exectues the add field (keep)
if len(theFieldList) == 0:
arcpy.AddField_management(inputTable, "Type", "TEXT")
else:
arcpy.AddMessage ("Field Exists")

#split field loop
theItemList = arcpy.SearchCursor (inputTable) #gets all vlaues for table

for thisItem in theItemList: #loops through vlaues
thisValue = thisItem.getValue (inputField)
split = !thisValue!.split(" ",1)[-1] if " " in !thisValue! else ""
setValue...[/INDENT]

The add field portion works, I'm not sure if I'm going in the right direction with the loop above and I don't know what the syntax is to set the value back into my new field. Does this look correct (aside from being incomplete)?

ChrisSnyder · ‎12-01-2010

If you want to write new data to the table you would need to use an update cursor (not a search cursor). Also, you have to use cursor syntax, not field calculator syntax.

DavidMedeiros · ‎12-07-2010

If you want to write new data to the table you would need to use an update cursor (not a search cursor). Also, you have to use cursor syntax, not field calculator syntax.

So I would drop the calculate field "split" command? Or use it together with the update cursor?

What I have so far is a new field, into which I want to place the second value from another filed (forget about truncating first field, not needed).

DavidMedeiros · ‎12-07-2010

Ok, I'm successfully updating the row - bigger achievement for me than you might think! 😉 But, I'm only getting the last character instead of the second word from the original row. Here's my script:

# split table script

#add field step (keep)
import arcpy

from arcpy import env
inputTable = arcpy.GetParameterAsText(0)
arcpy.AddMessage ("input " + inputTable)

#list fields (stops duplicate add fields)
theFieldList = arcpy.ListFields (inputTable, "Type")
arcpy.AddMessage ("the count" + str(len(theFieldList)))

#following exectues the add field (keep)
if len(theFieldList) == 0:
    arcpy.AddField_management(inputTable, "Type", "TEXT")

else:
    arcpy.AddMessage ("Field Exists")

#split field loop
rows = arcpy.UpdateCursor (inputTable) #gets all vlaues for table

for row in rows: #loops through vlaues
    row.Type = row.Dest[-1]
    rows.updateRow(row)

del row
del rows

I know I need to change the modifier on the row.Dest[-1] syntax but I thought the [-1] gave me the second word?

ChrisSnyder · ‎12-07-2010

No, -1 gives you the last value (which may coincidentally be the 2nd item if there are only two items). For example:

list = ["a","b","c","d"]

list[0] = "a"
list[1] = "b"
list[2] = "c"
list[3] = "d"
list[4] = ERROR! Index out of range...

list[-1] = "d"
list[-2] = "c"
list[-3] = "b"
list[-4] = "a"

list[0:2] = ["a","b"]