Ok, I have a feature class (260k records) and 3 tables. I need to get specific information (about 20 fields) from each into a new feature class. I have several different ways to do this and I'm curious if anyone has done any performance testing and knows which of my options would be "best".
Option 1
I could create several joins and append the data into the new feature class, with some additional field calculations for cleanup.
Option 2
Create a Python script that would loop through each record, use a search cursor to pull the matching data from the other tables, and then use an insert cursor to write the rows into the new feature class (a rough sketch of what I mean is below).
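A rough, untested sketch of what I mean by Option 2 (paths, table names, and field names here are just placeholders, not my real data):

#UNTESTED! Option 2 sketch: one lookup query per source record
import arcpy

srcFC = r"C:\temp\test.gdb\sourcefc"       # placeholder paths
lutTbl = r"C:\temp\test.gdb\lookuptable"
newFC = r"C:\temp\test.gdb\newfc"

insRows = arcpy.da.InsertCursor(newFC, ["NAME", "ADDRESS", "CITY"])
for srcRow in arcpy.da.SearchCursor(srcFC, ["NAME"]):
    nameValue = srcRow[0]
    address, city = None, None
    # hit the lookup table once for every source record
    whereClause = "NAME = '{0}'".format(nameValue)
    for lutRow in arcpy.da.SearchCursor(lutTbl, ["ADDRESS", "CITY"], whereClause):
        address, city = lutRow
    insRows.insertRow((nameValue, address, city))
del insRows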
Thank you
Alan
Assuming the join fields are indexed on all of the tables, Option 1 blew away cursors in my testing with the pre-da cursors. I have not tested the da cursors. However, even ArcObjects cursors perform slower than joins, so I doubt Option 2 would be faster even with a da cursor. Hitting tables with repeated queries is much slower than a join, which does just one correlation of the tables, because each new query has to reset the table read, as far as I know. The main reason not to use joins would be if there were a 1:M or M:M relationship to traverse. A rough sketch of the join approach is below.
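An untested sketch of the join-based Option 1 (paths, layer names, and the join field are made up for illustration):

#UNTESTED! Option 1 sketch: join the lookup table to a layer, then copy out the result
import arcpy

srcFC = r"C:\temp\test.gdb\sourcefc"       # placeholder paths
lutTbl = r"C:\temp\test.gdb\lookuptable"
newFC = r"C:\temp\test.gdb\newfc"

# index the join field first so the join performs well
arcpy.AddIndex_management(lutTbl, "NAME", "NAME_IDX")

# keep unqualified field names in the copied output
arcpy.env.qualifiedFieldNames = False

arcpy.MakeFeatureLayer_management(srcFC, "src_lyr")
arcpy.AddJoin_management("src_lyr", "NAME", lutTbl, "NAME", "KEEP_ALL")
arcpy.CopyFeatures_management("src_lyr", newFC)

# cleanup field calculations on the output, e.g. trimming whitespace (assumes no nulls)
arcpy.CalculateField_management(newFC, "ADDRESS", "!ADDRESS!.strip()", "PYTHON")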
I'll go with the append.
Don't!
Going the Python route (reading the join table(s) into a dictionary using a search cursor and then updating the main table via an update cursor) is by far the fastest method. This is true in v10.0 and below, but is especially true in v10.1+ using the data access cursors. In addition to faster processing, this method is far more flexible in that it allows for all sorts of error handling and whatnot through conditional expressions.
For example, say you want to get the fields "ADDRESS" and "CITY" into the main table (key field being "NAME"):

#UNTESTED!
lutTbl = r"C:\temp\test.gdb\lookuptable"
mainTbl = r"C:\temp\test.gdb\maintable"
lutDict = dict([(r[0], (r[1], r[2])) for r in arcpy.da.SearchCursor(lutTbl, ["NAME", "ADDRESS", "CITY"])])
arcpy.AddField_management(mainTbl, "ADDRESS", "TEXT", "", "", "75")
arcpy.AddField_management(mainTbl, "CITY", "TEXT", "", "", "30")
updateRows = arcpy.da.UpdateCursor(mainTbl, ["NAME", "ADDRESS", "CITY"])
for updateRow in updateRows:
    nameValue = updateRow[0]
    if nameValue in lutDict:
        updateRow[1] = lutDict[nameValue][1] #Address
        updateRow[2] = lutDict[nameValue][2] #City
    else:
        print "Could not locate address/city info for " + str(nameValue)
    updateRows.updateRow(updateRow)
del updateRow, updateRows
One fix: instead of

updateRow[1] = lutDict[nameValue][1]

I had to use

row[1] = PADataDict[PKValue][0]

It was almost like once it got into the if statement, index 0 was the first value, not the key.
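That is exactly what is happening: the dictionary value is just the (ADDRESS, CITY) tuple built in the comprehension, so the key (NAME) is not part of the value and the first looked-up field sits at index 0. A minimal illustration in plain Python (made-up sample record):

lutDict = {"MAIN ST PARK": ("123 Main St", "Springfield")}  # key -> (ADDRESS, CITY)
addressValue = lutDict["MAIN ST PARK"][0]  # "123 Main St"  (index 0 = ADDRESS)
cityValue = lutDict["MAIN ST PARK"][1]     # "Springfield"  (index 1 = CITY)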
Anyway - dictionaries are great. I do all my fancy SQL-like stuff in them now... especially now with 64-bit arcpy. Data tables have kazillions of records? No problem, and crazy fast!!! No need for a fancy RDBMS...
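As an untested illustration of that kind of SQL-like work (the table path and the AREA field are placeholders), here is a GROUP BY-style summary done with a dictionary:

#UNTESTED! sum an AREA field per CITY with a dictionary instead of a SQL GROUP BY
import arcpy
from collections import defaultdict

tbl = r"C:\temp\test.gdb\maintable"  # placeholder path
areaByCity = defaultdict(float)
for city, area in arcpy.da.SearchCursor(tbl, ["CITY", "AREA"]):
    if city is not None and area is not None:
        areaByCity[city] += area

for city in sorted(areaByCity):
    print("{0}: {1}".format(city, areaByCity[city]))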