Update existing rows/add new rows in master table from records in second table

MarkHuffman · ‎02-13-2012

I have a table that is a subset of a master table representing the parcel database. This subset is created every week and contains only updated and new records, and my task is to compare that table to the master table and, if existing records have different values, update those fields, and if there are new records, add those records to the master table.

This process has up to now been done in SDE using "sdeimport -o update_else_insert." I want to avoid interacting with SDE, but will if necessary.

My plan is to use SearchCursor to read the subset and the master table, and UpdateCursor and InsertCursor to edit the master table, using the APN as the key field. First, is this the right approach? Second, can someone please provide some starting code or psuedocode to help me? I am new at Python but not at programming. Thank you.

StacyRendall1 · ‎02-16-2012

Well... Learning Python is one thing; learning Arcpy is something else altogether!

For Python; I think any book is good for the basics, then the rest is time and experience. The web is the resource I use most often, but also books on advanced things (such as GUIs) really help you learn the more complex stuff (even if not directly related). Although, the web help often tends to stupid oversimplification that makes it really hard to actually find what you want. It is something you are constantly learning, and whenever I look at a code from even a few months ago I am always surprised at how much I do different now to what I did then!

Arcpy was incredibly frustrating when I first started to use it. It is basically using Python to glue ArcGIS operations together in sequence, but you can't always take advantage of the cool things that Python lets you do, as you are so constrained by what Arc can (not) do. Your biggest friend, again, will be the Internet; googling 'arcpy thing' will usually turn up something (for example: listing fields in tables)...

The goal of my blog is to make it easier for Geographers to learn Python and easier for Programmers to learn to use Arc/Geographic programming interfaces, but I will be too busy over the next few months to make much progress with that. Keep an eye on it in future though!

Cheers,
Stacy

MarkHuffman · ‎02-17-2012

Regarding the use of

 testFields = arcpy.ListFields(masterTable)

instead of

testFields = ['APN_SUFFIX', 'TRA', 'NAME_1', 'NAME_2', 'MAIL_ADDR', 'CTY_STA']

: when I use the former I get the dread ERROR 999999 at the line

subsetDict[row.getValue(keyField)] = [row.getValue(field) for field in testFields]

What am I doing wrong?

Thanks.

Mark

P.S. I ran the script on the entire 262,000 + table for the six fields shown above, and it took 13 minutes in the IDLE GUI. The old method, a korn shell script interacting with SDE, took about 10, for all 100 + fields. (I first tried running it in the ArcCatalog Python window, and it is still running more than 2 hours later. YIKES!)

JakeSkinner · ‎02-17-2012

Hi Mark,

You will need to append the field name to the 'testFields' list. Try:

testFields = []

for fields in arcpy.ListFields(masterTable):
    testFields.append(fields.name)

MarkHuffman · ‎02-17-2012

Hi Mark,

You will need to append the field name to the 'testFields' list. Try:
testFields = []

for fields in arcpy.ListFields(masterTable):
    testFields.append(fields.name)

Hi Jake,

That did it. Thanks a lot! (And thanks for answering so quickly.)

Mark

MarkHuffman · ‎02-17-2012

Well, I'm getting the RuntimeError "Field is not editable" at

 row.setValue(field, subsetDict[key])

I've seen this only since I've changed the code to use the arcpy.ListFields method (see above).

Since the code is now accessing all fields, could it be that the OID (which is indeed not editable) has different values in the two tables that is source of the error? If so, what is the workaround?
Thank you.

Mark

StacyRendall1 · ‎02-19-2012

Mark,

get it to print out the field names (either with print or arcpy.AddMessage() ) just before the line that errors; so the last one to print out before your error occurs will be the field causing the problem. If it is OID field, you can do something like this to remove it from the analysis:

keyFields = arcpy.ListFields(masterTable)
oidIndex = keyFields.index('OBJECTID') # or whatever the offending field is...
del keyFields[oidIndex]

Let me know what it is - I will update the post with the 'final' code to reflect this.

Cheers,
Stacy

MarkHuffman · ‎02-22-2012

Hi Stacy,

I took your advice and it worked (of course).

Here is how I printed the field names, in the InsertCursor block:

for i in range(len(otherFields)):
     field = otherFields
     print fields.name
     row.setValue(field, subsetDict[key])

It printed out only the first field name before it errored out - it was OBJECTID (even though in the table it appears as OID).

So I adapted your sample code to remove that field from analysis and placed it after the # define fields region:

# define fields
keyField = 'APN'
otherFields = []
for fields in arcpy.ListFields(subsetTable):
    otherFields.append(fields.name)
    
oidIndex = otherFields.index('OID')
del otherFields[oidIndex]

Now the script runs to completion. Thank you!

Mark