Update existing rows/add new rows in master table from records in second table

MarkHuffman · ‎02-13-2012

I have a table that is a subset of a master table representing the parcel database. This subset is created every week and contains only updated and new records, and my task is to compare that table to the master table and, if existing records have different values, update those fields, and if there are new records, add those records to the master table.

This process has up to now been done in SDE using "sdeimport -o update_else_insert." I want to avoid interacting with SDE, but will if necessary.

My plan is to use SearchCursor to read the subset and the master table, and UpdateCursor and InsertCursor to edit the master table, using the APN as the key field. First, is this the right approach? Second, can someone please provide some starting code or psuedocode to help me? I am new at Python but not at programming. Thank you.

StacyRendall1 · ‎02-14-2012

Stacy,

I have sent you a private message to arrange to send the data by more secure means. Thanks.

Mark

Thanks Mark, I received your message but for some reason I cannot reply or send you a new Private Message (seems to send, but nothing appears in the the Sent Items folder, and you obviously haven't received anything).

Regards,
Stacy

MathewCoyle · ‎02-15-2012

Hate to barge in on a good conversation, but it looks like you are missing an update in your update cursor.

for row in uC:
 if row.getValue(keyField) in updateDict:
  row.setValue(testField1, updateDict[key][0])
  row.setValue(testField2, updateDict[key][1])
  row.setValue(testField3, updateDict[key][2])
  row.setValue(testField4, updateDict[key][3])
  row.setValue(testField5, updateDict[key][4])
  row.setValue(testField6, updateDict[key][5])
  uC.updateRow(row)

MarkHuffman · ‎02-15-2012

Hate to barge in on a good conversation, but it looks like you are missing an update in your update cursor.

for row in uC:
 if row.getValue(keyField) in updateDict:
  row.setValue(testField1, updateDict[key][0])
  row.setValue(testField2, updateDict[key][1])
  row.setValue(testField3, updateDict[key][2])
  row.setValue(testField4, updateDict[key][3])
  row.setValue(testField5, updateDict[key][4])
  row.setValue(testField6, updateDict[key][5])
  uC.updateRow(row)

Hi Mathew,

Feel free to barge in any time. Thanks for the catch.

I'm still getting the error, which is occurring before uC.UpdateRow.

Mark

JakeSkinner · ‎02-15-2012

I was executing code similar to this. Have you tried declaring the 'key' in the update cursor. Ex:

for row in uC:
        key = row.getValue(keyField)
 if row.getValue(keyField) in updateDict:
  row.setValue(testField1, updateDict[key][0])
  row.setValue(testField2, updateDict[key][1])
  row.setValue(testField3, updateDict[key][2])
  row.setValue(testField4, updateDict[key][3])
  row.setValue(testField5, updateDict[key][4])
  row.setValue(testField6, updateDict[key][5])
  uC.updateRow(row)

MarkHuffman · ‎02-15-2012

I was executing code similar to this. Have you tried declaring the 'key' in the update cursor. Ex:

for row in uC:
        key = row.getValue(keyField)
 if row.getValue(keyField) in updateDict:
  row.setValue(testField1, updateDict[key][0])
  row.setValue(testField2, updateDict[key][1])
  row.setValue(testField3, updateDict[key][2])
  row.setValue(testField4, updateDict[key][3])
  row.setValue(testField5, updateDict[key][4])
  row.setValue(testField6, updateDict[key][5])
  uC.updateRow(row)

Hi Jake,

That was the final piece. My script is working now! Actually our script.

Thank you, Jake, Mathew, and especially Stacy.

Mark

StacyRendall1 · ‎02-15-2012

Hi Mark,

that's great news! Writing is the easy part - it's fixing up all the little mistakes that usually takes the majority of the time, as you have seen!

How does that script compare with your former method; speed, reliability, etc.?

Regards,
Stacy

StacyRendall1 · ‎02-15-2012

Also, if anyone is interested, here is the code to perform this operation for arbitrary tables (fields only defined once at the start):

import arcpy from arcpy import env  # Define workspace env.workspace = "C:/PythonParcelProcesses"  # paths to tables masterTable = 'UpdateInsertTo.dbf' subsetTable = 'UpdateInsertFrom.dbf'  # define fields: keyField = 'APN' testFields = ['APN_SUFFIX', 'TRA', 'NAME_1', 'NAME_2', 'MAIL_ADDR', 'CTY_STA']  # dictionary of the contents of the subset table # APN is key, fields to test are stored in an ordered list subsetDict = {} sC = arcpy.SearchCursor(subsetTable) for row in sC:  subsetDict[row.getValue(keyField)] = [row.getValue(field) for field in testFields] del sC, row  # get list of values that need to be updated updateDict = {} sC = arcpy.SearchCursor(masterTable) for row in sC:  key = row.getValue(keyField)  testList = [row.getValue(field) for field in testFields]   # try to get record in subsetDict to see if it has changed  # if the subset table didn't contain a record for this entry, will trow KeyError (that's fine - do nothing)  try:   # if records have changed add to updateDict and delete from subsetDict   if subsetDict[key] != testList:    updateDict[key] = subsetDict[key]    del subsetDict[key]   # records are the same, delete from subsetDict   else:    del subsetDict[key]  except KeyError:   # this APN only exists in the master table, do nothing...   pass del sC, row  # anything left in subsetDict is new, can use insertCursor iC = arcpy.InsertCursor(masterTable) for key in subsetDict:  row = iC.newRow()  row.setValue(keyField, key)  for i in range(len(testFields)):   field = testFields   row.setValue(field, subsetDict[key])  iC.insertRow(row)  del row del iC  # anything in updateDict needs to be updated, use updateCursor uC = arcpy.UpdateCursor(masterTable) for row in uC:  key = row.getValue(keyField)  if row.getValue(keyField) in updateDict:   for i in range(len(testFields)):    field = testFields    row.setValue(field, updateDict[key])   uC.updateRow(row) del uC, row, updateDict, subsetDict

MarkHuffman · ‎02-16-2012

Also, if anyone is interested, here is the code to perform this operation for arbitrary tables (fields only defined once at the start):

Thanks, Stacy. This is what I was looking for - a way to define the fields once at the beginning and be done with it. I've got over a hundred total.:D

How does that script compare with your former method; speed, reliability, etc.?

I expect it will run faster and be more reliable, as well as be easier to edit and update.

Regards,
Mark

StacyRendall1 · ‎02-16-2012

Hi Mark,

It is not even necessary for you to list all the fields yourself, arcpy.ListFields(table) should do the trick. The purpose of what I had was (started off doing it, then forgot to carry it through) that you could have certain fields that were tested for matching while the other fields were just copied if there was a difference in the test fields, and of course for new entries...

Regards,
Stacy

MarkHuffman · ‎02-16-2012

Hi Mark,

It is not even necessary for you to list all the fields yourself, arcpy.ListFields(table) should do the trick. The purpose of what I had was (started off doing it, then forgot to carry it through) that you could have certain fields that were tested for matching while the other fields were just copied if there was a difference in the test fields, and of course for new entries...

Regards,
Stacy

I thought there had to be a way to access an existing list: the table itself. Thanks!

I really need to get more proficient at this. Do you recommend any books? I have "Learning Python," by Mark Lutz (O'Reilly).

Mark