Python Cursors and Dictionaries- I am no Turbo Mechanic!

1033
6
04-17-2018 05:00 PM
Highlighted
New Contributor II

using rfairhur24 very helpful blog on "Turbo Charging Data Manipulation with Python Cursors and Dictionaries", I have managed to create two separate libraries to compare to a cursors:

Comparison 1: Returns  records that are in table 'A' but not in Feature Class (FC) 'B' with the output being a list of fields    from table 'A' and a field stating that the record in table 'A" might need to be deleted. results in 4 records.

Comparison 2: To identify changed areas in table 'A' and FC 'B', with the output being key value and acreage from FC    'B'. results in 11 records.

using insertRow and setValue, I can get either of the cursor comparisons to write to Table 'c'. I can't get the script to write the first comparison and then write the second comparison, resulting in a change table with the two types of changes totaling 15 records.

I can hack code pretty good, but I obviously have a logic and reasoning issue when creating new code...

any help is greatly appreciated!!

Tags (2)
Reply
0 Kudos
6 Replies
Highlighted
MVP Honored Contributor

First, since you mention that you are using SetValue method, it sounds like you are using the outdated cursors that predate the da cursors, since SetValue is not necessary with da cursors.  If you are, I highly recommend using only da cursors.

Without seeing the code you have attempted, it is a little difficult to understand the logic you are attempting.  I do know that dealing with InsertCursors can be tricky, since they can cause failures if they are initialized inside of a loop or if the schemas of the two tables you are comparing are different and you did not take care to standardize the field order and number of values expected by the receiving table.

Anyway, I know it is possible to do comparisons of two data sources using the dictionary approach and process attribute or geometry updates to records that match on a key field set, delete records from a child that are no longer in the master and insert records into a child that have been added to the master since the last update.  I have a variety of scripts that do these kinds of synchronization operations in various combinations using different output options.  In my experience, problems with doing synchronizations most often arise when the code gets overly complicated and attempts to blend everything together into a single embedded loop process rather than breaking it down into separate discrete steps.

Highlighted
New Contributor II

Thanks for your reply Richard-

I was posting quickly, hoping it was a very simple answer. the old script did exactly as you say- looping through using cursors. the dictionary approach is much quicker! I just need to figure out how to write the outputs to the same change table. one other question I have: can you compare dictionaries against each other? in my instance below, I am comparing one dictionary to a search cursor result (table A dict to FC B cursor), and then doing it again (FC B dict to Table A Cursor)- I am new to this whole thing and not that clear on the logic I need to make it all work...

here is the code I am working with:

Thanks for your help!

#Variables TESTING-------------------------------------------
Table_A = r'\\StagingTesting.gdb\Table'
FC_B = r'\\StagingTesting.gdb\Table_GIS_Features'
Table_C = r'\\BackupTesting.gdb\changesTable'
Tolerance = 0.05

###Delete change table
arcpy.DeleteRows_management(Table_C)

#create cursors
Table_A_Cursor = arcpy.da.SearchCursor (Table_A,['field A','field B','field C','field D','field E'],)
FC_B_Cursor = arcpy.da.SearchCursor (FC_B,['field A','field B','field C','field D','field_B_Text'],)
rows = arcpy.InsertCursor(Table_C, ['field D', 'field C', 'field B','field A', 'field E', 'field F'])

#create dictionaries for Table A and FC B data 
table_a_data = {r[0]:r[1:] for r in arcpy.da.SearchCursor (Table_A,['field A','field B','field C','field D','field E'],)}
FC_B_data = {r[0]:r[1:] for r in arcpy.da.SearchCursor (FC_B,['field A','field B','field C','field D'],)}

#loop through FC B find acreage updates to Table A
for row in Table_A_Cursor:
    table_a_data[row[0]] = row[1]
   
for row in FC_B_Cursor:
    if abs(row [1]-table_a_data[row[0]])>Tolerance:
        insert = rows.newRow()
        insert.setValue("field D", row[3])
        insert.setValue("field C", row[2])
        insert.setValue("'field B'", row[4])
        #insert.setValue("Fiel A", row[0])
        #insert.setValue("field E", table_a_data[3])
        insert.setValue("field F", 'NEW ACREAGE:Update Acres')
        rows.insertRow(insert)
        del row

# find records in Table A not in FC B
for row2 in FC_B_Cursor:
    FC_B_data[row2[0]] = row2[1]

#loop through table A
for row2 in Table_A_Cursor:
    if not row2[0] in FC_B_data:
        insert = rows2.newRow()
        insert.setValue("field D", row2[3])
        insert.setValue("field C", row2[2])
        insert.setValue("'field B'", row2[1])
        #insert.setValue('field A', row2[0])
        insert.setValue("field E", row2[4])
        insert.setValue("field F", 'NOT IN GIS:DELETE FROM Table_A?')
        rows.insertRow(insert)
        del row2

del rows
Reply
0 Kudos
Highlighted
MVP Honored Contributor

After looking at your code more carefully I withdraw my original comment.

Where are you getting errors with your current code?

Line 19 through 21 and Lines 33 through 35 seem unnecessary, since you have already read all records from those tables into the dictionaries you created on lines 15 and 16.  If you removed those unnecessary lines you could do the comparison on line 23 using this code:

    if abs(row [1]-table_a_data[row[0]][0])>Tolerance:
Reply
0 Kudos
Highlighted
MVP Honored Contributor

I think the real error is occurring because you are trying to create a newRow object using the SearchCursor row2 on line 42, which won't work.  Line 42 needs to use the InsertCursor rows to create a newRow:

        insert = rows.newRow()‍‍ # Line 42‍‍‍‍‍‍
Highlighted
New Contributor II

cleaning up the code per your suggestions worked. there was not an error, it was not combining the two comparison outputs to the change table.

I don't remember where I got the syntax for lines 19-21... could you explain the second half of your line 23 to me? why the second bracketed zero in the '...table_a-data[row [0]][0]...'? the unique ID field I am comparing for both cursor and dictionary is [0] whereas the acreage to compare is for both is [1]. how does 'table_a-data [row[0]][0]' equate to [1]?

so I am using cursors and dictionaries... could I have done this just comparing the dictionaries?

Thank you for your help- I really appreciate your time!

Reply
0 Kudos
Highlighted
MVP Honored Contributor

Glad it is working now and that both tables are now combined in the final output.

The dictionary created in Line 16 was being changed after you ran Lines 19-21.  The keys were the same, but the values associated with the keys are very different. 

In Line 16 the value associated with each dictionary key is a list containing values from Field B, Field C, Field D and Field E.  After running Line 16, if you use the code

table_a_data[row[0]]‍‍‍

the dictionary will return the list of field values, not a single field value.  In order to get the value from just Field B from the list, a second set of brackets needs to be added with the index number of the value for Field B in the list (i.e. [0]).  So, to directly call the value of Field B from the dictionary in one line of code you need to use

table_a_data[row[0]][0]

The list of field values associated with each dictionary key in Line 16 no longer existed after running lines 19-21.  Lines 19-21 was changing the value associated with each dictionary key to being the value of Field B only.  Therefore, after you ran Lines 19-21 you just needed to return the value associated with the dictionary key to get the value of Field B directly.

You could have used two dictionaries to do the comparison in Lines 40-50, but you would need to use the dictionary created in Line 16, not the dictionary as modified by Lines 19-21.  Instead of doing a for loop of the rows of the  Table_A_Cursor, you would do a for loop of the keys of the table_a_data dictionary.  Instead of assigning values to each field from row2 returned by the cursor, you would have assigned values to each field using the appropriate list index on the list returned from the table_a_data dictionary for each unmatched key, i.e.

for key in table_a_dict:
    if not key in FC_B_data:
        insert = rows.newRow()
        insert.setValue("field D", table_a_dict[key][2])
        insert.setValue("field C", table_a_dict[key][1])
        insert.setValue("field B", table_a_dict[key][0])
        insert.setValue("field E", table_a_dict[key][3])
        insert.setValue("field F", 'NOT IN GIS:DELETE FROM Table_A?')
        rows.insertRow(insert)
‍‍‍‍‍‍‍‍‍

 

Reply
0 Kudos