Select to view content in your preferred language

Turbo Charging Data Manipulation with Python Cursors and Dictionaries

44008
59
11-08-2014 02:16 AM
RichardFairhurst
MVP Honored Contributor
47 59 44K

The Problems with Using the Field Calculator, Join and Summary Statistics tools

 

The field calculator is a great tool and easy to set up, but it has several limitations that can seriously slow down scripted routines when you need to do a lot of data manipulation with large data sets.  The most obvious limitation is that only one field can be calculated at a time.  This often means having to process every record in the feature class multiple times in order to update several fields.  If you need to do data transfers between feature classes using a join it is critical to index the join field in both feature classes to avoid calculations that take forever to process, and even then the speed is significantly slower when a join is involved.  Also, if you actually need to use multiple fields to create a join that correctly matches records between feature classes your only option is to create a new field in both feature classes that concatenates those field values to make a join that works.  Also, often before these joins can be created a summary table has to be created, which takes more time.

 

Fortunately if you have ArcGIS Desktop 10.1 or later there is an alternative that can speed scripts like this up dramatically and make data transfers between feature classes both flexible and incredibly fast.  The solution is to replace the field calculator with one or more data access cursors (da cursors) and joins and creating summary tables with one or more python dictionaries built using a da SearchCursor.

 

Using a da UpdateCursor to Replace a Field Calculation within a Single Feature Class:

 

Below is a simple script that will use a da UpdateCursor on a feature class to fill in a URL field with a hyperlink that is built using a value taken from another field.  The strftime method is used to print out the time when the script starts and the time when it finishes.  To adapt this script to your own data you only need to change the Feature Class path and name in line 7, the field names in the field list in line 9 and the URL string in line 14.  This should perform about 7 times faster than a similar field calculation.

 

from time import strftime

print( "Start script: " + strftime("%Y-%m-%d %H:%M:%S"))

import arcpy

updateFC = r"C:\Path\FeatureClass"

updateFieldsList =  ["URLField", "ValueField"] # Example fields that will build a URL based on another field

with arcpy.da.UpdateCursor(updateFC, updateFieldsList) as updateRows:
    for updateRow in updateRows:
        # updateRow[0] is the URL field and updateRow[1] is the value field

        updateRow[0] = r"
http://pic.asrclkrec.com/view_ParcelMaps.aspx?ParcelNumber=
" + str(updateRow[1])

        updateRows.updateRow(updateRow)

print( "Finished APN_MAP field update: " + strftime("%Y-%m-%d %H:%M:%S"))

‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

 

Using a da UpdateCursor to Replace Several Field Calculations within a Single Feature Class:

 

Now lets say that we actually need to process 2 more field calculations to complete the update of this feature class.  One calculation needs to add two other fields together, and the other needs to convert the values in a field from capitalized values to title case values.  So if there are 100K records, then all of those records have to be calculated 3 times.

 

All of this can be handled in a single pass of one da UpdateCursor through the 100K records.  Because the records only have to changed once and the addition of more fields for the UpdateCursor to read or calculate takes very little additional time to process, the speed gain over 3 field calculations is much faster that the previous script.  As a result, the more field calculations you can replace with a single pass of a da UpdateCursor, the more dramatically the speed of the script will improve.

 

Below is the script that can do all three field updates (assume that there are no Null values in the fields being added together or changing case).  To use this script with your own data you still only need to change lines 7, 9, and 14 (optionally change the calculations in lines 16 and 18).

 

from time import strftime

print( "Start script: " + strftime("%Y-%m-%d %H:%M:%S"))

import arcpy

updateFC = r"C:\Path\FeatureClass"

updateFieldsList =  ["URLField", "ValueField", "SumField", "NumberField1", "NumberField2", "TitleCaseField"] # Example fields that will build a URL based on another field

with arcpy.da.UpdateCursor(updateFC, updateFieldsList) as updateRows:
    for updateRow in updateRows:
        # updateRow[0] is the URL field and updateRow[1] is the value field

        updateRow[0] = r"
http://pic.asrclkrec.com/view_ParcelMaps.aspx?ParcelNumber=
" + str(updateRow[1])


         # updateRow[2] is the Sumfield and updateRow[3] and updateRow[4] are the NumberField1 and NumberField2 fields. 
        updateRow[2] = updateRow[3] + updateRow[4]
         # updateRow[5] is the TitleCaseField that is having its Capitalized values replaced with Title Case values.
        updateRow[
5
] = updateRow[
5
].title()


        updateRows.updateRow(updateRow)


print( "Finished APN_MAP field update: " + strftime("%Y-%m-%d %H:%M:%S"))
‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

 

Using a Python Dictionary Built using a da SearchCursor to Replace a Join Connecting Two Feature Classes:

 

Another common data manipulation involves joining one feature class to another in order to transfer values from the Joined feature class to the Target feature class using the Field Calculator.  However, features loaded into a Python Dictionary using a Search Cursor are much faster at matching join field values in one feature class to another than a Join.  Combined with using a da UpdateCursor to replace the Field Calculator, the speed of these kinds of data manipulations can be even more dramatic than data manipulations on a single feature class.

 

Example 1 - Transfer of a Single Field Value between Feature Classes

 

Below is the code required to transfer the value of one feature class to another based on a common join value in another field.  To adapt this script to your own data change lines 7 and 14 to the path and name of the source (Join) and update (target) Feature Class to match the feature classes you would normally join, change lines 9 and 16 to replace the field list with the name of the Join field and the name of the field being transferred.

 

from time import strftime

print( "Start script: " + strftime("%Y-%m-%d %H:%M:%S"))

import arcpy

sourceFC = r"C:\Path\SourceFeatureClass"

sourceFieldsList = ["JoinField", "ValueField"]

# Use list comprehension to build a dictionary from a da SearchCursor
valueDict = {r[0]:(r[1:]) for r in arcpy.da.SearchCursor(sourceFC, sourceFieldsList)}

updateFC = r"C:\Path\UpdateFeatureClass"

updateFieldsList = ["JoinField", "ValueField"]

with arcpy.da.UpdateCursor(updateFC, updateFieldsList) as updateRows:
    for updateRow in updateRows:
        # store the Join value of the row being updated in a keyValue variable

        keyValue = updateRow[0]

         # verify that the keyValue is in the Dictionary
        if keyValue in valueDict:
             # transfer the value stored under the keyValue from the dictionary to the updated field.
            updateRow[1] = valueDict[keyValue][0]
            updateRows.updateRow(updateRow)

del valueDict

print( "Finished script: " + strftime("%Y-%m-%d %H:%M:%S"))


‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

 

Example 2 - Transfer of Multiple Field Values between Feature Classes where there is a 1:1 Match between Field Sets

 

Just like the Field Calculator the more fields being transferred from one feature class to the other, the more your script will speed up.  If the transfer is a simple field for field transfer then the transfer fields can be handled in a for loop.  Below is an example of a transfer of 5 matching fields between two feature classes.  To adapt this code to your data you should again change the feature classes in lines 7 and 14 and the field lists in lines 9 and 16.

 

from time import strftime

print( "Start script: " + strftime("%Y-%m-%d %H:%M:%S"))

import arcpy

sourceFC = r"C:\Path\SourceFeatureClass"

sourceFieldsList = ["JoinField", "ValueField1", "ValueField2", "ValueField3", "ValueField4", "ValueField5" ]

# Use list comprehension to build a dictionary from a da SearchCursor
valueDict = {r[0]:(r[1:]) for r in arcpy.da.SearchCursor(sourceFC, sourceFieldsList)}

updateFC = r"C:\Path\UpdateFeatureClass"

updateFieldsList = ["JoinField", "ValueField1", "ValueField2", "ValueField3", "ValueField4", "ValueField5"]

with arcpy.da.UpdateCursor(updateFC, updateFieldsList) as updateRows:
    for updateRow in updateRows:
        # store the Join value of the row being updated in a keyValue variable
        keyValue = updateRow[0]
        # verify that the keyValue is in the Dictionary
        if keyValue in valueDict:
            # transfer the values stored under the keyValue from the dictionary to the updated fields.
            for n in range (1,len(sourceFieldsList)):
                updateRow[n] = valueDict[keyValue][n-1]
            updateRows.updateRow(updateRow)

del valueDict

print( "Finished script: " + strftime("%Y-%m-%d %H:%M:%S"))‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

 

Example 3 - Transfer of Multiple Field Values between Feature Classes involving Complex Field Manipulations

 

If you need to do more complex field transfers that do more than a simple transfer of data from one table to another you can do this using code similar to the example field calculations that were done on a single table.  Here are the 3 field updates of my second field calculation example done between two separate tables with common join values instead of all within a single table.

 

from time import strftime

print( "Start script: " + strftime("%Y-%m-%d %H:%M:%S"))

import arcpy

sourceFC = r"C:\Path\SourceFeatureClass"

sourceFieldsList = ["JoinField", "ValueField", "NumberField1", "NumberField2", "CapitalizedField" ]

# Use list comprehension to build a dictionary from a da SearchCursor
valueDict = {r[0]:(r[1:]) for r in arcpy.da.SearchCursor(sourceFC, sourceFieldsList)}

updateFC = r"C:\Path\UpdateFeatureClass"

updateFieldsList = ["JoinField", "UrlField1", "SumField", "TitleCaseField"]

with arcpy.da.UpdateCursor(updateFC, updateFieldsList) as updateRows:
    for updateRow in updateRows:
        keyValue = updateRow[0]
        # verify that the keyValue is in the Dictionary
        if keyValue in valueDict:
             # updateRow[1] is the URL field and valueDict[keyValue][0] is the value field
             updateRow[1] = r"
http://pic.asrclkrec.com/view_ParcelMaps.aspx?ParcelNumber=
" + str(valueDict[keyValue][0])

             # updateRow[2] is the Sumfield and valueDict[keyValue][1] and valueDict[keyValue][2] are the NumberField1 and NumberField2 fields. 
             updateRow[2] = valueDict[keyValue][1] + valueDict[keyValue][2]
             # updateRow[3] is the TitleCaseField and valueDict[keyValue][3] is the CapitalizedField values.
             updateRow[3

] = valueDict[keyValue][3

].title()

             updateRows.updateRow(updateRow)

del valueDict

print( "Finished script: " + strftime("%Y-%m-%d %H:%M:%S"))

‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

 

Creating a Multi-Field Python Dictionary Key to Replace a Concatenated Join Field:

 

Handling Multi-field join values that match records between two feature classes is also possible using dictionaries and cursors without having to do concatenation of separate fields into a new field.  The concatenation is still necessary, but it can be handled on the fly in memory, which is much faster than calculating a field.  Below shows an example of the single field transfer, but this time matching records using 3 separate fields to define the unique values between the feature classes used to match the records between the feature classes.

 

from time import strftime

print( "Start script: " + strftime("%Y-%m-%d %H:%M:%S"))

import arcpy

sourceFC = r"C:\Path\SourceFeatureClass"

sourceFieldsList = ["JoinField1", "JoinField2", "JoinField3", "ValueField"]

# Use list comprehension to build a dictionary from a da SearchCursor where the key values are based on 3 separate feilds
valueDict = {str(r[0]) + "," + str(r[1]) + "," + str(r[2]):(r[3:]) for r in arcpy.da.SearchCursor(sourceFC, sourceFieldsList)}

updateFC = r"C:\Path\UpdateFeatureClass"

updateFieldsList = ["JoinField1", "JoinField2", "JoinField3", "ValueField"]

with arcpy.da.UpdateCursor(updateFC, updateFieldsList) as updateRows:
    for updateRow in updateRows:
        # store the Join value by combining 3 field values of the row being updated in a keyValue variable
        keyValue = updateRow[0]+ "," + str(updateRow[1]) + "," + str(updateRow[2]
        # verify that the keyValue is in the Dictionary
        if keyValue in valueDict:
            # transfer the value stored under the keyValue from the dictionary to the updated field.
            updateRow[3] = valueDict[keyValue][0]
            updateRows.updateRow(updateRow)

del valueDict

print( "Finished script: " + strftime("%Y-%m-%d %H:%M:%S"))



‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

 

Replacing Successive Joins with Simultaneous Processing of Several Python Dictionaries Built using da SearchCursors

 

When several joins have to be created and removed to do data manipulation between tables, the speed gains can be increased even more, since you can create several dictionaries and then process all of the join relationships simultaneously during a single pass of the update cursor on the updated feature class.  All source tables should be read into dictionaries before  processing the update cursor to achieve this benefit.  The more joins that you can replace this way, the more your script speed will improve.

 

Below is an example where two feature classes that would normally be joined to the updated feature class are replaced by dictionaries and the processed simultaneously by a single UpdateCursor operation.

 

from time import strftime

print( "Start script: " + strftime("%Y-%m-%d %H:%M:%S"))

import arcpy

sourceFC = r"C:\Path\SourceFeatureClass1"

sourceFieldsList = ["JoinField1", "ValueField", "NumberField1", "NumberField2" ]

# Use list comprehension to build a dictionary from a da SearchCursor
valueDict1 = {r[0]:(r[1:]) for r in arcpy.da.SearchCursor(sourceFC, sourceFieldsList)}

sourceFC = r"C:\Path\SourceFeatureClass2"

sourceFieldsList = ["JoinField2", "CapitalizedField" ]

# Use list comprehension to build a dictionary from a da SearchCursor
valueDict2 = {r[0]:(r[1:]) for r in arcpy.da.SearchCursor(sourceFC, sourceFieldsList)}

updateFC = r"C:\Path\UpdateFeatureClass"


updateFieldsList = ["JoinField1", "UrlField1", "SumField", "JoinField2", 
"TitleCaseField"]

with arcpy.da.UpdateCursor(updateFC, updateFieldsList) as updateRows:
    for updateRow in updateRows:
        keyValue1 = updateRow[0]
        # verify that the keyValue1 is in the First Dictionary
        if keyValue1 in valueDict1:
             # updateRow[1] is the URL field and valueDict1[keyValue1][0] is the value field
             updateRow[1] = r"
http://pic.asrclkrec.com/view_ParcelMaps.aspx?ParcelNumber=
" + str(valueDict1[keyValue1][0])

             # updateRow[2] is the Sumfield and valueDict1[keyValue1][1] and valueDict1[keyValue1][2] are the NumberField1 and NumberField2 fields. 
             updateRow[2] = valueDict1[keyValue1][1] + valueDict1[keyValue1][2]
        keyValue2 = updateRow[3]
        # verify that the keyValue2 is in the Second Dictionary
        if keyValue2 in valueDict2:

             # updateRow[4] is the TitleCaseField and valueDict2[keyValue2][0] is the CapitalizedField values.

             updateRow[4

] = valueDict2[keyValue2

][0].title()

        updateRows.updateRow(updateRow)

del valueDict1
del valueDict2

print( "Finished script: " + strftime("%Y-%m-%d %H:%M:%S"))

‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

 

Using a Python Dictionary Built using a da SearchCursor to Replace a Summary Statistics Output Table

 

Another common need is to create a summary of a feature class for the unique values of a field or fields that cover multiple records.  The Python Dictionary can be used instead of a Summary Table output to accomplish this.  The benefit is that the output is stored in memory and is much faster to create than outputting an actual summary table.  Below is an example building a summary dictionary and using it to update the source table with sum values matching that field and create a percentage field in the source table for each record that built the summary.

 

from time import strftime

print( "Start script: " + strftime("%Y-%m-%d %H:%M:%S"))

import arcpy

sourceFC = r"C:\Path\UpdateFeatureClass"

sourceFieldsList = ["UniqueValuesField", "NumberField"]

# Build a summary dictionary from a da SearchCursor with unique key values of a field storing a list of the sum of that value and the record count.
valueDict = {}
with arcpy.da.SearchCursor(sourceFC, sourceFieldsList) as searchRows:
    for searchRow in searchRows:
        keyValue = searchRow[0]
        if not keyValue in valueDict:
             # assign a new keyValue entry to the dictionary storing a list of the first NumberField value and 1 for the first record counter value
            valueDict[keyValue] = [searchRow[1], 1]
        # Sum the last summary of NumberField value with the current record and increment the record count when keyvalue is already in the dictionary  
        else:
            valueDict[keyValue][0] += searchRow[1]

            valueDict[keyValue][1] += 1  


updateFC = r"C:\Path\UpdateFeatureClass"

updateFieldsList = ["UniqueValuesField", "NumberField", "PercentField", "NumberSumField", "MeanField"]

with arcpy.da.UpdateCursor(updateFC, updateFieldsList) as updateRows:
    for updateRow in updateRows:
        # store the Join value of the row being updated in a keyValue variable
        keyValue = updateRow[0]
        # verify that the keyValue is in the Dictionary
        if keyValue in valueDict:
            # divide the current record's NumberField value by the sum of the NumberField to get a percentage value

            updateRow[2] = updateRow[1] / valueDict[keyValue][0] 
            # transfer the sum of the NumberField stored under the keyValue from the dictionary.
            updateRow[3] = valueDict[keyValue][0]
            # divide the sum of the NumberField value by the record count of the NumberField to get a mean value

            updateRow[4] = valueDict[keyValue][0] / valueDict[keyValue][1]

            updateRows.updateRow(updateRow)


del valueDict

print( "Finished script: " + strftime("%Y-%m-%d %H:%M:%S"))


‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

 

Conclusion

 

Once the patterns are understood it is possible to do virtually everything that you can do with the Field Calculator, a Join or a Summary table using an update cursor or a python dictionary.   Using these coding techniques I have been able to rewrite several scripts that did bulk Data Manipulations so that rather than taking more than an hour each to process, they now only take 3 to 10 minutes to process.  The more of these operations that you can replace using code like this, the greater the savings in processing time you will achieve.  The performance gains can be huge making it possible to keep data up to date without extensive waits or consuming large portions of your after hours batch processing time.  It also solves the problems that arise when you cannot alter the source data schema to add indexes or field concatenations and making copies of the data is not practical.

 

if you find any errors in the sample code, please let me know and I will post an update to this Blog to point out the correction. Also if you need more comments added to the code to help you understand what is going on, let me know and I will try to comment the code more fully.  These code examples are designed to work on shapefiles, file geodatabase feature classes, and unversioned SDE feature classes.  Additional code modifications are required to make them work with versioned SDE feature classes.

 

I want to express my thanks to Chris Synder who first demonstrated the usefulness of these techniques to me through the Forum and helped me understand how to apply them.

59 Comments
RichardFairhurst
MVP Honored Contributor

@LydiaRill I offer this blog and the transformations it can bring anyone's work flow completely free of charge.  Our time is the most valuable thing we have and I'm just glad you took the time to share with me Your gratitude.  With the time this has Blog has helped you regain, please share it with others that may benefit.

If you ever go to the Esri UC conference (when it's no longer just virtual) it would be great to meet you.  I'd love hear more about what this Blog has helped you to do and perhaps share more of what I've learned from the numerous ways I've used this technique in the years since I wrote this Blog.  The conference is always such a great place for all of us to draw inspiration from others who love GIS.

JoshZeeb
Occasional Contributor

Thank you so much for this resource! It's a huge help.

Though I'm currently dealing with an error when attempting the "Example 2 - Transfer of Multiple Field Values between Feature Classes where there is a 1:1 Match between Field Sets" code while using python 3.6 as my interpreter.

On the code line below I get the following error:

# Use list comprehension to build a dictionary from a da SearchCursor
valueDict = {r[0]:(r[1:]) for r in arcpy.da.SearchCursor(sourceFC, sourceFieldsList)}

TypeError: 'Row' object is not subscriptable

After some research, I've come to the belief that indexing is not supported with the SearchCursor tool when using python 3.6 or ArcGIS Pro.

Is this true?? Are all my hopes and dreams crushed?

Edit: I found out the issue. I was using arcpy.SearchCursor instead of arcpy.da.SearchCursor...It seems to be working well now.

Mark_Hébert
Occasional Contributor

My observation: I plan to use Example #2 to update feature class records where parcel ownership has changed. New data records match field for field with a common, unchanging primary key.

I am surprised that since you published this blog back in 2014, ArcGIS (Pro) has not provide a comparable expedited analysis to update records from a source to a production feature class.

Did I miss the memo, or is your method still the gold standard for this kind of data operation?

I ask before I dive too deep into Example #2.

Thanks for your consideration.

Mark

RichardFairhurst
MVP Honored Contributor

The code works in Python 2.7 32 bit, which ArcMap uses, and in Python 3.X 64 bit which ArcGIS Pro uses.  The ArcGIS Pro Python requires that you maintain an active connection to ArcGIS Online through ArcGIS Pro to work, so whether or not you normally use ArcGIS Pro you have to periodically open it to maintain the AGOL connection for Python.

The Python version you use makes no difference in how you code the dictionary Data Transfer portion of the script.  You do have to enclose in parenthesis the outputs of all the print statements in Python 3.X to avoid an error.  The advantage of using Python 3.X for this code is that it supports access to more than 2 GB of RAM and is much less likely to fail due to an out of memory error when you work with very large datasets.

AnneSanta_Maria1
Deactivated User

@RichardFairhurst I don't know where you have been all my ESRI life but this post is the best thing!

I have a situation where the I want to modify "Example 2 - Transfer of Multiple Field Values between Feature Classes where there is a 1:1 Match between Field Sets", so that the it will run when the fields or columns are not a 1:1 match. Data for updating is coming from an excel table and will be used to update a feature service in Portal. Eventually, I want to create a tool for an end user to put in the excel table, the service to update, and set the join field, and potentially the fields to match for updating, but for now I am creating a notebook for the end user to enter information into. I am stuck on how to use a dictionary to update data in the feature service with nonmatching field names. I think I may be missing a critical step. I also wasn’t sure if I needed to use the field mapping tool from ESRI to match the fields or if the dictionary was sufficient. I may be completely off as well-my python lacks confidence.

import arcpy
import pandas as pd
arcpy.env.workspace = “myworkspace”
feature_class = “myservice”
input_table= “myexceltable”
table = pd.read_excel(input_table)
join_field_excel = "fieldA"
join_field_FC = "fieldA_differentname"
feature_class_fields =arcpy.ListFields(feature_class)
# Set the names of the Excel columns to update from
table_columns = list(table.columns)
#Create a dictionary mapping the Excel columns to the feature fields
column_mapping = {"excelfield1": "matchingservicefield", "excelfield2":"matchingservicefield2"…..}
###field_mappings= arcpy.FieldMappings()
#try:
 #   for excel_column, fc_field in column_mapping.items():
  #     field_map = arcpy.FieldMap()
 #      field_map.addInputField(feature_class, fc_field)
  #      field_map.addInputField(input_table, excel_column)
  #     field_mappings.addFieldMap(field_map)

#except Exception as e:
 #   print(arcpy.GetMessages())

try:
    with arcpy.da.UpdateCursor(feature_class,column_mapping.values()) as cursor:
        for row in cursor:
            id_value = row[join_field_FC]
            if id_value in table[join_field_excel].values:
                for excel_column,fc_field in column_mapping.items():
                    row[fc_field]=table.loc[table[join_field_excel]==id_value,table_column].values[0]
                cursor.updateRow(row)
except Exception as e:
    print(arcpy.GetMessages())

 

BlakeTerhune
MVP Regular Contributor

@AnneSanta_Maria1, you can use Richard's example #2 as-is without a field map. The update portion uses an index of the fields (the order) rather than the actual field names. So update the sourceFieldsList and updateFieldsList with your field names, but make sure they are in the same order. The rest of the code should be the same.

AnneSanta_Maria1
Deactivated User

Thank you! I was over thinking it. List position is all that matters 🙂

Kklitt
by
Occasional Contributor

@RichardFairhurst Wow, this is an incredible resource!

I was able to get my code working using example 2 of 'Using a Python Dictionary Built using a da SearchCursor to Replace a Join Connecting Two Feature Classes' and modify it slightly to iterate through an entire gdb.

Now I'm struggling with how I can get some type of export of keyValues that weren't found in the updateFC. I've tried the code below, and code line 7 gives me an export of keyValues found in the layer. Code lines 12 and 13 only exports the word 'None' instead of the actual keyValue number. Ideally, I'd like to get a message that includes the keyValue(s) not in the updateFC. It'd be a huge bonus if I could get a table of keyValues along with the rest of the attribute information from the fields not included in the updateFC as an export.

 

New code lines include 7, 12, and 13.

 

 

        with arcpy.da.UpdateCursor(updateFC, updateFieldsList) as updateRows:
            for updateRow in updateRows:
                # store the Join value of the row being updated in a keyValue variable
                keyValue = updateRow[0]
                # verify that the keyValue is in the Dictionary
                if keyValue in valueDict:
                    arcpy.AddMessage("{} in {}".format(keyValue, fc))
                    # transfer the values stored under the keyValue from the dictionary to the updated fields.
                    for n in range (1,len(sourceFieldsList)):
                        updateRow[n] = valueDict[keyValue][n-1]
                    updateRows.updateRow(updateRow)
                elif keyValue not in valueDict:
                    arcpy.AddMessage("{} not in {}".format(keyValue, fc))
        del valueDict

 

 

Example output of code line 7:

122565 in Demo\cr_line_demo_dataset
122545 in Demo\cr_line_demo_dataset
122663 in Demo\cr_line_demo_dataset
122793 in Demo\cr_line_demo_dataset
122749 in Demo\cr_line_demo_dataset
122805 in Demo\cr_line_demo_dataset
122565 in cr_line
122545 in cr_line
122559 in cr_line
122535 in cr_line
122537 in cr_line
111602 in cr_line

Example output of code line 12 and 13:

None not in Demo\cr_line_demo_dataset
None not in Demo\cr_line_demo_dataset
None not in cr_line
None not in cr_line
None not in cr_line
None not in cr_line
None not in cr_line
None not in cr_line
None not in cr_line
None not in cr_point
None not in cr_point

Thank you for any insights!

RichardFairhurst
MVP Honored Contributor

Without knowing your data or what field is being referenced in the field list  it is hard to determine the cause of this behavior.  Can you confirm that the field in your field list for the update cursor is never Null?  If any record contains a Null value in the first field in your list, the behavior is what I would expect.  Potentially, you need to add the ObjectID field to your update cursor field list and use a modified index expression for the update loop to avoid updating that field, and then only use that field index in your output for the condition of an unmatched record. 

In any case, without the full field list and a clear understanding of what values are possible in the update target key field you are referencing, I can only guess at what is causing this behavior.  However, there are ways to modify the code to accomplish what you want to do, but they may require a little more sophisticated field index handling in the code than the relatively straight forward examples I provided in my Blog.

I often build a new dictionary in the condition where records are unmatched and iterate through it after the update loop is finished. I use multiple loops frequently to process an update cursor, followed by looping through a dictionary of records for an insert cursor, followed by another loop of another dictionary with an update cursor that deletes records to acheive a complete update, adds and delete synchronization process of a target to match a source.  Each of these loops have to be kept separate to avoid causing locking conflicts.  I also include logic to not touch records that are validated as already synchronized to avoid the performance hit of doing unnecessary edits with my cursor processing, which is slower than reading and comparing a record in memory.

Rich

Labels