
Turbo Charging Data Manipulation with Python Cursors and Dictionaries

11-08-2014 02:16 AM
RichardFairhurst
MVP Alum

The Problems with Using the Field Calculator, Join and Summary Statistics tools

 

The field calculator is a great tool and easy to set up, but it has several limitations that can seriously slow down scripted routines when you need to do a lot of data manipulation on large data sets.  The most obvious limitation is that only one field can be calculated at a time, which often means processing every record in the feature class multiple times in order to update several fields.  If you need to transfer data between feature classes using a join, it is critical to index the join field in both feature classes to avoid calculations that take forever to process, and even then the speed is significantly slower when a join is involved.  If you actually need multiple fields to create a join that correctly matches records between feature classes, your only option is to create a new field in both feature classes that concatenates those field values into a join that works.  Finally, a summary table often has to be created before these joins can even be built, which takes still more time.

 

Fortunately, if you have ArcGIS Desktop 10.1 or later there is an alternative that can speed scripts like this up dramatically and make data transfers between feature classes both flexible and incredibly fast.  The solution is to replace the field calculator with one or more data access cursors (da cursors), and to replace joins and summary tables with one or more Python dictionaries built using a da SearchCursor.

 

Using a da UpdateCursor to Replace a Field Calculation within a Single Feature Class:

 

Below is a simple script that uses a da UpdateCursor on a feature class to fill in a URL field with a hyperlink built from a value in another field.  The strftime method prints the time when the script starts and the time when it finishes.  To adapt this script to your own data you only need to change the feature class path and name in line 7, the field names in the field list in line 9, and the URL string in line 14.  This should perform about 7 times faster than the equivalent field calculation.

 

from time import strftime

print( "Start script: " + strftime("%Y-%m-%d %H:%M:%S"))

import arcpy

updateFC = r"C:\Path\FeatureClass"

updateFieldsList =  ["URLField", "ValueField"] # Example fields that will build a URL based on another field

with arcpy.da.UpdateCursor(updateFC, updateFieldsList) as updateRows:
    for updateRow in updateRows:
        # updateRow[0] is the URL field and updateRow[1] is the value field

        updateRow[0] = "http://pic.asrclkrec.com/view_ParcelMaps.aspx?ParcelNumber=" + str(updateRow[1])

        updateRows.updateRow(updateRow)

print( "Finished APN_MAP field update: " + strftime("%Y-%m-%d %H:%M:%S"))


 

Using a da UpdateCursor to Replace Several Field Calculations within a Single Feature Class:

 

Now let's say that we actually need to run two more field calculations to finish updating this feature class.  One calculation needs to add two other fields together, and the other needs to convert the values in a field from all capitals to title case.  So if there are 100K records, all of those records have to be processed 3 times.

 

All of this can be handled in a single pass of one da UpdateCursor through the 100K records.  Because the records only have to be changed once, and adding more fields for the UpdateCursor to read or calculate takes very little additional time to process, the speed gain over 3 separate field calculations is even greater than in the previous script.  As a result, the more field calculations you can replace with a single pass of a da UpdateCursor, the more dramatically the speed of the script will improve.

 

Below is the script that does all three field updates (it assumes there are no Null values in the fields being added together or having their case changed).  To use this script with your own data you still only need to change lines 7, 9, and 14 (and optionally the calculations in lines 16 and 18).

 

from time import strftime

print( "Start script: " + strftime("%Y-%m-%d %H:%M:%S"))

import arcpy

updateFC = r"C:\Path\FeatureClass"

updateFieldsList =  ["URLField", "ValueField", "SumField", "NumberField1", "NumberField2", "TitleCaseField"] # Example fields for building a URL, summing two fields, and converting text to title case

with arcpy.da.UpdateCursor(updateFC, updateFieldsList) as updateRows:
    for updateRow in updateRows:
        # updateRow[0] is the URL field and updateRow[1] is the value field

        updateRow[0] = "http://pic.asrclkrec.com/view_ParcelMaps.aspx?ParcelNumber=" + str(updateRow[1])


        # updateRow[2] is the SumField; updateRow[3] and updateRow[4] are the NumberField1 and NumberField2 fields.
        updateRow[2] = updateRow[3] + updateRow[4]
        # updateRow[5] is the TitleCaseField, whose capitalized values are replaced with title case values.
        updateRow[5] = updateRow[5].title()


        updateRows.updateRow(updateRow)


print( "Finished APN_MAP field update: " + strftime("%Y-%m-%d %H:%M:%S"))
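The script above assumes the fields never contain Nulls.  When they might, the row logic can guard against None before the arithmetic and the .title() call.  The sketch below is my illustration, not part of the original script: a hypothetical calc_row helper and plain tuples stand in for cursor rows so the guards can be checked outside ArcGIS; inside the real with-block the same checks would wrap the updateRow assignments before updateRows.updateRow(updateRow).

```python
def calc_row(row):
    # row order mirrors the field list above:
    # URLField, ValueField, SumField, NumberField1, NumberField2, TitleCaseField
    url, value, total, n1, n2, title = row
    url = "http://pic.asrclkrec.com/view_ParcelMaps.aspx?ParcelNumber=" + str(value)
    # treat Null number fields as 0 so the addition never raises a TypeError
    total = (n1 or 0) + (n2 or 0)
    # leave a Null text field alone instead of calling .title() on None
    if title is not None:
        title = title.title()
    return [url, value, total, n1, n2, title]

# sample "rows" with Nulls in various fields
rows = [
    (None, "123-456", None, 5, None, "MAIN STREET"),
    (None, "789-012", None, None, None, None),
]
for row in rows:
    print(calc_row(row))
```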

 

Using a Python Dictionary Built using a da SearchCursor to Replace a Join Connecting Two Feature Classes:

 

Another common data manipulation involves joining one feature class to another in order to transfer values from the joined feature class to the target feature class with the Field Calculator.  However, matching join field values between feature classes through a Python dictionary built with a SearchCursor is much faster than a Join.  Combined with a da UpdateCursor in place of the Field Calculator, the speed-up for these kinds of data manipulations can be even more dramatic than for manipulations within a single feature class.

 

Example 1 - Transfer of a Single Field Value between Feature Classes

 

Below is the code required to transfer a value from one feature class to another based on a common join value in another field.  To adapt this script to your own data, change lines 7 and 14 to the paths and names of the source (join) and update (target) feature classes you would normally join, and change the field lists in lines 9 and 16 to the name of the join field and the name of the field being transferred.

 

from time import strftime

print( "Start script: " + strftime("%Y-%m-%d %H:%M:%S"))

import arcpy

sourceFC = r"C:\Path\SourceFeatureClass"

sourceFieldsList = ["JoinField", "ValueField"]

# Use a dictionary comprehension to build a dictionary from a da SearchCursor
valueDict = {r[0]:(r[1:]) for r in arcpy.da.SearchCursor(sourceFC, sourceFieldsList)}

updateFC = r"C:\Path\UpdateFeatureClass"

updateFieldsList = ["JoinField", "ValueField"]

with arcpy.da.UpdateCursor(updateFC, updateFieldsList) as updateRows:
    for updateRow in updateRows:
        # store the Join value of the row being updated in a keyValue variable

        keyValue = updateRow[0]

         # verify that the keyValue is in the Dictionary
        if keyValue in valueDict:
             # transfer the value stored under the keyValue from the dictionary to the updated field.
            updateRow[1] = valueDict[keyValue][0]
            updateRows.updateRow(updateRow)

del valueDict

print( "Finished script: " + strftime("%Y-%m-%d %H:%M:%S"))



 

Example 2 - Transfer of Multiple Field Values between Feature Classes where there is a 1:1 Match between Field Sets

 

Just as with the Field Calculator, the more fields being transferred from one feature class to the other, the more your script will speed up.  If the transfer is a simple field-for-field copy, the transfer fields can be handled in a for loop.  Below is an example transferring 5 matching fields between two feature classes.  To adapt this code to your data, again change the feature classes in lines 7 and 14 and the field lists in lines 9 and 16.

 

from time import strftime

print( "Start script: " + strftime("%Y-%m-%d %H:%M:%S"))

import arcpy

sourceFC = r"C:\Path\SourceFeatureClass"

sourceFieldsList = ["JoinField", "ValueField1", "ValueField2", "ValueField3", "ValueField4", "ValueField5" ]

# Use a dictionary comprehension to build a dictionary from a da SearchCursor
valueDict = {r[0]:(r[1:]) for r in arcpy.da.SearchCursor(sourceFC, sourceFieldsList)}

updateFC = r"C:\Path\UpdateFeatureClass"

updateFieldsList = ["JoinField", "ValueField1", "ValueField2", "ValueField3", "ValueField4", "ValueField5"]

with arcpy.da.UpdateCursor(updateFC, updateFieldsList) as updateRows:
    for updateRow in updateRows:
        # store the Join value of the row being updated in a keyValue variable
        keyValue = updateRow[0]
        # verify that the keyValue is in the Dictionary
        if keyValue in valueDict:
            # transfer the values stored under the keyValue from the dictionary to the updated fields.
            for n in range (1,len(sourceFieldsList)):
                updateRow[n] = valueDict[keyValue][n-1]
            updateRows.updateRow(updateRow)

del valueDict

print( "Finished script: " + strftime("%Y-%m-%d %H:%M:%S"))

 

Example 3 - Transfer of Multiple Field Values between Feature Classes involving Complex Field Manipulations

 

If you need to do more complex field transfers that go beyond a simple copy of data from one table to another, you can use code similar to the field calculation examples that were done on a single table.  Here are the 3 field updates from my second field calculation example, done between two separate tables with common join values instead of within a single table.

 

from time import strftime

print( "Start script: " + strftime("%Y-%m-%d %H:%M:%S"))

import arcpy

sourceFC = r"C:\Path\SourceFeatureClass"

sourceFieldsList = ["JoinField", "ValueField", "NumberField1", "NumberField2", "CapitalizedField" ]

# Use a dictionary comprehension to build a dictionary from a da SearchCursor
valueDict = {r[0]:(r[1:]) for r in arcpy.da.SearchCursor(sourceFC, sourceFieldsList)}

updateFC = r"C:\Path\UpdateFeatureClass"

updateFieldsList = ["JoinField", "UrlField1", "SumField", "TitleCaseField"]

with arcpy.da.UpdateCursor(updateFC, updateFieldsList) as updateRows:
    for updateRow in updateRows:
        keyValue = updateRow[0]
        # verify that the keyValue is in the Dictionary
        if keyValue in valueDict:
            # updateRow[1] is the URL field and valueDict[keyValue][0] is the value field
            updateRow[1] = "http://pic.asrclkrec.com/view_ParcelMaps.aspx?ParcelNumber=" + str(valueDict[keyValue][0])
            # updateRow[2] is the SumField; valueDict[keyValue][1] and valueDict[keyValue][2] are the NumberField1 and NumberField2 fields.
            updateRow[2] = valueDict[keyValue][1] + valueDict[keyValue][2]
            # updateRow[3] is the TitleCaseField and valueDict[keyValue][3] is the CapitalizedField value.
            updateRow[3] = valueDict[keyValue][3].title()
            updateRows.updateRow(updateRow)

del valueDict

print( "Finished script: " + strftime("%Y-%m-%d %H:%M:%S"))


 

Creating a Multi-Field Python Dictionary Key to Replace a Concatenated Join Field:

 

Handling multi-field join values that match records between two feature classes is also possible with dictionaries and cursors, without concatenating the separate fields into a new field.  The concatenation is still necessary, but it can be done on the fly in memory, which is much faster than calculating a field.  Below is an example of the single-field transfer, this time using 3 separate fields to define the unique values that match records between the feature classes.

 

from time import strftime

print( "Start script: " + strftime("%Y-%m-%d %H:%M:%S"))

import arcpy

sourceFC = r"C:\Path\SourceFeatureClass"

sourceFieldsList = ["JoinField1", "JoinField2", "JoinField3", "ValueField"]

# Use a dictionary comprehension to build a dictionary from a da SearchCursor where the key values are based on 3 separate fields
valueDict = {str(r[0]) + "," + str(r[1]) + "," + str(r[2]):(r[3:]) for r in arcpy.da.SearchCursor(sourceFC, sourceFieldsList)}

updateFC = r"C:\Path\UpdateFeatureClass"

updateFieldsList = ["JoinField1", "JoinField2", "JoinField3", "ValueField"]

with arcpy.da.UpdateCursor(updateFC, updateFieldsList) as updateRows:
    for updateRow in updateRows:
        # store the Join value by combining 3 field values of the row being updated in a keyValue variable
        keyValue = str(updateRow[0]) + "," + str(updateRow[1]) + "," + str(updateRow[2])
        # verify that the keyValue is in the Dictionary
        if keyValue in valueDict:
            # transfer the value stored under the keyValue from the dictionary to the updated field.
            updateRow[3] = valueDict[keyValue][0]
            updateRows.updateRow(updateRow)

del valueDict

print( "Finished script: " + strftime("%Y-%m-%d %H:%M:%S"))




 

Replacing Successive Joins with Simultaneous Processing of Several Python Dictionaries Built using da SearchCursors

 

When several joins would have to be created and removed to do data manipulation between tables, the speed gains increase even more, since you can create several dictionaries and then process all of the join relationships simultaneously during a single pass of the update cursor over the updated feature class.  All source tables should be read into dictionaries before processing the update cursor to achieve this benefit.  The more joins you can replace this way, the more your script speed will improve.

 

Below is an example where two feature classes that would normally be joined to the updated feature class are instead replaced by dictionaries and processed simultaneously in a single UpdateCursor operation.

 

from time import strftime

print( "Start script: " + strftime("%Y-%m-%d %H:%M:%S"))

import arcpy

sourceFC = r"C:\Path\SourceFeatureClass1"

sourceFieldsList = ["JoinField1", "ValueField", "NumberField1", "NumberField2" ]

# Use a dictionary comprehension to build a dictionary from a da SearchCursor
valueDict1 = {r[0]:(r[1:]) for r in arcpy.da.SearchCursor(sourceFC, sourceFieldsList)}

sourceFC = r"C:\Path\SourceFeatureClass2"

sourceFieldsList = ["JoinField2", "CapitalizedField" ]

# Use a dictionary comprehension to build a dictionary from a da SearchCursor
valueDict2 = {r[0]:(r[1:]) for r in arcpy.da.SearchCursor(sourceFC, sourceFieldsList)}

updateFC = r"C:\Path\UpdateFeatureClass"


updateFieldsList = ["JoinField1", "UrlField1", "SumField", "JoinField2", "TitleCaseField"]

with arcpy.da.UpdateCursor(updateFC, updateFieldsList) as updateRows:
    for updateRow in updateRows:
        keyValue1 = updateRow[0]
        # verify that the keyValue1 is in the First Dictionary
        if keyValue1 in valueDict1:
            # updateRow[1] is the URL field and valueDict1[keyValue1][0] is the value field
            updateRow[1] = "http://pic.asrclkrec.com/view_ParcelMaps.aspx?ParcelNumber=" + str(valueDict1[keyValue1][0])
            # updateRow[2] is the SumField; valueDict1[keyValue1][1] and valueDict1[keyValue1][2] are the NumberField1 and NumberField2 fields.
            updateRow[2] = valueDict1[keyValue1][1] + valueDict1[keyValue1][2]
        keyValue2 = updateRow[3]
        # verify that the keyValue2 is in the second dictionary
        if keyValue2 in valueDict2:
            # updateRow[4] is the TitleCaseField and valueDict2[keyValue2][0] is the CapitalizedField value.
            updateRow[4] = valueDict2[keyValue2][0].title()

        updateRows.updateRow(updateRow)

del valueDict1
del valueDict2

print( "Finished script: " + strftime("%Y-%m-%d %H:%M:%S"))


 

Using a Python Dictionary Built using a da SearchCursor to Replace a Summary Statistics Output Table

 

Another common need is to summarize a feature class on the unique values of a field or fields that span multiple records.  A Python dictionary can be used in place of a Summary Statistics output table to accomplish this.  The benefit is that the summary is stored in memory and is much faster to create than an actual summary table.  Below is an example that builds a summary dictionary and then uses it to update the source table with the sum values matching each key, a percentage field, and a mean field for each record that contributed to the summary.

 

from time import strftime

print( "Start script: " + strftime("%Y-%m-%d %H:%M:%S"))

import arcpy

sourceFC = r"C:\Path\UpdateFeatureClass"

sourceFieldsList = ["UniqueValuesField", "NumberField"]

# Build a summary dictionary from a da SearchCursor with unique key values of a field storing a list of the sum of that value and the record count.
valueDict = {}
with arcpy.da.SearchCursor(sourceFC, sourceFieldsList) as searchRows:
    for searchRow in searchRows:
        keyValue = searchRow[0]
        if keyValue not in valueDict:
             # assign a new keyValue entry to the dictionary storing a list of the first NumberField value and 1 for the first record counter value
            valueDict[keyValue] = [searchRow[1], 1]
        # Sum the last summary of NumberField value with the current record and increment the record count when keyvalue is already in the dictionary  
        else:
            valueDict[keyValue][0] += searchRow[1]

            valueDict[keyValue][1] += 1  


updateFC = r"C:\Path\UpdateFeatureClass"

updateFieldsList = ["UniqueValuesField", "NumberField", "PercentField", "NumberSumField", "MeanField"]

with arcpy.da.UpdateCursor(updateFC, updateFieldsList) as updateRows:
    for updateRow in updateRows:
        # store the Join value of the row being updated in a keyValue variable
        keyValue = updateRow[0]
        # verify that the keyValue is in the Dictionary
        if keyValue in valueDict:
            # divide the current record's NumberField value by the sum of the NumberField to get a percentage value

            updateRow[2] = updateRow[1] / valueDict[keyValue][0] 
            # transfer the sum of the NumberField stored under the keyValue from the dictionary.
            updateRow[3] = valueDict[keyValue][0]
            # divide the sum of the NumberField value by the record count of the NumberField to get a mean value

            updateRow[4] = valueDict[keyValue][0] / valueDict[keyValue][1]

            updateRows.updateRow(updateRow)


del valueDict

print( "Finished script: " + strftime("%Y-%m-%d %H:%M:%S"))



 

Conclusion

 

Once the patterns are understood, it is possible to do virtually everything that you can do with the Field Calculator, a Join, or a Summary table using an update cursor or a Python dictionary.  Using these coding techniques I have been able to rewrite several scripts that did bulk data manipulations so that, rather than taking more than an hour each, they now take only 3 to 10 minutes to process.  The more of these operations you can replace with code like this, the greater the savings in processing time you will achieve.  The performance gains can be huge, making it possible to keep data up to date without extensive waits or consuming large portions of your after-hours batch processing time.  It also solves the problems that arise when you cannot alter the source data schema to add indexes or concatenated fields and making copies of the data is not practical.

 

If you find any errors in the sample code, please let me know and I will post an update to this blog to point out the correction.  Also, if you need more comments added to the code to help you understand what is going on, let me know and I will try to comment the code more fully.  These code examples are designed to work on shapefiles, file geodatabase feature classes, and unversioned SDE feature classes.  Additional code modifications are required to make them work with versioned SDE feature classes.

 

I want to express my thanks to Chris Synder who first demonstrated the usefulness of these techniques to me through the Forum and helped me understand how to apply them.

91 Comments
RichardFairhurst
MVP Alum

@LydiaRill I offer this blog and the transformations it can bring to anyone's workflow completely free of charge.  Our time is the most valuable thing we have, and I'm just glad you took the time to share your gratitude with me.  With the time this blog has helped you regain, please share it with others who may benefit.

If you ever go to the Esri UC conference (when it's no longer just virtual) it would be great to meet you.  I'd love to hear more about what this blog has helped you do, and perhaps share more of what I've learned from the numerous ways I've used this technique in the years since I wrote it.  The conference is always such a great place for all of us to draw inspiration from others who love GIS.

JoshZeeb
Occasional Contributor

Thank you so much for this resource! It's a huge help.

Though I'm currently dealing with an error when attempting the "Example 2 - Transfer of Multiple Field Values between Feature Classes where there is a 1:1 Match between Field Sets" code while using python 3.6 as my interpreter.

On the code line below I get the following error:

# Use list comprehension to build a dictionary from a da SearchCursor
valueDict = {r[0]:(r[1:]) for r in arcpy.da.SearchCursor(sourceFC, sourceFieldsList)}

TypeError: 'Row' object is not subscriptable

After some research, I've come to the belief that indexing is not supported with the SearchCursor tool when using python 3.6 or ArcGIS Pro.

Is this true?? Are all my hopes and dreams crushed?

Edit: I found out the issue. I was using arcpy.SearchCursor instead of arcpy.da.SearchCursor...It seems to be working well now.

Mark_Hébert
Occasional Contributor

My observation: I plan to use Example #2 to update feature class records where parcel ownership has changed. New data records match field for field with a common, unchanging primary key.

I am surprised that since you published this blog back in 2014, ArcGIS (Pro) has not provide a comparable expedited analysis to update records from a source to a production feature class.

Did I miss the memo, or is your method still the gold standard for this kind of data operation?

I ask before I dive too deep into Example #2.

Thanks for your consideration.

Mark

RichardFairhurst
MVP Alum

The code works in Python 2.7 32 bit, which ArcMap uses, and in Python 3.X 64 bit which ArcGIS Pro uses.  The ArcGIS Pro Python requires that you maintain an active connection to ArcGIS Online through ArcGIS Pro to work, so whether or not you normally use ArcGIS Pro you have to periodically open it to maintain the AGOL connection for Python.

The Python version you use makes no difference in how you code the dictionary Data Transfer portion of the script.  You do have to enclose in parenthesis the outputs of all the print statements in Python 3.X to avoid an error.  The advantage of using Python 3.X for this code is that it supports access to more than 2 GB of RAM and is much less likely to fail due to an out of memory error when you work with very large datasets.

AnneSanta_Maria1
Deactivated User

@RichardFairhurst I don't know where you have been all my ESRI life but this post is the best thing!

I have a situation where the I want to modify "Example 2 - Transfer of Multiple Field Values between Feature Classes where there is a 1:1 Match between Field Sets", so that the it will run when the fields or columns are not a 1:1 match. Data for updating is coming from an excel table and will be used to update a feature service in Portal. Eventually, I want to create a tool for an end user to put in the excel table, the service to update, and set the join field, and potentially the fields to match for updating, but for now I am creating a notebook for the end user to enter information into. I am stuck on how to use a dictionary to update data in the feature service with nonmatching field names. I think I may be missing a critical step. I also wasn’t sure if I needed to use the field mapping tool from ESRI to match the fields or if the dictionary was sufficient. I may be completely off as well-my python lacks confidence.

import arcpy
import pandas as pd
arcpy.env.workspace = "myworkspace"
feature_class = "myservice"
input_table = "myexceltable"
table = pd.read_excel(input_table)
join_field_excel = "fieldA"
join_field_FC = "fieldA_differentname"
feature_class_fields =arcpy.ListFields(feature_class)
# Set the names of the Excel columns to update from
table_columns = list(table.columns)
#Create a dictionary mapping the Excel columns to the feature fields
column_mapping = {"excelfield1": "matchingservicefield", "excelfield2":"matchingservicefield2"…..}
###field_mappings= arcpy.FieldMappings()
#try:
 #   for excel_column, fc_field in column_mapping.items():
  #     field_map = arcpy.FieldMap()
 #      field_map.addInputField(feature_class, fc_field)
  #      field_map.addInputField(input_table, excel_column)
  #     field_mappings.addFieldMap(field_map)

#except Exception as e:
 #   print(arcpy.GetMessages())

try:
    with arcpy.da.UpdateCursor(feature_class,column_mapping.values()) as cursor:
        for row in cursor:
            id_value = row[join_field_FC]
            if id_value in table[join_field_excel].values:
                for excel_column,fc_field in column_mapping.items():
                    row[fc_field]=table.loc[table[join_field_excel]==id_value,table_column].values[0]
                cursor.updateRow(row)
except Exception as e:
    print(arcpy.GetMessages())

 

BlakeTerhune
MVP Regular Contributor

@AnneSanta_Maria1, you can use Richard's example #2 as-is without a field map. The update portion uses an index of the fields (the order) rather than the actual field names. So update the sourceFieldsList and updateFieldsList with your field names, but make sure they are in the same order. The rest of the code should be the same.
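The ordering point can be sketched with plain Python standing in for the cursor (the field names here are hypothetical, borrowed from Anne's mapping): the dictionary stores values by position, so as long as the two lists line up index for index, the names themselves never matter.

```python
# source and update field lists with different names but matching order;
# position 0 is the join field in both lists
sourceFieldsList = ["JoinField", "excelfield1", "excelfield2"]
updateFieldsList = ["JoinField", "matchingservicefield", "matchingservicefield2"]

# rows as a SearchCursor would yield them, in sourceFieldsList order
rows = [("K1", 10, 20)]
valueDict = {r[0]: r[1:] for r in rows}

# a row as an UpdateCursor would yield it, in updateFieldsList order
updateRow = ["K1", None, None]
for n in range(1, len(sourceFieldsList)):
    # position n of the source list feeds position n of the update list
    updateRow[n] = valueDict[updateRow[0]][n - 1]
print(updateRow)
```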

AnneSanta_Maria1
Deactivated User

Thank you! I was over thinking it. List position is all that matters 🙂

Kklitt
by
Occasional Contributor

@RichardFairhurst Wow, this is an incredible resource!

I was able to get my code working using example 2 of 'Using a Python Dictionary Built using a da SearchCursor to Replace a Join Connecting Two Feature Classes' and modify it slightly to iterate through an entire gdb.

Now I'm struggling with how to get some type of export of the keyValues that weren't found in the updateFC.  I've tried the code below: code line 7 gives me an export of keyValues found in the layer, but code lines 12 and 13 only export the word 'None' instead of the actual keyValue number.  Ideally, I'd like a message that includes the keyValue(s) not in the updateFC.  It would be a huge bonus if I could also get a table of those keyValues along with the rest of the attribute information from the fields not included in the updateFC.

 

New code lines include 7, 12, and 13.

 

 

        with arcpy.da.UpdateCursor(updateFC, updateFieldsList) as updateRows:
            for updateRow in updateRows:
                # store the Join value of the row being updated in a keyValue variable
                keyValue = updateRow[0]
                # verify that the keyValue is in the Dictionary
                if keyValue in valueDict:
                    arcpy.AddMessage("{} in {}".format(keyValue, fc))
                    # transfer the values stored under the keyValue from the dictionary to the updated fields.
                    for n in range (1,len(sourceFieldsList)):
                        updateRow[n] = valueDict[keyValue][n-1]
                    updateRows.updateRow(updateRow)
                elif keyValue not in valueDict:
                    arcpy.AddMessage("{} not in {}".format(keyValue, fc))
        del valueDict

 

 

Example output of code line 7:

122565 in Demo\cr_line_demo_dataset
122545 in Demo\cr_line_demo_dataset
122663 in Demo\cr_line_demo_dataset
122793 in Demo\cr_line_demo_dataset
122749 in Demo\cr_line_demo_dataset
122805 in Demo\cr_line_demo_dataset
122565 in cr_line
122545 in cr_line
122559 in cr_line
122535 in cr_line
122537 in cr_line
111602 in cr_line

Example output of code line 12 and 13:

None not in Demo\cr_line_demo_dataset
None not in Demo\cr_line_demo_dataset
None not in cr_line
None not in cr_line
None not in cr_line
None not in cr_line
None not in cr_line
None not in cr_line
None not in cr_line
None not in cr_point
None not in cr_point

Thank you for any insights!

RichardFairhurst
MVP Alum

Without knowing your data or which field is referenced in the field list, it is hard to determine the cause of this behavior.  Can you confirm that the first field in your update cursor's field list is never Null?  If any record contains a Null value in that field, this is the behavior I would expect.  Potentially, you need to add the ObjectID field to your update cursor field list, use a modified index expression in the update loop so that field is never written, and then report only that field index in your output for unmatched records.

In any case, without the full field list and a clear understanding of what values are possible in the update target's key field, I can only guess at what is causing this behavior.  There are ways to modify the code to accomplish what you want, but they may require somewhat more sophisticated field index handling than the relatively straightforward examples I provided in my blog.

I often build a new dictionary in the unmatched-record condition and iterate through it after the update loop is finished.  I frequently use multiple loops: an update cursor, followed by a loop through a dictionary of records for an insert cursor, followed by another loop over another dictionary with an update cursor that deletes records, to achieve a complete update, add, and delete synchronization of a target to match a source.  Each of these loops has to be kept separate to avoid locking conflicts.  I also include logic to skip records that are already synchronized, to avoid the performance hit of doing unnecessary edits with cursor processing, which is slower than reading and comparing a record in memory.
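The unmatched-dictionary pattern can be sketched like this (my illustration with made-up keys; plain lists stand in for the da cursor rows): rows whose key is not in valueDict are collected and reported only after the update loop is done.

```python
# source values keyed by join value, as the blog's valueDict would hold them
valueDict = {122565: ("A",), 122545: ("B",)}
# rows as an UpdateCursor would yield them: [JoinField, ValueField]
update_rows = [[122565, None], [999999, None], [None, None]]

unmatched = {}
for row in update_rows:
    keyValue = row[0]
    if keyValue in valueDict:
        # normal transfer, as in the blog examples
        row[1] = valueDict[keyValue][0]
    else:
        # remember the unmatched key (including Null keys) and the row's
        # remaining values so they can be exported or reported later
        unmatched[keyValue] = list(row[1:])

# iterate the unmatched dictionary only after the update loop is finished
for keyValue, values in unmatched.items():
    print("unmatched key: {0} values: {1}".format(keyValue, values))
```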

Rich

LindsayRaabe_FPCWA
MVP Regular Contributor

Hi @RichardFairhurst . I've been trying to implement some of your coding as it looks like just the ticket! I've hit a snag though and have no idea why. 

Below is my version of your coding that uses multiple join fields to update multiple target fields.  The problem is, it only seems to be updating the first field (insp_2_weeks) and not any of the subsequent fields.  Any idea why?  The only real changes I made were decreasing from 3 join fields to 2, updating the field names, and changing the input tables.

 

# Update feature service

sourceFieldsList = ['Plantation', 'PlantingYear', 'insp_2_weeks', 'insp_4_weeks', 'insp_6_weeks', 'insp_8_weeks', 'insp_10_weeks', 'insp_12_weeks', 'insp_4_months', 'insp_5_months', 'insp_6_months', 'insp_9_months', 'insp_12_months', 'insp_15_months', 'insp_18_months', 'insp_21_months', 'insp_24_months']

# Use list comprehension to build a dictionary from a da SearchCursor where the key values are based on 2 separate fields
valueDict = {str(r[0]) + "," + str(r[1]):(r[2:]) for r in arcpy.da.SearchCursor(Table2, sourceFieldsList)}

updateFieldsList = ['Plantation', 'PlantingYear', 'insp_2_weeks', 'insp_4_weeks', 'insp_6_weeks', 'insp_8_weeks', 'insp_10_weeks', 'insp_12_weeks', 'insp_4_months', 'insp_5_months', 'insp_6_months', 'insp_9_months', 'insp_12_months', 'insp_15_months', 'insp_18_months', 'insp_21_months', 'insp_24_months']

with arcpy.da.UpdateCursor(fs2, updateFieldsList) as updateRows:
    for updateRow in updateRows:
        # store the Join value by combining 2 field values of the row being updated in a keyValue variable
        keyValue = updateRow[0]+ "," + str(updateRow[1])
        # verify that the keyValue is in the Dictionary
        if keyValue in valueDict:
            # transfer the value stored under the keyValue from the dictionary to the updated field.
            updateRow[2] = valueDict[keyValue][0]
            print(updateRow[2])
            updateRows.updateRow(updateRow)

del valueDict

Additionally, is it possible to make it only update fields where the target field is null instead of just a mismatch in values? I don't mind if the target field has a different value from the source (as it may have been done by the field worker manually), but if it's null, then I want a default value entered to show an inspection occurred. 

 

RichardFairhurst
MVP Alum

You should have based your code on the section entitled:

Example 2 - Transfer of Multiple Field Values between Feature Classes where there is a 1:1 Match between Field Sets

The section for updating fields should look like the code below to update all matched fields in the records where the value is Null.  However, you should test this on a copy of your data before applying it to any production data, to verify that I aligned the field updates correctly with your other code modifications:

        # verify that the keyValue is in the Dictionary
        if keyValue in valueDict:
            # transfer the values stored under the keyValue from the dictionary to the updated fields.
            for n in range(2, len(sourceFieldsList)):
                if valueDict[keyValue][n-2] == None:
                    updateRow[n] = 0
                else:
                    updateRow[n] = valueDict[keyValue][n-2]
            updateRows.updateRow(updateRow)

This code assumes you want to assign a default value of 0 if the field is Null and otherwise transfer the values of the source to the target field.  I have made several revisions to the code since my initial response after considering your requirements more carefully.  Hopefully, I have understood them correctly.

LindsayRaabe_FPCWA
MVP Regular Contributor

@RichardFairhurst Thank you - will test it out now. Definitely using a test dataset!

RichardFairhurst
MVP Alum

I am editing using my phone, so I just had to correct some code errors that occurred due to the difficulty of navigating in that editing environment.  Please recheck my post's code.

LindsayRaabe_FPCWA
MVP Regular Contributor

@RichardFairhurst syntax error in the updated code. 

LindsayRaabe_FPCWA_0-1740718814806.png

I was using the multi field join because I discovered that using the single field join was changing the GlobalID values for some reason. 

RichardFairhurst
MVP Alum

There was another error in the code caused by editing on my phone, which I corrected on the last line; it should have been two separate lines of code, as now shown.  Also, be prepared to adjust and correct any indentation errors on your own in your IDE, since they can sometimes be difficult to catch in the community's code window.  The IDE was highlighting the character where the line-return and indentation error occurred in your screenshot.

LindsayRaabe_FPCWA
MVP Regular Contributor

Thank you for the update. I was looking at it trying to figure out the missing bit - didn't figure it was meant to be on the new line. That seems to have worked though! All features have been updated with either 0 or the value from the update table. 

One step closer. Now to figure out how to ignore the nulls altogether (leave them null) and only update values where new doesn't exist in old or doesn't match. 

RichardFairhurst
MVP Alum

The process you are describing is not something I cover in this blog.  It requires an InsertCursor for new records and the creation of multiple dictionaries built in multiple passes to process each cursor correctly (including a deleted-record dictionary if you want a true match).  The updateRows.updateRow(updateRow) line also has to occur only at the indentation level reached when the if condition detecting changes is met, not at a level that executes for every row.  I will see if I ever posted anything elsewhere that comes closer to that approach.

If you post the code you have working I could copy it and modify it to get closer to what you are describing.  It is too difficult to create a post with sample code on my phone without that as a starting point.

For the update part this should work to ignore nulls and only update records that do not match.

        # verify that the keyValue is in the Dictionary
        if keyValue in valueDict:
            if list(updateRow[2:]) != list(valueDict[keyValue]):
                # transfer the values stored under the keyValue from the dictionary to the updated fields.
                for n in range(2, len(sourceFieldsList)):
                    if valueDict[keyValue][n-2] != None:
                        updateRow[n] = valueDict[keyValue][n-2]
                updateRows.updateRow(updateRow)

I am not sure the comparison line handles the tuple-to-list conversion correctly, so let me know if it triggers an error and some print statements can be added to diagnose the comparison. 
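As a side note, the list() conversion on both sides matters: a da cursor row slice may be a list while the dictionary stores a tuple, and in Python a list never compares equal to a tuple even when the contents are identical. A quick pure-Python check (no arcpy needed; the sample values are made up):

```python
# A list and a tuple with the same contents are never equal in Python,
# so both sides of the comparison are normalized with list().
row_slice = ['x', None, 'y']        # e.g. updateRow[2:] from an UpdateCursor
dict_values = ('x', None, 'y')      # e.g. valueDict[keyValue]

print(row_slice == dict_values)               # False: different types
print(list(row_slice) == list(dict_values))   # True: contents match
```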

LindsayRaabe_FPCWA
MVP Regular Contributor

Actually, I made a small change (added the "pass" instead of converting nulls to 0) and it works perfectly, or perfectly enough; I'm not overly fussed that editor tracking on this feature service is overwritten for every feature even when no changes happened for some of them. 

            # transfer the values stored under the keyValue from the dictionary to the updated fields.
            for n in range (2,len(sourceFieldsList)):
                if valueDict[keyValue][n-2] == None:
                    pass
                else: 
                    updateRow[n] = valueDict[keyValue][n-2]
            updateRows.updateRow(updateRow)

 Where a value exists and doesn't match, or the source field is null but a join value has been calculated, it gets updated.

If a value already exists and the join table is null, it is skipped. If both tables contain nulls, it is also skipped. (these are the ones where it would be nice for editor tracking to be left alone, but alas - I'm just happy it works!)

RichardFairhurst
MVP Alum

Try indenting all of the code you just posted under this if clause so that only rows that actually differ are updated. 

if list(updateRow[2:]) != list(valueDict[keyValue]):

If it triggers errors, post them. 

LindsayRaabe_FPCWA
MVP Regular Contributor

Halfway there! It left rows alone that were not present in the update table at all, but rows which were present but had no updates still had the edit date updated. 

RichardFairhurst
MVP Alum

Print both halves of the if condition, on separate lines, for a record that was updated that you believe should not have been updated.  The difference could be that the lists being compared are misaligned, which wouldn't necessarily cause a runtime error, but would be a coding logic error.

Also post your full code so far as a base for me to edit.  I cannot piece together the code on my phone without retyping everything otherwise.  I need the full code to add the code sections and logic for processing insert and delete cursors.  In any case, I am done for the night.

LindsayRaabe_FPCWA
MVP Regular Contributor

I inserted the print statements at Lines 7 & 8 below and caught the data for a record that should NOT be updated. Row A is the target table, row B is the source table.

If Row B contains values where Row A is None or a differing value, then it would update Row A.

In this case, Row B contains None values in some fields (those should be ignored outright) and the values it does contain already match those in Row A (so can also be ignored). 

with arcpy.da.UpdateCursor(fs2, updateFieldsList) as updateRows:
    for updateRow in updateRows:
        # store the Join value by combining 2 field values of the row being updated in a keyValue variable
        keyValue = updateRow[0]+ "," + str(updateRow[1])
        # verify that the keyValue is in the Dictionary
        if keyValue in valueDict:
            print(f"A: {list(updateRow[2:])}")
            print(f"B: {list(valueDict[keyValue])}")
            if list(updateRow[2:]) != list(valueDict[keyValue]):
A: ['Done previously on iForms', 'Done previously on iForms', 'Done previously on iForms', 'Done previously on iForms', 'Done previously on iForms', 'Done previously on iForms', 'john.doe on 2024-12-11', 'john.doe on 2025-01-24', None, None, None, None, None, None, None]
B: [None, None, None, None, None, None, 'john.doe on 2024-12-11', 'john.doe on 2025-01-24', None, None, None, None, None, None, None]

Here's my code in full - a lot of it is data prep with the stuff we've been discussing at the end (from Line 197). 

import arcpy
from datetime import datetime
from arcgis.gis import GIS
import certifi
import urllib3
import ssl
import warnings
from urllib3.exceptions import InsecureRequestWarning

# Create a default SSL context with certificate verification
ssl_context = ssl.create_default_context(cafile=certifi.where())
http = urllib3.PoolManager(ssl_context=ssl_context)

# Make a request to verify the setup
response = http.request('GET', 'https://maps.arcgis.com')
print("http response: " + str(response.status))

# Suppress only the single InsecureRequestWarning from urllib3 if necessary
warnings.simplefilter('ignore', InsecureRequestWarning)

# Create GIS object
print("Connecting to AGOL")
client_id = '##########'
client_secret = '##################'

gis = GIS("https://maps.arcgis.com", client_id=client_id, client_secret=client_secret)
print("Logged in as: " + gis.properties.user.username)

arcpy.env.overwriteOutput = True
arcpy.env.preserveGlobalIds=True

fs1 = "https://services-ap1.arcgis.com/###########/arcgis/rest/services/service_df503b7953e7471a975eb99994681d72/FeatureServer/0"
fs2 = "https://services-ap1.arcgis.com/###############/arcgis/rest/services/Post_Planting_Inspection_Tracking___TEST/FeatureServer/0"

# Update the "username" field with the "Creator" field value if "username" is NULL
with arcpy.da.UpdateCursor(fs1, ["username", "Creator"]) as cursor:
    for row in cursor:
        if row[0] is None or row[0] == "":
            row[0] = row[1]
            cursor.updateRow(row)

print("Username field updated successfully.")

with arcpy.EnvManager(preserveGlobalIds=True):
    export_table = arcpy.conversion.ExportTable(
        in_table=fs1,
        out_table=r"C:\Temp\scratch.gdb\FPC416_survey_ExportTable",
        where_clause="inspection_number <> 'Ad-hoc Inspection'",
        use_field_alias_as_name="NOT_USE_ALIAS",
        field_mapping=r'globalid "GlobalID" false false true 38 GlobalID 0 0,First,#,FPC416 - Plantation & Property Inspection Form\FPC416_survey,globalid,-1,-1;username "USERNAME" true true false 255 Text 0 0,First,#,FPC416 - Plantation & Property Inspection Form\FPC416_survey,username,0,254;date_of_inspection "DATE OF INSPECTION" true true false 255 Date 0 1,First,#,FPC416 - Plantation & Property Inspection Form\FPC416_survey,date_of_inspection,-1,-1;plantation "PLANTATION" true true false 255 Text 0 0,First,#,FPC416 - Plantation & Property Inspection Form\FPC416_survey,plantation,0,254;tree_species "TREE SPECIES" true true false 255 Text 0 0,First,#,FPC416 - Plantation & Property Inspection Form\FPC416_survey,tree_species,0,254;tenure "TENURE" true true false 255 Text 0 0,First,#,FPC416 - Plantation & Property Inspection Form\FPC416_survey,tenure,0,254;years_planted "YEAR/S PLANTED" true true false 255 Text 0 0,First,#,FPC416 - Plantation & Property Inspection Form\FPC416_survey,years_planted,0,254;inspection_number "INSPECTION NUMBER" true true false 255 Text 0 0,First,#,FPC416 - Plantation & Property Inspection Form\FPC416_survey,inspection_number,0,254',
        sort_field=None
    )

# Delete fields
arcpy.management.DeleteField(
    in_table=export_table,
    drop_field="tree_species;tenure;inspection_type;fbs_type;fbs_respons;fbs_respons_other;fbs_adeq;fbs_trafcbl;fbs_issues;fbs_issues_other;fbs_action_tkn;fbs_action_tkn_other;fbs_fup_reqd;wtrpts_accessbl;wtrpts_sign_adeq;wtrpts_dry_summer;wtrpts_issues;wtrpts_issues_other;wtrpts_action_tkn;wtrpts_action_tkn_other;wtrpts_fup_reqd;pwrlne_clr_accept;pwrlne_clr_respons;pwrlne_clr_respons_other;pwrlne_clr_issues;pwrlne_clr_issues_other;pwrlne_clr_action_tkn;pwrlne_clr_action_tkn_other;pwrlne_clr_fup_reqd;nut_def_evident;nut_def_impact;nut_def_issues;nut_def_issues_other;nut_def_percent_aff;nut_def_action_tkn;nut_def_action_tkn_other;nut_def_fup_reqd;drought_evident;drought_impact;drought_issues;drought_issues_other;drought_percent_aff;drought_action_tkn;drought_action_tkn_other;drought_fup_reqd;frost_evident;frost_impact;frost_issues;frost_issues_other;frost_percent_aff;frost_action_tkn;frost_action_tkn_other;frost_fup_reqd;insects_evident;insects_type;insects_type_other;insects_plant_part;insects_plant_part_other;insects_impact;insects_percent_aff;insects_action_tkn;insects_action_tkn_other;insects_fup_reqd;vert_pests_evident;vert_pests_type;vert_pests_type_other;vert_pests_plant_part;vert_pests_plant_part_other;vert_pests_impact;vert_pests_percent_aff;vert_pests_action_tkn;vert_pests_action_tkn_other;vert_pests_fup_reqd;diseases_evident;diseases_type;diseases_type_other;diseases_plant_part;diseases_plant_part_other;diseases_impact;diseases_percent_aff;diseases_myrtle_rust_new_event;diseases_action_tkn;diseases_action_tkn_other;diseases_fup_reqd;weeds_impct_gwth;dec_weeds_prsnt;weeds_type;weeds_type_other;weeds_impact;weeds_issues;weeds_issues_other;weeds_percent_aff;weeds_action_tkn;weeds_action_tkn_other;weeds_fup_reqd;remnt_veg_evident;remnt_veg_type;remnt_veg_type_other;wind_dmg_evident;wind_dmg_impact;wind_dmg_issues;wind_dmg_issues_other;wind_dmg_percent_aff;wind_dmg_action_tkn;wind_dmg_action_tkn_other;wind_dmg_fup_reqd;sun_scorch_evident;sun_scorch_impact;sun_scorch_issues;sun_scorch_issues_other;sun_scorch_percent_aff;sun_scorch_action_tkn;sun_scorch_action_tkn_other;sun_scorch_fup_reqd;inundtn_evident;inundtn_impact;inundtn_issues;inundtn_issues_other;inundtn_percent_aff;inundtn_action_tkn;inundtn_action_tkn_other;inundtn_fup_reqd;stem_dfrms_evident;stem_dfrms_impact;stem_dfrms_issues;stem_dfrms_issues_other;stem_dfrms_percent_aff;stem_dfrms_action_tkn;stem_dfrms_action_tkn_other;stem_dfrms_fup_reqd;erosion_evident;erosion_impact;erosion_issues;erosion_issues_other;erosion_percent_aff;erosion_action_tkn;erosion_action_tkn_other;erosion_fup_reqd;stkhldr_obs;detail_stkhldr_obs;tree_types;tree_types_other;height_m;height;basal_area_m2;basal_area;stems_ha;stems;cone_dvpt;host_gwth_assess;sndlwd_stkng;sndlwd_stkng_other;sndlwd_gwth_assess;flwrng_seed_assess;flwrng_seed_assess_other;sndlwd_actions_tkn;sndlwd_actions_tkn_other;sndlwd_fup_action_reqd;rdsigns_vsble;rdsigns_issues;rdsigns_issues_other;rdsigns_actions_tkn;rdsigns_actions_tkn_other;rdsigns_fup_action_reqd;prop_signs_vsbl;prop_signs_issues;prop_signs_issues_other;propsigns_actions_tkn;propsigns_actions_tkn_other;propsigns_fup_action_reqd;fences_adeq;fences_issues;fences_issues_other;fences_actions_tkn;fences_actions_tkn_other;fences_fup_action_reqd;padlocks_reqd;padlock_issues;padlocks_actions_tkn;padlocks_actions_tkn_other;padlocks_fup_action_reqd;fire_canisters_intact;fire_canister_maps_unspoiled_current;fire_canisters_fup_action_reqd;bldgs_adeq;bldgs_fireprf;bldgs_issues;bldgs_issues_other;bldgs_actions_tkn;bldgs_actions_tkn_other;bldgs_fup_action_reqd;utils_accesbl;utils_fireprf;utils_issues;utils_issues_other;utils_actions_tkn;utils_actions_tkn_other;utils_fup_action_reqd;why_inspctn_missed;why_inspctn_missed_other;compltd_survey_date;CreationDate;Creator;EditDate;Editor;comments",
    method="DELETE_FIELDS"
)

# Define the Inspection list
lookup_list = ['2 weeks', '4 weeks', '6 weeks', '8 weeks', '10 weeks', '12 weeks', '4 months', '5 months', '6 months', '9 months', '12 months', '15 months', '18 months', '21 months', '24 months']

# Define the lookup dictionary
lookup_dict = {
    '2 weeks': 'insp_2_weeks',
    '4 weeks': 'insp_4_weeks',
    '6 weeks': 'insp_6_weeks',
    '8 weeks': 'insp_8_weeks',
    '10 weeks': 'insp_10_weeks',
    '12 weeks': 'insp_12_weeks',
    '4 months': 'insp_4_months',
    '5 months': 'insp_5_months',
    '6 months': 'insp_6_months',
    '9 months': 'insp_9_months',
    '12 months': 'insp_12_months',
    '15 months': 'insp_15_months',
    '18 months': 'insp_18_months',
    '21 months': 'insp_21_months',
    '24 months': 'insp_24_months'
}
print("Lookup table created")

# Add text fields to the table for each value in the lookup dictionary
for field_name in lookup_dict.values():
    print(f"Creating {field_name} field")
    arcpy.AddField_management(export_table, field_name, "TEXT")

# Function to parse date with different formats
def parse_date(date_str):
    for fmt in ("%d/%m/%Y %I:%M:%S %p", "%d/%m/%Y", "%d/%m/%Y %I:%M:%S.%f %p", "%d/%m/%Y %I:%M:%S %p"):
        try:
            return datetime.strptime(date_str, fmt).strftime("%Y-%m-%d")
        except ValueError:
            pass
    raise ValueError(f"Date format for {date_str} is not recognized")

# Update the fields based on the inspection_number using CalculateField

for v in lookup_list:
    update_field = lookup_dict[v]
    update_view = arcpy.management.MakeTableView(
        in_table=export_table,
        out_view="TableView",
        where_clause=f"inspection_number = '{v}'",
        workspace=None,
        field_info=f"globalid globalid VISIBLE NONE;username username VISIBLE NONE;date_of_inspection date_of_inspection VISIBLE NONE;plantation plantation VISIBLE NONE;years_planted years_planted VISIBLE NONE;inspection_number inspection_number VISIBLE NONE;{update_field} {update_field} VISIBLE NONE"
    )
    expression = f"!username! + ' on ' + (datetime.strptime(!date_of_inspection!, '%d/%m/%Y %I:%M:%S %p') if '/' in !date_of_inspection! else datetime.strptime(!date_of_inspection!, '%d/%m/%Y')).strftime('%Y-%m-%d')"
    arcpy.management.CalculateField(update_view, update_field, expression, "PYTHON3")

print("Fields added and updated successfully.")

flat_table = arcpy.analysis.Statistics(
    in_table=export_table,
    out_table=r"C:\Temp\scratch.gdb\FlattenedInspectionTable",
    statistics_fields="insp_2_weeks FIRST;insp_4_weeks FIRST;insp_6_weeks FIRST;insp_8_weeks FIRST;insp_10_weeks FIRST;insp_12_weeks FIRST;insp_4_months FIRST;insp_5_months FIRST;insp_6_months FIRST;insp_9_months FIRST;insp_12_months FIRST;insp_15_months FIRST;insp_18_months FIRST;insp_21_months FIRST;insp_24_months FIRST",
    case_field="plantation;years_planted",
    concatenation_separator=""
)

print("Table flattened")

# List all fields in the table
flat_fields = arcpy.ListFields(flat_table)

# Iterate through the fields and rename if they start with "FIRST_"
for field in flat_fields:
    if field.name.startswith("FIRST_"):
        new_field_name = field.name.replace("FIRST_", "", 1)
        arcpy.AlterField_management(flat_table, field.name, new_field_name, new_field_name)

print("Fields renamed successfully in flattened table.")

arcpy.management.CalculateField(
    in_table=flat_table,
    field="join_field",
    expression='!plantation!+"_"+str(!years_planted!)',
    expression_type="PYTHON3",
    code_block="",
    field_type="TEXT",
    enforce_domains="NO_ENFORCE_DOMAINS"
)

print("Join field created in flattened table")

with arcpy.EnvManager(preserveGlobalIds=True):
    Table2 = arcpy.conversion.ExportTable(
        in_table=fs2,
        out_table=r"C:\temp\scratch.gdb\PostPlantingInspectionTracking_TargetTable",
        where_clause="",
        use_field_alias_as_name="NOT_USE_ALIAS",
        field_mapping='Plantation "Plantation" true true false 50 Text 0 0,First,#,Post Planting Inspection Tracking - TEST,Plantation,0,49;PlantingYear "Planting Year" true true false 0 Long 0 0,First,#,Post Planting Inspection Tracking - TEST,PlantingYear,-1,-1;insp_2_weeks "2 week inspection" true true false 50 Text 0 0,First,#,Post Planting Inspection Tracking - TEST,insp_2_weeks,0,49;insp_4_weeks "4 week inspection" true true false 50 Text 0 0,First,#,Post Planting Inspection Tracking - TEST,insp_4_weeks,0,49;insp_6_weeks "6 week inspection" true true false 50 Text 0 0,First,#,Post Planting Inspection Tracking - TEST,insp_6_weeks,0,49;insp_8_weeks "8 week inspection" true true false 50 Text 0 0,First,#,Post Planting Inspection Tracking - TEST,insp_8_weeks,0,49;insp_10_weeks "10 week inspection" true true false 50 Text 0 0,First,#,Post Planting Inspection Tracking - TEST,insp_10_weeks,0,49;insp_12_weeks "12 week inspection" true true false 50 Text 0 0,First,#,Post Planting Inspection Tracking - TEST,insp_12_weeks,0,49;insp_4_months "4 month inspection" true true false 50 Text 0 0,First,#,Post Planting Inspection Tracking - TEST,insp_4_months,0,49;insp_5_months "5 month inspection" true true false 50 Text 0 0,First,#,Post Planting Inspection Tracking - TEST,insp_5_months,0,49;insp_6_months "6 month inspection" true true false 50 Text 0 0,First,#,Post Planting Inspection Tracking - TEST,insp_6_months,0,49;insp_9_months "9 month inspection" true true false 50 Text 0 0,First,#,Post Planting Inspection Tracking - TEST,insp_9_months,0,49;insp_12_months "12 month inspection" true true false 50 Text 0 0,First,#,Post Planting Inspection Tracking - TEST,insp_12_months,0,49;insp_15_months "15 month inspection" true true false 50 Text 0 0,First,#,Post Planting Inspection Tracking - TEST,insp_15_months,0,49;insp_18_months "18 month inspection" true true false 50 Text 0 0,First,#,Post Planting Inspection Tracking - TEST,insp_18_months,0,49;insp_21_months "21 month inspection" true true false 50 Text 0 0,First,#,Post Planting Inspection Tracking - TEST,insp_21_months,0,49;insp_24_months "24 month inspection" true true false 50 Text 0 0,First,#,Post Planting Inspection Tracking - TEST,insp_24_months,0,49',
        sort_field=None
    )

print("Inspection tracking table exported")

arcpy.management.CalculateField(
    in_table=Table2,
    field="join_field",
    expression='!Plantation!+"_"+str(!PlantingYear!)',
    expression_type="PYTHON3",
    code_block="",
    field_type="TEXT",
    enforce_domains="NO_ENFORCE_DOMAINS"
)

print("Join field created in tracking table")

sourceFieldsList = ['join_field', 'insp_2_weeks', 'insp_4_weeks', 'insp_6_weeks', 'insp_8_weeks', 'insp_10_weeks', 'insp_12_weeks', 'insp_4_months', 'insp_5_months', 'insp_6_months', 'insp_9_months', 'insp_12_months', 'insp_15_months', 'insp_18_months', 'insp_21_months', 'insp_24_months']

# Use list comprehension to build a dictionary from a da SearchCursor
valueDict = {r[0]:(r[1:]) for r in arcpy.da.SearchCursor(flat_table, sourceFieldsList)}

updateFieldsList = ['join_field', 'insp_2_weeks', 'insp_4_weeks', 'insp_6_weeks', 'insp_8_weeks', 'insp_10_weeks', 'insp_12_weeks', 'insp_4_months', 'insp_5_months', 'insp_6_months', 'insp_9_months', 'insp_12_months', 'insp_15_months', 'insp_18_months', 'insp_21_months', 'insp_24_months']

changeCnt = 0

with arcpy.da.UpdateCursor(Table2, updateFieldsList) as updateRows:
    for updateRow in updateRows:
        # store the Join value of the row being updated in a keyValue variable
        keyValue = updateRow[0]
        # verify that the keyValue is in the Dictionary
        if keyValue in valueDict:
            if list(valueDict[keyValue]) and updateRow[1:15]:
                # A newer value exists
                changeCnt += 1
                for n in range (1,len(sourceFieldsList)):
                    updateRow[n] = valueDict[keyValue][n-1]
                updateRows.updateRow(updateRow)
            # transfer the values stored under the keyValue from the dictionary to the updated fields.
            
del valueDict
print("Temp data prepped")

# Update feature service

sourceFieldsList = ['Plantation', 'PlantingYear', 'insp_2_weeks', 'insp_4_weeks', 'insp_6_weeks', 'insp_8_weeks', 'insp_10_weeks', 'insp_12_weeks', 'insp_4_months', 'insp_5_months', 'insp_6_months', 'insp_9_months', 'insp_12_months', 'insp_15_months', 'insp_18_months', 'insp_21_months', 'insp_24_months']

# Use list comprehension to build a dictionary from a da SearchCursor where the key values are based on 2 separate fields
valueDict = {str(r[0]) + "," + str(r[1]):(r[2:]) for r in arcpy.da.SearchCursor(Table2, sourceFieldsList)}

updateFieldsList = ['Plantation', 'PlantingYear', 'insp_2_weeks', 'insp_4_weeks', 'insp_6_weeks', 'insp_8_weeks', 'insp_10_weeks', 'insp_12_weeks', 'insp_4_months', 'insp_5_months', 'insp_6_months', 'insp_9_months', 'insp_12_months', 'insp_15_months', 'insp_18_months', 'insp_21_months', 'insp_24_months']

with arcpy.da.UpdateCursor(fs2, updateFieldsList) as updateRows:
    for updateRow in updateRows:
        # store the Join value by combining 2 field values of the row being updated in a keyValue variable
        keyValue = updateRow[0]+ "," + str(updateRow[1])
        # verify that the keyValue is in the Dictionary
        if keyValue in valueDict:
            if list(updateRow[2:]) != list(valueDict[keyValue]):
                # transfer the values stored under the keyValue from the dictionary to the updated fields.
                for n in range (2,len(sourceFieldsList)):
                    if valueDict[keyValue][n-2] == None:
                        pass
                    else: 
                        updateRow[n] = valueDict[keyValue][n-2]
                updateRows.updateRow(updateRow)

del valueDict

print("Fields updated successfully.")

Interestingly, when I tried to run this from our server, it failed to run Line 35 because Pro wasn't signed in (doesn't want to use the login process at Line 26) but alas, that's an issue for a different post!

RichardFairhurst
MVP Alum
tempDict = {}
delDict = {}
with arcpy.da.UpdateCursor(fs2, updateFieldsList) as updateRows:
    for updateRow in updateRows:
        # store the Join value by combining 2 field values of the row being updated in a keyValue variable
        keyValue = updateRow[0] + "," + str(updateRow[1])
        tempDict[keyValue] = keyValue.split(",", 1)
        # verify that the keyValue is in the Dictionary
        if keyValue in valueDict:
            update = False
            for n in range(2, len(sourceFieldsList)):
                if valueDict[keyValue][n-2] != None:
                    if updateRow[n] == None:
                        updateRow[n] = valueDict[keyValue][n-2]
                        update = True
                    elif updateRow[n] != valueDict[keyValue][n-2]:
                        updateRow[n] = valueDict[keyValue][n-2]
                        update = True
            if update == True:
                updateRows.updateRow(updateRow)
        else:
            delDict[keyValue] = keyValue

with arcpy.da.InsertCursor(fs2, updateFieldsList) as cursor:
    for key in valueDict.keys():
        if not key in tempDict:
            insRow = key.split(",", 1) + list(valueDict[key])
            print(insRow)
            cursor.insertRow(insRow)

with arcpy.da.UpdateCursor(fs2, updateFieldsList) as updateRows:
    for updateRow in updateRows:
        keyValue = updateRow[0] + "," + str(updateRow[1])
        if keyValue in delDict:
            updateRows.deleteRow()

del delDict
del tempDict
del valueDict

The above code should only update row A if the row B field is not Null and one or more field in row A is Null or has a different value from the row B field.  It should not do an update if for all fields the row B field is Null, both the row A field and row B field are Null, or the row A field and row B field contain the same value.

This code also inserts rows from the source that are missing in the update target and deletes rows from the update target that are not in the source.  The insert and delete operations are done in separate for loops from the initial UpdateCursor to prevent locks and to avoid having the UpdateCursor get confused by rows being deleted.

It is important to note that with an InsertCursor the values passed to insertRow must be arranged in exactly the same order as the cursor's field list, so updateFieldsList and the dictionary values have to line up field for field, otherwise the insert will fail or put values in the wrong fields.  SearchCursors and UpdateCursors index rows by the same field list, so the same alignment applies when reading and updating.
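To illustrate the alignment (pure Python, no arcpy; the sample values are made up): the composite key is split back into its two leading field values and the stored tuple supplies the rest, in the same order as updateFieldsList. Note that split() returns strings, so a numeric field such as PlantingYear may need an int() conversion before insertRow.

```python
updateFieldsList = ['Plantation', 'PlantingYear', 'insp_2_weeks', 'insp_4_weeks']
# Hypothetical dictionary entry keyed on "Plantation,PlantingYear"
valueDict = {'Pine Hill,2023': ('done', None)}

for key, values in valueDict.items():
    # split(",", 1) restores the two key fields; the stored values follow in order
    insRow = key.split(",", 1) + list(values)
    print(insRow)   # ['Pine Hill', '2023', 'done', None]
```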

LindsayRaabe_FPCWA
MVP Regular Contributor

Thanks for trying to figure this out for me @RichardFairhurst. I tested your new code and it seemed to update all found rows with the same value. I did remove the Insert and Delete sections as that wasn't required. 

I just experimented with the code I provided above that was working and included a bit extra to disable and re-enable editor tracking on the feature service around the update section. This worked well and is sufficient for my needs. 

Thanks again for your efforts (especially on your phone!). 

Updated code below showing editor tracking disable/enable. 

del valueDict
print("Temp data prepped")

# Update feature service

def remove_last_part(url):
    # Split the URL by '/' and join all parts except the last one
    return '/'.join(url.split('/')[:-1])

# Remove the last part of the URL
modified_url = remove_last_part(fs2)

# Access the feature layer collection
flc = FeatureLayerCollection(modified_url, gis)

# Disable editor tracking
disable_params = {
    "editorTrackingInfo": {
        "enableEditorTracking": False
    }
}
flc.manager.update_definition(disable_params)

sourceFieldsList = ['Plantation', 'PlantingYear', 'insp_2_weeks', 'insp_4_weeks', 'insp_6_weeks', 'insp_8_weeks', 'insp_10_weeks', 'insp_12_weeks', 'insp_4_months', 'insp_5_months', 'insp_6_months', 'insp_9_months', 'insp_12_months', 'insp_15_months', 'insp_18_months', 'insp_21_months', 'insp_24_months']

# Use list comprehension to build a dictionary from a da SearchCursor where the key values are based on 2 separate fields
valueDict = {str(r[0]) + "," + str(r[1]):(r[2:]) for r in arcpy.da.SearchCursor(Table2, sourceFieldsList)}

updateFieldsList = ['Plantation', 'PlantingYear', 'insp_2_weeks', 'insp_4_weeks', 'insp_6_weeks', 'insp_8_weeks', 'insp_10_weeks', 'insp_12_weeks', 'insp_4_months', 'insp_5_months', 'insp_6_months', 'insp_9_months', 'insp_12_months', 'insp_15_months', 'insp_18_months', 'insp_21_months', 'insp_24_months']

with arcpy.da.UpdateCursor(fs2, updateFieldsList) as updateRows:
    for updateRow in updateRows:
        # store the Join value by combining 2 field values of the row being updated in a keyValue variable
        keyValue = updateRow[0]+ "," + str(updateRow[1])
        # verify that the keyValue is in the Dictionary
        if keyValue in valueDict:
            if list(updateRow[2:]) != list(valueDict[keyValue]):
                # transfer the values stored under the keyValue from the dictionary to the updated fields.
                for n in range (2,len(sourceFieldsList)):
                    if valueDict[keyValue][n-2] == None:
                        pass
                    else: 
                        updateRow[n] = valueDict[keyValue][n-2]
                updateRows.updateRow(updateRow)

del valueDict

# Enable editor tracking
enable_params = {
    "editorTrackingInfo": {
        "enableEditorTracking": True
    }
}
flc.manager.update_definition(enable_params)

print("Fields updated successfully.")
RichardFairhurst
MVP Alum

I found the error in my previous code.  I had not noticed that I had pasted the for loop twice when it should only have been pasted once:

  for updateRow in updateRows:

I have removed the second for loop and it will now work according to your needs.  This revised code is many times more efficient than your current code, especially if only a few records actually need to be updated.  The larger your record set becomes, the larger the benefit, since updating a row is the most costly part of the loop, and skipping the update when nothing has changed is the most significant optimization you can make.  You also won't have to disable editor tracking with the updated code, since records with no change will not be updated and will not trigger the modified date.

I left the insert and delete code for the benefit of others.  Fully synchronizing a target with a source to produce a complete match efficiently requires all three steps.
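For readers adapting this, the three synchronization steps can be sketched with plain Python dictionaries (no arcpy here; the keys and values are hypothetical stand-ins for cursor rows):

```python
# Plain-Python sketch of update/insert/delete synchronization.
# Keys identify matching rows; values are the transferable attributes.
source = {"A,2020": (5, 7), "B,2021": (3, 4), "C,2022": (9, 9)}
target = {"A,2020": (5, 7), "B,2021": (0, 0), "D,2019": (1, 1)}

# Update: keys present in both, but with differing values.
updates = {k: source[k] for k in source.keys() & target.keys()
           if source[k] != target[k]}
# Insert: keys only in the source.  Delete: keys only in the target.
inserts = {k: source[k] for k in source.keys() - target.keys()}
deletes = target.keys() - source.keys()

print(updates)          # {'B,2021': (3, 4)}
print(inserts)          # {'C,2022': (9, 9)}
print(sorted(deletes))  # ['D,2019']
```

In the arcpy versions, `updates` drives an UpdateCursor, `inserts` an InsertCursor, and `deletes` a delete inside an UpdateCursor loop.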

LindsayRaabe_FPCWA
MVP Regular Contributor

Nice work. I've implemented and tested the revised code and can confirm that it only updates fields that have changed and editor tracking info is only updated for those specific features. 

It's great having these full code snippets to work with too. Completely understand and appreciate having the extra functionality in the Insert/Delete available. I'm sure I will need it myself at some point!

LindsayRaabe_FPCWA
MVP Regular Contributor

Hi @RichardFairhurst. I'm having a go at implementing this in another application. This time, seemingly a lot simpler, but I must still be missing something. I'm only trying to check and update an area field (AreaHa) and want to use the GlobalID as the join field. Below is the code I'm currently working with and the print statements. I'm guessing it isn't happy because the GlobalID's don't match case, but I can't seem to force this to be consistent. Also tried to store the original ObjectID's in a temp field to use and couldn't get that to work either (couldn't get the field to export). 

    # Create a temporary directory
    temp_dir = tempfile.mkdtemp()
    # Define the path for the temporary File Geodatabase
    temp_gdb = os.path.join(temp_dir, "tempAreaHa.gdb")
    # Create the File Geodatabase
    arcpy.CreateFileGDB_management(temp_dir, "tempAreaHa.gdb")
    # Set the workspace to the temporary File Geodatabase
    arcpy.env.workspace = temp_gdb

    def remove_last_part(url):
        # Split the URL by '/' and join all parts except the last one
        return '/'.join(url.split('/')[:-1])

    def extract_fc_name(url):
        # Extract the feature class name from the URL and clean it
        path_parts = url.split('/')
        fc_name = path_parts[-3]
        return fc_name

    # Read the CSV file containing feature service URLs
    # Example service url value in csv: https://services-ap1.arcgis.com/#############/arcgis/rest/services/Plantation_Concept_Areas_Test/FeatureServer/0
    with open('FeatureServiceAreaUpdates.csv', mode='r') as file:
        csv_reader = csv.reader(file)
        next(csv_reader)  # Skip header row if present

        for row in csv_reader:
            featureservice = row[0]
            fc_name = extract_fc_name(featureservice)

            with arcpy.EnvManager(preserveGlobalIds=True):
                temp_Areas = arcpy.conversion.ExportFeatures(
                    in_features=featureservice,
                    out_features=arcpy.env.workspace + "\\" + fc_name,
                    where_clause="",
                    use_field_alias_as_name="NOT_USE_ALIAS",
                    field_mapping=f'AreaHa "Area (ha)" true true false 0 Double 0 0,First,#,{fc_name},AreaHa,-1,-1;GlobalID "GlobalID" false false true 38 GlobalID 0 0,First,#,{fc_name},GlobalID,-1,-1',
                    sort_field=None
                )
            print(f"Features exported for {fc_name}")

            arcpy.management.CalculateGeometryAttributes(
                in_features=temp_Areas,
                geometry_property="AreaHa AREA",
                length_unit="",
                area_unit="HECTARES",
                coordinate_system='PROJCS["GDA_1994_MGA_Zone_50",GEOGCS["GCS_GDA_1994",DATUM["D_GDA_1994",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["False_Easting",500000.0],PARAMETER["False_Northing",10000000.0],PARAMETER["Central_Meridian",117.0],PARAMETER["Scale_Factor",0.9996],PARAMETER["Latitude_Of_Origin",0.0],UNIT["Meter",1.0]]',
                coordinate_format="SAME_AS_INPUT"
            )
            print(f"Areas calculated for {fc_name}")

            # Remove the last part of the URL
            modified_url = remove_last_part(featureservice)

            # Access the feature layer collection
            flc = FeatureLayerCollection(modified_url, gis)

            # Disable editor tracking
            disable_params = {
                "editorTrackingInfo": {
                    "enableEditorTracking": False
                }
            }
            flc.manager.update_definition(disable_params)
            print(f"Editor tracking disabled for {fc_name}")

            # Create a dictionary from the exported table
            sourceFieldsList = ['GlobalID', 'AreaHa']
            valueDict = {r[0]: r[1] for r in arcpy.da.SearchCursor(temp_Areas, sourceFieldsList)}
            print(valueDict)

            # Update the feature service using the dictionary
            updateFieldsList = ['GlobalID', 'AreaHa'] 
            with arcpy.da.UpdateCursor(featureservice, updateFieldsList) as updateRows:
                for updateRow in updateRows:
                    print(updateRow)
                    # store the Join value of the row being updated in a keyValue variable
                    keyValue = updateRow[0]
                    # verify that the keyValue is in the Dictionary
                    if keyValue in valueDict:
                        print(keyValue)
                        if list(valueDict[keyValue]) and updateRow[1:1]:
                            # A newer value exists
                            changeCnt += 1
                            for n in range (1,len(sourceFieldsList)):
                                print(n)
                                updateRow[n] = valueDict[keyValue][n-1]
                            updateRows.updateRow(updateRow)
                        # transfer the values stored under the keyValue from the dictionary to the updated fields.

            del valueDict
            print(f"AreaHa field updated successfully for {fc_name}")

            # Re-enable editor tracking
            enable_params = {
                "editorTrackingInfo": {
                    "enableEditorTracking": True,
                    "enableOwnershipAccessControl": True,
                    "allowOthersToUpdate": True,
                    "allowOthersToDelete": True
                }
            }
            flc.manager.update_definition(enable_params)
            print(f"Editor tracking re-enabled for {fc_name}")

            endTime = time.time()
            print(f"Script completed in {endTime - startTime} seconds.")

[screenshot: print output from the script]

RichardFairhurst
MVP Alum

You can change the case to all upper or all lower in both the dictionary and the cursor to force a match even if they are stored in different letter cases on disk.  This modification is similar to your previous code where you made the key the combination of two field values that are stored as two separate field values on disk.

valueDict = {r[0].upper(): r[1] for r in arcpy.da.SearchCursor(temp_Areas, sourceFieldsList)}

keyValue = updateRow[0].upper()

So one of the key benefits of this code is that tables that won't work with a standard attribute join can still be matched, and at a faster speed than a join and field calculation (which would also update all rows rather than just the rows with actual differences).
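As a minimal illustration (plain Python, no arcpy; the key and value data are hypothetical), a case-normalized dictionary key lets rows match even when the stored letter cases differ between the two tables:

```python
# Source rows as (join_key, value); cases differ from the target keys.
source_rows = [("abc-001", 12.5), ("ABC-002", 7.25)]
valueDict = {r[0].upper(): r[1] for r in source_rows}

# Normalizing both sides to upper case lets every row find its match
# without any join at all; unmatched keys are simply skipped.
target_keys = ["Abc-001", "abc-002", "xyz-999"]
matched = [valueDict[k.upper()] for k in target_keys if k.upper() in valueDict]
print(matched)  # [12.5, 7.25]
```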

LindsayRaabe_FPCWA
MVP Regular Contributor

Getting Exception: name 'upper' is not defined for the keyValue = upper(updateRow[0]) line. I've tried applying it this way as well, but still not working despite not throwing an exception: 

keyValue = (updateRow[0].upper)

[screenshot: script output]

LindsayRaabe_FPCWA
MVP Regular Contributor

I have no idea what is going on here, but it seems that something is not playing nice between ArcGIS Online and ArcGIS Pro/arcpy. In AGOL, the GlobalID is all lowercase:

[screenshot: GlobalID values displayed in lowercase in ArcGIS Online]

but the same feature in ArcGIS Pro is all uppercase:

[screenshot: the same GlobalID displayed in uppercase in ArcGIS Pro]

So I have no idea what to do now. Found this thread, but seems it's still not resolved. I wonder if this is what is causing my issues. 

RichardFairhurst
MVP Alum

I got the upper syntax wrong.  It is fixed in my previous post now.  The syntax is supposed to be:

text.upper()

You missed the parentheses at the end.
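The distinction matters in Python: `upper` is a method of the string object, so without the trailing parentheses you get the method itself rather than the uppercased text. A quick illustration:

```python
s = "f1a2b3c4"
print(s.upper())          # F1A2B3C4 - calling the method returns the text
print(callable(s.upper))  # True - without (), you only have the bound method
# upper(s) raises NameError, since upper is a string method, not a built-in.
```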

I don't know why the case is changing in the two environments.  I don't work with portal data very much, so someone with more experience will have to let you know if that behavior is normal or not.

 

Also this line of code makes no sense to me:

if list(valueDict[keyValue]) and updateRow[1:1]:

those are not logical true false test conditions.  What is this line supposed to do?

Rich

LindsayRaabe_FPCWA
MVP Regular Contributor

Woohoo! Progress. Thank you. Now on to the next bit... float object is not iterable....

[screenshot: TypeError traceback]

I've put in some print statements to debug - looks like this is the problem line: list(valueDict[keyValue])

RichardFairhurst
MVP Alum

list() converts an iterable object like a tuple to a list.  The valueDict[keyValue] evaluates as a float, which cannot be converted to a list; it can only be added or appended to an existing list.  Also, since valueDict[keyValue] is always a single float value, there is no benefit to enclosing it in a list over working with it directly as a float unless you were going to append multiple float values to the list.  This is how you would create a list variable that contains your float value:

myList = [valueDict[keyValue]]

You need to explain what you believe you are trying to accomplish by putting a float value into a list and why you think that is required for your code to accomplish its purpose.  I don't see any purpose for what you are doing here or have any idea what your end goal is with this portion of your code.
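The difference can be seen directly (the numbers below are hypothetical stand-ins for the dictionary contents):

```python
# A tuple is iterable, so list() can unpack it.
print(list((1.5, 2.5)))  # [1.5, 2.5]

# A bare float is not iterable, so list() raises TypeError.
try:
    list(3.75)
except TypeError as err:
    print(err)  # 'float' object is not iterable

# To wrap a single float in a list, use a list literal (square brackets).
print([3.75])  # [3.75]
```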

LindsayRaabe_FPCWA
MVP Regular Contributor

We're working beyond my knowledge base here, thus my inept adaptations of your original code. At that point, it should just be getting the keyValue that matches the updateRow and updating the associated Area value. 

RichardFairhurst
MVP Alum

If you only want area values that have changed to be updated you should use the if condition below:

if valueDict[keyValue] != updateRow[1]:

Potentially you would need an additional prior if clause to do an update to handle Null values.  Something like:

if valueDict[keyValue] != None and updateRow[1] == None:

Also you need to change the update assignment line from:

updateRow[n] = valueDict[keyValue][n-1]

to:

updateRow[n] = valueDict[keyValue]

This change is needed since your dictionary values returned are floats and not a tuple or list that requires the use of an index to retrieve a value from within it.  The code could be rewritten to be simplified further since your dictionary is not returning a list, but if you just make this change the code would work without an error.
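The underlying distinction is in how the dictionary was built. With a single value field per key the stored value is a bare float; with several value fields it is a tuple that must be indexed. A sketch with stand-in rows (no arcpy; the keys and values are hypothetical):

```python
rows = [("G-1", 10.0, 99), ("G-2", 20.0, 98)]  # stand-in cursor rows

# One value field per key: the stored value is a plain float.
scalarDict = {r[0]: r[1] for r in rows}
print(scalarDict["G-1"])    # 10.0 - use it directly, no index

# Several value fields per key: the stored value is a tuple.
tupleDict = {r[0]: r[1:] for r in rows}
print(tupleDict["G-1"][0])  # 10.0 - must index into the tuple
```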

If that does not accomplish your goal, please try to explain the final result you would like to achieve in more detail.

LindsayRaabe_FPCWA
MVP Regular Contributor

You got it! I implemented the 3 changes you suggested, and it worked for rows that had null area values, but not for incorrect ones. I then removed the 2nd suggestion (if valueDict[keyValue] != None and updateRow[1] == None:) and it worked for all missing and incorrect values.

It also only "edited" the features requiring an update an nothing else (though I've now implemented disabling and enabling of editor tracking in the script so this is less important from an editor tracking perspective, but I'm sure is still a huge performance boon). 

            print("Updating data")
            with arcpy.da.UpdateCursor(featureservice, updateFieldsList) as updateRows:
                for updateRow in updateRows:
                    print(updateRow)
                    # store the Join value of the row being updated in a keyValue variable
                    keyValue = updateRow[0].upper()
                    # verify that the keyValue is in the Dictionary
                    if keyValue in valueDict:
                        print(f"KeyValue found: {keyValue}")
                    
                        if valueDict[keyValue] != updateRow[1]:
                            print("Yep")
                            # A newer value exists
                            for n in range (1,len(sourceFieldsList)):
                                print(n)
                                updateRow[n] = valueDict[keyValue]
                            updateRows.updateRow(updateRow)
                        # transfer the values stored under the keyValue from the dictionary to the updated fields.

            del valueDict
            print(f"AreaHa field updated successfully for {fc_name}")

 Thank you again for your assistance. It's been invaluable. 

AnninaRupe1
Frequent Contributor

@RichardFairhurst : Thank you for such a great resource. I've been able to implement a solution using "Example 3 - Transfer of Multiple Field Values between Feature Classes involving Complex Field Manipulations" and am trying to understand a part of it:

In line 23, updateRow[1] refers to the URL field ("UrlField1") of the update feature class and valueDict[keyValue][0] is the value field ("ValueField") of the source feature class. I'm wondering why the valueDict[keyValue] is [0] and not [1]? In the sourceFieldsList, "ValueField" is in the second position and since Python starts from 0, shouldn't that value be [1]? The "UrlField1" is in the second position in the updateFieldsList, so I expect it to be [1]. 

Obviously your code works, I'm just trying to learn! I'm sharing this with my colleagues because it's been so helpful!

BlakeTerhune
MVP Regular Contributor

@AnninaRupe1, I think this mismatch is due to the restructuring of the source data when creating valueDict on line 12. There, you can see that the key (r[0]) is the first field ("JoinField") and then the value is a list (converted to a tuple) of all the other fields, starting with "ValueField." So when you get to line 23 to start looking at valueDict to get what's in the "ValueField", it will be in the first position (index 0) of the tuple.

In other words, valueDict[keyValue] will return a tuple like ("ValueField", "NumberField1", "NumberField2", "CapitalizedField")
So valueDict[keyValue][0] will give you "ValueField"

It is a little confusing because things get misaligned by creating the valueDict with the fields split up. If you really wanted, you could generate valueDict and repeat the "JoinField" in the tuple so the indexes match the updateFieldsList.

valueDict = {r[0]:(r) for r in arcpy.da.SearchCursor(sourceFC, sourceFieldsList)}
 
Then you would have to shift all the indexes on valueDict up by 1 (lines 23 - 29). The new line 24 would be:
updateRow[1] = r"http://pic.asrclkrec.com/view_ParcelMaps.aspx?ParcelNumber=" + str(valueDict[keyValue][1])
RichardFairhurst
MVP Alum

@AnninaRupe1 Thank you for the question and trying to understand the concepts taught in this blog better.

Constructing a dictionary from a list can be done in many ways, depending on how you want to process the list data in your code.  In the case of my code I will try to break the dictionary construction down more and explain why I chose to do it the way I did.  Here is the code from example 3 that built the dictionary.

valueDict = {r[0]:(r[1:]) for r in arcpy.da.SearchCursor(sourceFC, sourceFieldsList)}

The dictionary key value is the field value stored under the first field name in the field list: r[0]. 

The dictionary value stored under that particular key is a tuple () containing all of the row's field values except the first one, since (r[1:]) slices the row starting at index 1 of the field names list rather than index 0.

The reason for this is that I only use the first field as a matching key value, and not as a data transfer value in the rest of my code.  I saw no need to include code to transfer and overwrite the key into the field in the matched record in the target data that by definition must already contain the matched key value before a transfer can take place.

valueDict[keyValue] returns the tuple associated with the key, which contains only the values that actually need to be written to the matched record.  The consequence of my choice to exclude the key value from the tuple is that the tuple indexes are shifted left by 1 relative to the original field list.  Not overwriting the key value in the target record also makes the code faster, since writing data is the slowest part of the code and you want to avoid every unnecessary data write you possibly can.

The data transfer part of the code makes the necessary index shift relative to the original field name list indexing to keep everything aligned.

There are alternative ways the code could have accomplished the same thing, but you would have to adjust the indexing shown in my example in both the dictionary construction and the data transfer part of the code for it to work and still avoid the unnecessary overwrite of the key value that my code intentionally avoids.
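The -1 index shift can be demonstrated with a stand-in row tuple in place of the cursor (field names borrowed from the example; the values are hypothetical):

```python
sourceFieldsList = ["JoinField", "ValueField", "NumberField1"]
row = ("KEY-1", "http://example.com/1", 42)  # one row, same order as the list

# row[0] becomes the key and row[1:] drops it from the stored tuple, so
# every stored index is shifted left by one relative to the field list.
valueDict = {row[0]: row[1:]}

# "ValueField" sits at index 1 of the field list but index 0 of the tuple.
print(valueDict["KEY-1"][0])  # http://example.com/1
print(valueDict["KEY-1"][1])  # 42
```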

 

List manipulation is a topic unto itself and is largely driven by the goals of your code, which must consider and balance valid logic, speed, understandability, elegance and efficiency.  Choosing, understanding and applying at least one form of internally consistent and valid logic and code syntax that accomplishes your goal is always essential before you try to adjust the logic or syntax to optimize and improve the code for the other 4 code factors.  Hopefully, this post is useful in helping you better understand the choices driving the way I wrote my code.  If you believe you have discovered ways to improve my code relative to the other factors, feel free to offer working code examples here.

AnninaRupe1
Frequent Contributor

That totally makes sense now! Appreciate the explanation.

I did a quick search but couldn't find anything: do you do blog posts regularly? I'd be interested in reading those, if you can provide the location. Your explanations are so well constructed; I can understand the logic much better than any other Python source I've come across. 

RichardFairhurst
MVP Alum

@AnninaRupe1 Thank you for sharing your appreciation of my efforts to help people make the connection between the key Python concepts I'm describing and the handful of lines of sometimes densely compacted code I've presented.  It also helps me to spell out why it was done that way, to see whether it has really stood up over time or could benefit from an update with things I have learned since I originally wrote this blog.

I can only recall one other Python Blog I wrote called:

I've Saved Time in a Bottle. How Do I Get it Back Out? - Doing More with Date Fields Using the Field Calculator and Python

 

The Turbo Charging Data Manipulation blog remains the single most useful coding concept I have repeatedly applied and adapted to solve problems I have encountered in my own career.  And I am grateful to see that it has stood the test of time and continues to be a resource for many other people.

 
