Iterate selection and calculate field not working

ThomasColson · ‎09-22-2018

With the following, I am trying to select each polygon by its name, select points from another feature class that are within the selected polygon, then calculate a field using the same name that was used to select the polygon (by attributes). A spatial intersect, or any other method that results in creating a new FC as output and having to do joins won't work here, as this is (hopefully) to be an unattended nightly update script. It iterates through the polygons fine, but the result is that ALL points get the name of the last polygon.

import os
import errno
import arcpy
import sys  
import traceback  
from arcpy import env
update_feature_class = r'C:\Users\tcolson\Documents\ArcGIS\Projects\MyProject\MyProject.gdb\GRSM_RESEARCH_LOCATIONS'
county_feature_class = r'C:\Users\tcolson\Documents\ArcGIS\Projects\MyProject\MyProject.gdb\CountyBoundaries'

with arcpy.da.SearchCursor(county_feature_class,['SHAPE@','NAME']) as cursor:
    for row in cursor:
        print('Selecting '+row[1])
        expression = "NAME = '"+row[1]+"'"
        arcpy.SelectLayerByAttribute_management (county_feature_class, "NEW_SELECTION", expression)
        arcpy.SelectLayerByLocation_management(update_feature_class, "INTERSECT", county_feature_class)
        arcpy.CalculateField_management(update_feature_class, "COUNTY","'"+row[1]+"'", "PYTHON3")‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

MichaelVolz · ‎09-26-2018

If you use intersect to capture points on the border, wouldn't you then have to deal with 2 names from the 2 polygons that intersect the point that is on the border of each polygon?

DanPatterson_Retired · ‎09-26-2018

This gets more complicated all the time next it will have to be updated on a mobile device

MichaelVolz · ‎09-26-2018

Yep that's what happens when you have enterprise applications!!!

ThomasColson · ‎09-26-2018

It has to be synced with Carto.....just kidding!

ThomasColson · ‎10-13-2018

In response to https://community.esri.com/thread/221620-iterate-selection-and-calculate-field-not-working#comment-8... :

Now that you mention it, the updated values DO need to show up on mobile devices. These are what we call "administrative" boundaries. The field workers could care less what county, watershed, or jurisdiction something occurs in, but the managers can't function without it. They want to sit in their vehicle somewhere and call up "Show me all the sasquatch sightings in this county", or "how many 3 headed fish are in this watershed", or "how many pizza's did colson have delivered to this district". I've tried SQL triggers in the past, but the performance hit is too great, and asking data editors to attribute up to 10 administrative boundaries for each point will never happen. Hence a python solution where we can push a button and do it, or automate it.

RichardFairhurst · ‎09-26-2018

I removed that comment after examining the graphics for within and the description of intersect. It appears that within does include points that fall on the polygon boundary.

I did not deal with 2 names, only because your original code would not have done that, nor would Dan's. If you wanted a list when a point fell on the boundary shared by 2 or more counties the break would have to be removed and instead of assigning the name to the point in line 35 you would append values to a list and then update the point with a string concatenating them with a character separator and always update the point. The number of comparisons will grow exponentially as the number of polygons and points grows, whereas in my previous code I minimized the number of comparisons by stopping the county iteration after the first county was matched. In this version I also sorted the names in the dictionary into a list so I only have to sort the county names once. Then instead of iterating the dictionary I can iterate that list so that when 2 or more county names touch a point the county names will be in alphabetical order. The code was already eliminating duplicate county names, so only unique county names will be listed.

import arcpy

update_feature_class = r'C:\Users\tcolson\Documents\ArcGIS\Projects\MyProject\MyProject.gdb\GRSM_RESEARCH_LOCATIONS'
county_feature_class = r'C:\Users\tcolson\Documents\ArcGIS\Projects\MyProject\MyProject.gdb\CountyBoundaries'

county_count = 0
relateDict = {}
# Set up search cursor for county fc
with arcpy.da.SearchCursor(county_feature_class,['NAME','SHAPE@']) as search_cursor:

    # Iterate through counties
    for polygon in search_cursor:
        if not polygon[0] in relateDict:
            relateDict[polygon[0]] = polygon[1]
        else:
            relateDict[polygon[0]] = relateDict[polygon[0]].union(polygon[1])

county_names = []
for name in sorted(relateDict):
    county_names.append(name)
    county_count += 1

point_count = 0
comparison_count = 0
# Set up update cursor for point fc
with arcpy.da.UpdateCursor(update_feature_class, ['SHAPE@', 'COUNTY']) as update_cursor:

    # Iterate through points
    for point in update_cursor:
        point_count += 1
        name_list = []
        for name in county_names:
            comparison_count += 1
            county = relateDict[name]
            # Check if the current point is within the current polygon
            if point[0].within(county):  # Returns True or False
                # Add the county name to the list
                name_list.append(name)
        point[1] = ",".join(name_list)
        # Update the cursor with the new value
        update_cursor.updateRow(point)
print("Processed {0} County/Point comparisons for {1} Counties and {2} Points".format(comparison_count, county_count, point_count))‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

RichardFairhurst · ‎09-28-2018

Just in case Ben deletes his post for the 4th time causing my responses to be deleted in the process I am reposting my analysis of his code outside of a response to his post. Ben's code uses a search cursor for loop on the counties and an inner update cursor for loop on the points and uses the reset method on the update cursor at the end of the inner loop to make sure his code iterates all of the feature combinations.

I ran Ben's code and compared it to the code I wrote. There are several things that are critical to the way Ben's code is structured. The cursor.reset operation is time consuming, but he has minimized it by making sure that the outer loop contains the fewest features and the inner loop contains the most features. If the loops are reversed and the outer cursor contains the majority of the features, the constant repetition of the cursor reset operation in the inner loop will cause the code to slow dramatically. This is because the cursor reset is accessing the data on disk each time the cursor is reset and not caching the data in memory.

On a test where I had 31 polygons and 5991 points, Ben's original code was written so that the cursor only needed to be created or reset 31 times. That code took about 7 minutes and 32 seconds to complete. However, when I reversed the loops and processed the same set of features the cursor had to be created or reset 5,991 times. Even though I added a break to the inner loop to reduce the overall number of feature comparisons being performed by about 40%, this version of the code took 37 minutes and 20 seconds to complete. The cursor reset operation clearly had a huge negative impact on the code's performance, so eliminating this operation altogether is preferable.

My code does just that. With my code the loops can be in either order without impacting performance, since all of the features are read only once from disk. My code completes in about 7 minutes regardless of the order of the loops.

Also, a break cannot be added to Ben's original code, since the optimal order of the loops does not allow for it. With my code the loops are reversed and I can break out of the inner county name loop when I get a name match. That reduced the time the code took to run from about 7 minutes to only 4 minutes (although the amount of time saved will vary depending on set of features being processed). Since none of my points fell on a polygon boundary this change was able to significantly reduce the processing time without having any impact on the output, and the structure of my code allows me to take advantage of it.

ThomasColson · ‎10-13-2018

Two questions regarding https://community.esri.com/thread/221620-iterate-selection-and-calculate-field-not-working#comment-8...

How could I throw a select statement in there to only update points where the name is null?
This is probably fantasy, but how could this be modified to do a nearest neighbor? E.g I want to append "ROAD_NAME" in the point feature class with the name of the nearest road.

RichardFairhurst · ‎10-14-2018

For the first question you can limit the update points to those where the COUNTY name is null simply by adding a where clause to the UpdateCursor in my code:

with arcpy.da.UpdateCursor(update_feature_class, ['SHAPE@', 'COUNTY'], "COUNTY IS NULL") as update_cursor:‍‍‍‍

The Generate Near Table tool algorithm designed by Esri is bound to be much more efficient than any process I could develop using the distanceTo (other) method of Python geometry, since that is a fairly time consuming geometry operation. So assuming you have an Advanced license it would be easiest to run the Generate Near Tool using the default CLOSEST option and then build an attribute based dictionary from the near table result and your road features to obtain the road name to populate the points. An attribute based dictionary is incredibly fast to build and then use to perform matching between two tables and is primarily how I normally use dictionaries.

I believe the following code should work once you modify the feature class paths and names and field names to match your actual data.:

from time import strftime  
  
print("Script Start: " + strftime("%Y-%m-%d %H:%M:%S"))

import arcpy

update_feature_class = r'C:\Users\tcolson\Documents\ArcGIS\Projects\MyProject\MyProject.gdb\GRSM_RESEARCH_LOCATIONS'
road_feature_class = r'C:\Users\tcolson\Documents\ArcGIS\Projects\MyProject\MyProject.gdb\Roads'

‍‍‍‍‍arcpy.MakeFeatureLayer_management(update_feature_class,"update_lyr","ROAD_NAME IS NULL")
arcpy.MakeFeatureLayer_management(road_feature_class,"road_lyr")

print("Created Layers: " + strftime("%Y-%m-%d %H:%M:%S"))

out_table = "C:\Users\tcolson\Documents\ArcGIS\Projects\MyProject\MyProject.gdb\points_closet_road"
if arcpy.Exists(out_table):
  arcpy.Delete_management(out_table)
  print("Deleted Near Table: " + strftime("%Y-%m-%d %H:%M:%S"))

arcpy.GenerateNearTable_analysis("update_lyr", "road_lyr", out_table)
  
print("Generated Near Table": " + strftime("%Y-%m-%d %H:%M:%S"))

roadsFieldsList = ["OID@","ROAD_NAME"]
roadsDict = {r[0]:r[1] for r in arcpy.da.SearchCursor("road_lyr", roadsFieldsList)}

matchFieldsList = ["IN_FID", "OUT_FID"]
matchDict = {r[0]:roadsDict[r[1]] for r in arcpy.da.SearchCursor(out_table, matchFieldsList)}
‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍
updateFieldsList = ["OID@", "ROAD_NAME"]
‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍with arcpy.da.UpdateCursor("update_lyr", updateFieldsList) as updateRows:  
    for updateRow in updateRows:  
        keyValue = updateRow[0]  
        # verify that the keyValue is in the Dictionary  
        if keyValue in matchDict:  
             updateRow[1] = matchDict[keyValue]  
             updateRows.updateRow(updateRow)
  
print("Updated Features: " + strftime("%Y-%m-%d %H:%M:%S"))

del roadsDict
del matchDict  
print("Script Finish: " + strftime("%Y-%m-%d %H:%M:%S"))
‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

ThomasColson · ‎11-17-2018

Richard thanks for the help! I finally had some time to sit down and implement your solution, here's what I got to work with comments, so others can follow. I plan to bake this into a python toolbox and will share.

from time import strftime  
  
print("Script Start: " + strftime("%Y-%m-%d %H:%M:%S"))

import arcpy
#Feature class to be updated
update_feature_class = r'C:\Temp\NEAR\GRSM.gdb\Points_Of_Interest\AED_LOCATIONS'
#Feature class containing the attribute we want is nearest
road_feature_class = r'C:\Temp\NEAR\GRSM.gdb\Transportation\GRSM_ROADS'

arcpy.MakeFeatureLayer_management(update_feature_class,"update_lyr")
arcpy.MakeFeatureLayer_management(road_feature_class,"road_lyr")

print("Created Layers: " + strftime("%Y-%m-%d %H:%M:%S"))
#create a temporary table of near results
out_table = r'C:\Temp\NEAR\GRSM.gdb\AED_LOCATIONS_CLOSE'
if arcpy.Exists(out_table):
  arcpy.Delete_management(out_table)
  print("Deleted Near Table: " + strftime("%Y-%m-%d %H:%M:%S"))

arcpy.GenerateNearTable_analysis("update_lyr", "road_lyr", out_table)
  
print("Generated Near Table:" + strftime("%Y-%m-%d %H:%M:%S"))
#create a list of attributes from the near feature class where
#the second field is the attribute we want to know is the nearest
roadsFieldsList = ["OBJECTID","RDNAME"]
roadsDict = {r[0]:r[1] for r in arcpy.da.SearchCursor("road_lyr", roadsFieldsList)}

matchFieldsList = ["IN_FID", "NEAR_FID"]
matchDict = {r[0]:roadsDict[r[1]] for r in arcpy.da.SearchCursor(out_table, matchFieldsList)}
#write the nearest attribute to the update feature class
#where the second field is the attribute that is being updated with
#the name of the nearest feature
updateFieldsList = ["OBJECTID", "TEST"]
with arcpy.da.UpdateCursor("update_lyr", updateFieldsList) as updateRows:  
    for updateRow in updateRows:  
        keyValue = updateRow[0]  
        # verify that the keyValue is in the Dictionary  
        if keyValue in matchDict:  
             updateRow[1] = matchDict[keyValue]  
             updateRows.updateRow(updateRow)
  
print("Updated Features: " + strftime("%Y-%m-%d %H:%M:%S"))

del roadsDict
del matchDict  
print("Script Finish: " + strftime("%Y-%m-%d %H:%M:%S"))
‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍