Iterate selection and calculate field not working

ThomasColson · 09-22-2018

With the following, I am trying to select each polygon by its name, select points from another feature class that are within the selected polygon, then calculate a field using the same name that was used to select the polygon (by attributes). A spatial intersect, or any other method that results in creating a new FC as output and having to do joins won't work here, as this is (hopefully) to be an unattended nightly update script. It iterates through the polygons fine, but the result is that ALL points get the name of the last polygon.

import os
import errno
import arcpy
import sys  
import traceback  
from arcpy import env
update_feature_class = r'C:\Users\tcolson\Documents\ArcGIS\Projects\MyProject\MyProject.gdb\GRSM_RESEARCH_LOCATIONS'
county_feature_class = r'C:\Users\tcolson\Documents\ArcGIS\Projects\MyProject\MyProject.gdb\CountyBoundaries'

with arcpy.da.SearchCursor(county_feature_class,['SHAPE@','NAME']) as cursor:
    for row in cursor:
        print('Selecting '+row[1])
        expression = "NAME = '"+row[1]+"'"
        arcpy.SelectLayerByAttribute_management (county_feature_class, "NEW_SELECTION", expression)
        arcpy.SelectLayerByLocation_management(update_feature_class, "INTERSECT", county_feature_class)
        arcpy.CalculateField_management(update_feature_class, "COUNTY","'"+row[1]+"'", "PYTHON3")
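
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the selection tools above are given feature class paths, not layers. In ArcGIS Pro a selection lives on a layer, so CalculateField, when handed the path, would see no selection and write to every row on each pass, leaving the last polygon's name everywhere. A minimal sketch of a layer-based variant follows. The function names, the layer names "points_lyr" and "county_lyr", and the default field names are hypothetical, and the quote-doubling helper is an addition that is not in the original script.

```python
def name_expression(field, value):
    """Build a where clause, doubling single quotes so names like O'Brien work."""
    escaped = value.replace("'", "''")
    return "{} = '{}'".format(field, escaped)


def tag_points_by_county(point_fc, county_fc, name_field="NAME", target_field="COUNTY"):
    """Write each county's name to the points that intersect it (hypothetical helper)."""
    import arcpy  # imported here so name_expression can be used without ArcGIS

    # Selections apply to layers, so build one layer per feature class first.
    # The layer names are illustrative and assumed not to exist already.
    point_lyr = "points_lyr"
    county_lyr = "county_lyr"
    arcpy.management.MakeFeatureLayer(point_fc, point_lyr)
    arcpy.management.MakeFeatureLayer(county_fc, county_lyr)

    with arcpy.da.SearchCursor(county_fc, [name_field]) as cursor:
        for (name,) in cursor:
            arcpy.management.SelectLayerByAttribute(
                county_lyr, "NEW_SELECTION", name_expression(name_field, name))
            arcpy.management.SelectLayerByLocation(
                point_lyr, "INTERSECT", county_lyr, selection_type="NEW_SELECTION")
            # Defensive check: skip counties that selected no points.
            if arcpy.Describe(point_lyr).FIDSet:
                # repr() yields a quoted Python literal for the PYTHON3 expression.
                arcpy.management.CalculateField(
                    point_lyr, target_field, repr(name), "PYTHON3")
```

Because both later tools receive the layers, CalculateField only touches the points currently selected.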

BenjaminSperry1 · 09-26-2018

Hello Thomas,

Here is a simple script you can work from.

import arcpy

# Establish parameters
city_feature_class = arcpy.GetParameterAsText(0)  # A point fc (string)
county_feature_class = arcpy.GetParameterAsText(1)  # A polygon fc (string)

# Set up search cursor for county fc
with arcpy.da.SearchCursor(county_feature_class, ['SHAPE@', 'NAME']) as search_cursor:
    # Set up update cursor for point fc
    with arcpy.da.UpdateCursor(city_feature_class, ['SHAPE@', 'Test_County']) as cursor:

        # Iterate through counties
        for polygon in search_cursor:
            county = polygon[0]
            name = polygon[1]

            # Iterate through cities
            for row in cursor:
                point = row[0]
                if point.within(county):
                    row[1] = name
                    cursor.updateRow(row)

            cursor.reset()

del search_cursor
del cursor

I tested it by running the code to add county names to every city within the county and then compared it against the county field in the city feature class to make sure it ran correctly.

It is pretty good performance-wise. When I ran it against all the cities (1,015) in all the counties (67) in Florida, it ran in less than a minute. When I ran all the cities (38,186) in all the counties (3,142) in the entire US, it took just under an hour. If you are running it overnight, it shouldn't be an issue.
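
As a rough illustration of why the two runs differ so much: the nested loops in the script above perform one `within` test per county-point pair. The sketch below only does that arithmetic with the counts quoted in this post; `pair_tests` is a hypothetical helper name.

```python
def pair_tests(n_points, n_polygons):
    """Number of point-in-polygon tests made by a full nested loop."""
    return n_points * n_polygons


florida = pair_tests(1015, 67)    # cities x counties in Florida
usa = pair_tests(38186, 3142)     # cities x counties in the entire US

print(florida, usa)
print("US run does about", round(usa / florida), "times as many tests")
```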

Cheers,

Ben

RichardFairhurst · 09-26-2018

I would add a break after matching the first County name, since your code only stores one county name and there is no point in continuing to match county names after that. However, I think if you run my code you will find that resetting a cursor for every point is much slower than iterating over a dictionary or list that was loaded into memory by running the County cursor once. I believe the cursor reset is repeatedly accessing data on disk, which will be very time-consuming in a process whose cost grows with the product of the county and point counts. I would encourage you to try either of my last two code examples on your test data to see for yourself the difference that using in-memory data can make. Every optimization in the part of the code that is subject to that growth should be considered critical to making a process like this at all scalable.
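
The code examples referred to above are not reproduced in this thread, so the following is only a sketch of the general idea under stated assumptions, not that code. It reads the counties into a list once, then makes a single pass over the points and stops at the first matching county. The function names and the default field names "NAME" and "COUNTY" are hypothetical.

```python
def first_containing_name(point, polygons):
    """Return the name of the first polygon that contains point, or None."""
    for name, polygon in polygons:
        if point.within(polygon):
            return name  # stop at the first match; later counties are not tested
    return None


def update_county_names(point_fc, county_fc, name_field="NAME", target_field="COUNTY"):
    """Single pass over the points against an in-memory list of counties."""
    import arcpy  # imported here so the matching helper can be tested without ArcGIS

    # Run the county cursor once and keep names and geometries in memory.
    with arcpy.da.SearchCursor(county_fc, [name_field, "SHAPE@"]) as counties:
        polygons = [(name, shape) for name, shape in counties]

    # One update cursor over the points; no cursor reset is needed.
    with arcpy.da.UpdateCursor(point_fc, ["SHAPE@", target_field]) as points:
        for row in points:
            name = first_containing_name(row[0], polygons)
            if name is not None:
                row[1] = name
                points.updateRow(row)
```

The number of `within` tests is still bounded by points times counties in the worst case, but the county data is read from disk only once and most points exit the inner loop early.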

DanPatterson_Retired · 09-27-2018

From https://community.esri.com/thread/221620-iterate-selection-and-calculate-field-not-working#comment-8...

pip_arc

len(pnts)
2000

len(polys)
100

%timeit pip_arc(pnt_fc, poly_fc)
1.44 s ± 118 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

plus another second or so for ExtendTable

Why are the alternatives taking so long? Is it because of where they reside?

BenjaminSperry1 · 09-28-2018

Perhaps we are all barking up the wrong tree.

You could just run a spatial join of that day's new points against a persistent county fc, writing the output to in_memory and carrying over only the name field from the county fc.

Then append that output to a long-term points fc.

You could then programmatically delete all of that day's features from the new points fc, so you start from an empty dataset each day.

Roughly (I'm sure Richard will pick up on any errors):

import arcpy

# Establish parameters
new_features = #  A temporary feature class that is reset each day
county_feature_class = #  The county feature class used to get county names
historical_features = # A feature class you add new points to as they come along


county_points = arcpy.analysis.SpatialJoin(new_features, county_feature_class, "in_memory/County_points",
                                           "JOIN_ONE_TO_MANY", "KEEP_ALL", [fields you want to carry over],
                                           "INTERSECT", None, None)

arcpy.Append_management(county_points, historical_features, "NO_TEST",...)

When I tested it using GP tools within ArcGIS Pro, it took only 7 seconds total for all the cities and counties in the US.

ThomasColson · 10-13-2018

This worked as well, and very fast: https://community.esri.com/thread/221620-iterate-selection-and-calculate-field-not-working#comment-8...