
Use ArcPy to combine geoprocessing tools in a loop

2 weeks ago
DanHardwick
New Contributor II

I have been carrying out some manual analysis on a polygon dataset: selecting all polygons within 5km of the points in a feature layer, then populating a field on the selected records using Field Calculator. I now want to automate this with Python, as I have several feature layers to do this for.

This is what I have been running:

arcpy.management.SelectLayerByLocation(
in_layer="Waterbodies",
overlap_type="WITHIN_A_DISTANCE",
select_features="Offtaker_Utility",
search_distance="5 Kilometers",
selection_type="NEW_SELECTION",
invert_spatial_relationship="NOT_INVERT")

arcpy.management.CalculateField(
in_table="Waterbodies",
field="Utility_5km",
expression='"Yes"',
expression_type="PYTHON3",
code_block="",
field_type="TEXT",
enforce_domains="NO_ENFORCE_DOMAINS")

I could just continue to do this manually, or edit the script each time to change 'select_features' in SelectLayerByLocation and 'field' in CalculateField, but I want to automate this if possible. I am fairly new to ArcPy and haven't used it much in the last 2-3 years, so I am slowly relearning it and trying to automate some real-world work tasks with it.

I decided to do this in an ArcGIS Pro notebook, and so far I have:

  1. set the workspace to look at my geodatabase
  2. Used a function to list all feature classes
  3. Filtered those feature classes to show only the ones relevant for this analysis
  4. Carried out the first geoprocessing task of looping through each feature and selecting the relevant polygons

This is the code:

arcpy.env.workspace = r"my file path\Datscha_output.gdb"
feature_classes = arcpy.ListFeatureClasses()


filtered_fcs = [fc for fc in feature_classes if "Offtaker" in fc]
for fc in filtered_fcs:
    print (fc)
    print (filtered_fcs)

for fc in filtered_fcs:
    arcpy.management.SelectLayerByLocation(
    in_layer="Waterbodies_Definitive_Version",
    overlap_type="WITHIN_A_DISTANCE",
    select_features=fc,
    search_distance="5 Kilometers",
    selection_type="NEW_SELECTION",
    invert_spatial_relationship="NOT_INVERT"
)

This all works, but I also need to incorporate another geoprocessing task: calculating a field. This is where I am a bit stuck. I think I need to include the CalculateField function in the above loop, as I need to tell the script to select the polygons near the points in Offtaker_a, calculate the relevant field in the polygon dataset, then loop on to the next point feature class, Offtaker_b, and so on.

The calculate field function has a parameter for what field to update, so I presume I must need to create a variable to look at each field in the table, or similar?

6 Replies
BobBooth1
Esri Contributor

Hi Dan!

If you know the field names already you don't need to get a list of them.

Here are some resources that may be useful.

https://pro.arcgis.com/en/pro-app/latest/tool-reference/data-management/calculate-field.htm

https://pro.arcgis.com/en/pro-app/latest/tool-reference/data-management/calculate-field-examples.htm

One good way to get code to start from is to run the geoprocessing tool, then go to the Geoprocessing History (click the Analysis tab and click History)

 https://pro.arcgis.com/en/pro-app/latest/help/analysis/geoprocessing/basics/geoprocessing-history.ht...

In the History, right-click the run of the tool and copy Python Command.

Paste that snippet into an editor (Notebook, IDLE, Python Window) to see the syntax.

This series of tutorials may also be useful:

https://learn.arcgis.com/en/paths/learn-python-in-arcgis-pro/

DanHardwick
New Contributor II

Hi Bob, many thanks for the response. 

Those resources are useful, especially seeing what parameters can be used within each function. I often use the Geoprocessing history, which helps a lot! 

I am still not quite sure how to write my script so that it uses the SelectLayerByLocation function and then runs that selection through the CalculateField function for the relevant field. In the code snippet above I have looped through SelectLayerByLocation for each of my datasets, which selects all the relevant polygons, but I need to break this down: loop through a single dataset, use Field Calculator for the relevant field, then move on to the next dataset and repeat. I am sure it is quite straightforward to do, but I haven't quite worked it out yet!

BobBooth1
Esri Contributor

Dan,

Within the same loop where you run SelectLayerByLocation, you can do other operations, such as Calculate Field. Calculate Field respects selections on a layer, so only the selected records are updated.

Just bring that last closing parenthesis up to close the second to last line 

    invert_spatial_relationship="NOT_INVERT")

and then add a line at the same indentation level (4 spaces) as the rest of the code in the loop to do the field calculation.

This tutorial shows describing feature classes and processing them in a loop:

https://learn.arcgis.com/en/projects/automate-a-geoprocessing-workflow-with-python/

I'm not sure what values you want to calculate into the field, but if you do it manually once and copy the Python syntax for doing it, that will let you add that as a line within your loop. You'll have to edit it to use your variables, for example, use fc instead of the feature class name, but I'm presuming the field name doesn't change. If it does, there are ways to deal with that. 

 

DanHardwick
New Contributor II

Many thanks Bob. 

I have amended my code to include the calculate field operation, but as you mentioned, my field does actually change. 

Essentially, what I am doing is selecting all of the polygons within 5km of a point dataset, 'Offtaker_Transport', then using Field Calculator to calculate the 'Transport_5km' field. Then I need to do the same for a number of other point datasets and their relevant fields. I think my code as it is now will loop through all of my point datasets (offtakers) but only ever update the 'Transport_5km' field:

arcpy.env.workspace = r"my file path\Datscha_output.gdb"
feature_classes = arcpy.ListFeatureClasses()


filtered_fcs = [fc for fc in feature_classes if "Offtaker" in fc]
for fc in filtered_fcs:
    print (fc)
    print (filtered_fcs)
for fc in filtered_fcs:
    arcpy.management.SelectLayerByLocation(
        in_layer="Waterbodies_Definitive_Version",
        overlap_type="WITHIN_A_DISTANCE",
        select_features=fc,
        search_distance="5 Kilometers",
        selection_type="NEW_SELECTION",
        invert_spatial_relationship="NOT_INVERT")
    arcpy.management.CalculateField(
        in_table="Waterbodies_Definitive_Version",
        field="Transport_5km",
        expression='"Yes"',
        expression_type="PYTHON3",
        code_block="",
        field_type="TEXT",
        enforce_domains="NO_ENFORCE_DOMAINS")

 

I think the next thing to do is possibly to list all of the fields in my polygon data (waterbodies), and tell the CalculateField operation to update only the relevant field. I have managed to list all of the fields in my dataset:

fieldlist = arcpy.ListFields (r"My file path location.gdb")

for field in fieldlist:
    print(f"{field.name}")

But I am not sure how to incorporate this into the loop to update the relevant field. I tried a couple of things, like putting the returned fields into a list and calling the relevant field by its index, but had no luck, and I am not sure this is the most efficient way of doing it. Any help is greatly appreciated. Thank you.

DanHardwick
New Contributor II

Just an update on this: along with the code above, I have created a variable for the relevant fields:

Waterbody_featureclass = "my file location/.gdb/Waterbodies"
field_names = [f.name for f in arcpy.ListFields(Waterbody_featureclass)]
field_names_query = field_names [-6:]
print (field_names_query)
['Education_5km', 'Hospital_5k', 'Industrial_5km', 'Leisure_5km', 'Transport_5km', 'Utility_5km']

 

 

 

I believe I can incorporate this into the 'CalculateField' operation I posted above, to use field_names_query for the field, but I think it will loop through the fields in a different order to the SelectLayerByLocation operation, meaning the wrong field will be selected and updated. 

Is there a way to order the variable lists I have? Is this the right way to be going about this? I want to tell the script to select the waterbodies which are within 5km of for example 'offtaker_education' points, then update the relevant waterbody 'Education_5km' field. 

This is the full code I am using:

 

 

arcpy.env.workspace = r"my .gdb location"
feature_classes = arcpy.ListFeatureClasses()
print (feature_classes)

['Properties_Utility', 'PROPERTIES_RATING_JOIN', 'Datscha_Properties_TWUL', 'Properties_Severn_Trent', 'Properties_Yorkshire_Water_Services', 'PROPERTIES_ADDRESS_JOIN_YORKSHIRE_WATER', 'PROPERTIES_ADDRESS_JOIN_SEVERN_TRENT', 'Offtaker_Education_College_University', 'Offtaker_Utility', 'Offtaker_Industrial', 'Offtaker_Leisure', 'Offtaker_Hospital_Hospice', 'Offtaker_Transport']

filtered_fcs = [fc for fc in feature_classes if "Offtaker" in fc]
for fc in filtered_fcs:
    print (fc)

Offtaker_Education_College_University
Offtaker_Utility
Offtaker_Industrial
Offtaker_Leisure
Offtaker_Hospital_Hospice
Offtaker_Transport

Waterbody_featureclass = "My file location"
field_names = [f.name for f in arcpy.ListFields(Waterbody_featureclass)]
field_names_query = field_names [-6:]

print (field_names_query)
['Education_5km', 'Hospital_5k', 'Industrial_5km', 'Leisure_5km', 'Transport_5km', 'Utility_5km']

for fc in filtered_fcs:
    arcpy.management.SelectLayerByLocation(
        in_layer="Waterbodies_Definitive_Version",
        overlap_type="WITHIN_A_DISTANCE",
        select_features=fc,
        search_distance="5 Kilometers",
        selection_type="NEW_SELECTION",
        invert_spatial_relationship="NOT_INVERT")
    arcpy.management.CalculateField(
        in_table="Waterbodies_Definitive_Version",
        field="field_names_query",
        expression='"Yes"',
        expression_type="PYTHON3",
        code_block="",
        field_type="TEXT",
        enforce_domains="NO_ENFORCE_DOMAINS")

 

I believe that, as the script stands, it will select and update the Education field correctly, then move on to selecting the Utility points and update the 'Hospital_5k' field, which is incorrect.
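To illustrate the ordering concern described above, here is a minimal sketch in plain Python, using the feature class and field names printed earlier in this thread. It compares pairing the two lists by position with pairing them by name. `field_for` is a hypothetical helper, not part of ArcPy, and it assumes the naming pattern seen here (the word after `Offtaker_` matches the word before `_5km` or `_5k`); it may not hold for other datasets.

```python
# Names copied from the printed output earlier in this thread.
filtered_fcs = [
    "Offtaker_Education_College_University",
    "Offtaker_Utility",
    "Offtaker_Industrial",
    "Offtaker_Leisure",
    "Offtaker_Hospital_Hospice",
    "Offtaker_Transport",
]
field_names_query = [
    "Education_5km", "Hospital_5k", "Industrial_5km",
    "Leisure_5km", "Transport_5km", "Utility_5km",
]


def field_for(fc_name, candidate_fields):
    """Return the one field whose first token matches the token after 'Offtaker_'.

    Hypothetical helper: it relies on the naming pattern seen in this thread.
    """
    keyword = fc_name.split("_")[1]  # e.g. "Education"
    matches = [f for f in candidate_fields if f.split("_")[0] == keyword]
    if len(matches) != 1:
        raise ValueError(f"Expected one field for {fc_name}, found {matches}")
    return matches[0]


# Pairing by position mismatches several items,
# for example Offtaker_Utility is paired with Hospital_5k.
by_position = dict(zip(filtered_fcs, field_names_query))

# Pairing by name keeps each feature class with its own field.
by_name = {fc: field_for(fc, field_names_query) for fc in filtered_fcs}

for fc in filtered_fcs:
    print(f"{fc}: by position -> {by_position[fc]}, by name -> {by_name[fc]}")
```

Matching on names avoids depending on the order in which the feature classes and fields happen to be listed, at the cost of depending on a consistent naming convention.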

 

BobBooth1
Esri Contributor

So, given that you know the field names and the feature class names, you could make a Python dictionary mapping from the feature class name to the field name:

https://www.w3schools.com/python/python_dictionaries.asp

field_map_dict = {
    "Offtaker_Education_College_University": "Education_5km",
    "Offtaker_Utility": "Utility_5km",
    "Offtaker_Industrial": "Industrial_5km",
    "Offtaker_Leisure": "Leisure_5km",
    "Offtaker_Hospital_Hospice": "Hospital_5k",
    "Offtaker_Transport": "Transport_5km"}

 

And then use something like this to call up the right value for the field to use in the calculation:

field_name = field_map_dict[fc]
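Putting the pieces of this thread together, a rough sketch of the whole loop might look like the following. It reuses the tool parameters from the posts above and the dictionary just shown. `build_jobs` and `flag_waterbodies` are hypothetical helper names introduced for illustration, and the sketch assumes `arcpy.env.workspace` is already set and that a layer named `Waterbodies_Definitive_Version` is available in the notebook's map. The `try`/`except` around the import is only there so the mapping logic can be checked outside ArcGIS Pro.

```python
try:
    import arcpy  # available inside the ArcGIS Pro Python environment
except ImportError:
    arcpy = None  # lets the mapping logic be checked outside ArcGIS Pro

# Dictionary from the reply above: point feature class -> field to update.
field_map_dict = {
    "Offtaker_Education_College_University": "Education_5km",
    "Offtaker_Utility": "Utility_5km",
    "Offtaker_Industrial": "Industrial_5km",
    "Offtaker_Leisure": "Leisure_5km",
    "Offtaker_Hospital_Hospice": "Hospital_5k",
    "Offtaker_Transport": "Transport_5km",
}


def build_jobs(feature_classes, field_map):
    """Pair each mapped feature class with its field; skip unmapped ones."""
    return [(fc, field_map[fc]) for fc in feature_classes if fc in field_map]


def flag_waterbodies(jobs, target_layer="Waterbodies_Definitive_Version"):
    """Hypothetical helper: select polygons near each point set, then flag them."""
    for fc, field_name in jobs:
        arcpy.management.SelectLayerByLocation(
            in_layer=target_layer,
            overlap_type="WITHIN_A_DISTANCE",
            select_features=fc,
            search_distance="5 Kilometers",
            selection_type="NEW_SELECTION",
        )
        # CalculateField only touches the currently selected records.
        arcpy.management.CalculateField(
            in_table=target_layer,
            field=field_name,
            expression='"Yes"',
            expression_type="PYTHON3",
        )
    # Clear the last selection so later tools see every record again.
    arcpy.management.SelectLayerByAttribute(target_layer, "CLEAR_SELECTION")


jobs = build_jobs(list(field_map_dict), field_map_dict)

if arcpy is not None:
    # Assumes arcpy.env.workspace has been set, as in the posts above.
    flag_waterbodies(jobs)
else:
    for fc, field_name in jobs:
        print(f"{fc} -> {field_name}")
```

Because the field is looked up by feature class name, the order in which `ListFeatureClasses` or `ListFields` returns items no longer matters. In the notebook, `build_jobs(filtered_fcs, field_map_dict)` could be used instead, so that only feature classes actually present in the geodatabase are processed.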
