Calls to arcpy.management tools get progressively slower as they are called multiple times within a script

12-08-2021 03:16 PM
by Anonymous User
Not applicable

I have a Python script using ArcPy. This script iterates over a set of ObjectIDs from a feature class ("Blocks") and calls a function to perform some calculations:

import arcpy
import datetime

def get_block_metrics(block_objectid):
    # retrieve the points that lie within the block
    # points and blocks are in different Enterprise geodatabases
    block = arcpy.management.SelectLayerByAttribute(BLOCKS_FEATURE_CLASS, "NEW_SELECTION", "OBJECTID = {0}".format(block_objectid), None)
    points_in_block = arcpy.management.SelectLayerByLocation(points_layer, "INTERSECT", block, None, "NEW_SELECTION", "NOT_INVERT")

    # return if there are no points within the block
    if int(arcpy.management.GetCount(points_in_block)[0]) == 0:
        del points_in_block
        return

    block_metrics = {}

    # calculate some statistics for the block from the points
    # this includes creating a summary statistics table on points_in_block
    # also opening a search cursor on points_in_block and retrieving values

    return block_metrics

 

Initially, this function will be called approximately 10,000 times. After that, the script will run once per day, and each run will call this function on the order of 100 times.

The first time this function is called, the calls to arcpy.management.SelectLayerByAttribute() and arcpy.management.SelectLayerByLocation() take a total of less than 1 second to complete.
The 100th time this method is called, the calls to these tools take a combined time of more than 5 seconds.
The 200th time this method is called, the calls to these tools take a combined time of more than 12 seconds.

After this function returns, some insert and update cursors are opened on tables in the same enterprise geodatabase as the Blocks feature class, but their performance is stable across iterations.
Why does the time taken to make these calls increase as additional calls are made? How can this be avoided?
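
For anyone wanting to reproduce the measurements above, here is a minimal timing sketch. It uses only the standard library; `timed_call` and `slowdown_ratio` are hypothetical helper names, not part of ArcPy, and the window size of 10 is an arbitrary assumption.

```python
import time


def timed_call(func, *args, **kwargs):
    """Run func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def slowdown_ratio(durations, window=10):
    """Ratio of the mean of the last `window` durations to the first `window`.

    Returns None when there are too few samples to compare.
    """
    if len(durations) < 2 * window:
        return None
    first = sum(durations[:window]) / window
    last = sum(durations[-window:]) / window
    return last / first if first > 0 else None
```

In the loop, something like `metrics, seconds = timed_call(get_block_metrics, oid)` followed by appending `seconds` to a list would give the per-call durations; a ratio well above 1 indicates the calls are slowing down over time.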


Accepted Solutions
GKmieliauskas
Esri Regular Contributor

Hi,

In one of our projects we found that this kind of performance issue can come from passing a feature class instead of a feature layer to SelectLayerByLocation when it is called inside a calculation loop. You need to make a feature layer from the feature class before the loop starts.

The data type of in_layer for SelectLayerByLocation is supposed to be a feature layer, but the tool also works with a feature class.


8 Replies
DanPatterson
MVP Esteemed Contributor

@KoryKramer has lots of suggestions in

Troubleshooting Performance Issues in ArcGIS Pro - Esri Community

key points

  • spatial and attribute indices 
  • check spatial references are the same  
  • metadata history
  • geoprocessing history
  • use in_memory

see the Geoprocessing Check specifically.

And are you monitoring memory consumption as this happens?


... sort of retired...
IhabHassan
Esri Contributor

@Anonymous User I recently experienced something very similar, but with the Select By Location tool.

I wrote a script that processes polylines by running a couple of "Select By Location" calls that intersect them with other point and polyline feature classes. Running the code in the default Pro Python environment, the script struggles to process 230 records in 5 minutes.

Running the same code with ArcGIS Desktop 64-bit Python, it handles 500+ records in 3 minutes, so I eventually used the ArcGIS Desktop Python to finish the task.

I am using Pro 2.8.2 and ArcGIS Desktop 10.8.1.
Thanks for posting about this; I had been meaning to post about it as well. Hopefully someone can pinpoint what the issue is.

Regards
Ihab
Luke_Pinner
MVP Regular Contributor

Troubleshooting Performance Issues in ArcGIS Pro makes me think turning off geoprocessing history logging might help.  

Try:

arcpy.SetLogMetadata(False)
arcpy.SetLogHistory(False)
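
If you would rather not leave logging off for the rest of the session, a sketch like the following could restore the previous settings afterwards. It assumes `arcpy.GetLogHistory()` and `arcpy.GetLogMetadata()` are available in your installed version; `logging_disabled` is a hypothetical helper name, and the arcpy module is passed in as an argument so the sketch does not import it itself.

```python
from contextlib import contextmanager


@contextmanager
def logging_disabled(arcpy_module):
    """Turn off history and metadata logging, then restore the old values."""
    # Assumption: both getters exist in the installed ArcGIS Pro version.
    previous_history = arcpy_module.GetLogHistory()
    previous_metadata = arcpy_module.GetLogMetadata()
    arcpy_module.SetLogHistory(False)
    arcpy_module.SetLogMetadata(False)
    try:
        yield
    finally:
        # Restore whatever was set before, even if the loop raised.
        arcpy_module.SetLogHistory(previous_history)
        arcpy_module.SetLogMetadata(previous_metadata)
```

Usage would look like `with logging_disabled(arcpy):` wrapped around the processing loop. Whether this helps with the slowdown described here is untested.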

 

GKmieliauskas
Esri Regular Contributor

Hi,

In one of our projects we found that this kind of performance issue can come from passing a feature class instead of a feature layer to SelectLayerByLocation when it is called inside a calculation loop. You need to make a feature layer from the feature class before the loop starts.

The data type of in_layer for SelectLayerByLocation is supposed to be a feature layer, but the tool also works with a feature class.

by Anonymous User
Not applicable

This was the correct solution. You can see in my code that the call to SelectLayerByAttribute was being passed BLOCKS_FEATURE_CLASS, which referenced a feature class, not a layer. I made a feature layer from the feature class, and now performance is consistent for each iteration.
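
A minimal sketch of that change, not the exact production code: the layers are created once, and every iteration reuses them. `build_block_selector` and the layer names `"blocks_lyr"` and `"points_lyr"` are hypothetical, and the `management` argument is assumed to be `arcpy.management` (it is passed in rather than imported so the sketch stays self-contained).

```python
def build_block_selector(management, blocks_source, points_source):
    """Create both feature layers once and return a function that reuses them."""
    # MakeFeatureLayer returns a Result; element 0 is the output layer.
    blocks_layer = management.MakeFeatureLayer(blocks_source, "blocks_lyr")[0]
    points_layer = management.MakeFeatureLayer(points_source, "points_lyr")[0]

    def count_points_in_block(block_objectid):
        # Select the single block on the existing layer (no new layer is made).
        where = "OBJECTID = {}".format(int(block_objectid))
        management.SelectLayerByAttribute(blocks_layer, "NEW_SELECTION", where)
        # Select the points that intersect the currently selected block.
        management.SelectLayerByLocation(
            points_layer, "INTERSECT", blocks_layer,
            selection_type="NEW_SELECTION",
        )
        # GetCount honors the selection on the layer.
        return int(management.GetCount(points_layer)[0])

    return count_points_in_block
```

Called as `count_points = build_block_selector(arcpy.management, BLOCKS_FEATURE_CLASS, points_source)` before the loop, then `count_points(oid)` per block, where `points_source` stands for whatever the points layer was built from.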

JoshuaBixby
MVP Esteemed Contributor

A minor point of clarification: the in_layer for Select Layer By Location used to require a layer, but Esri made a change several versions ago (I can't remember the specific version since it was quite a while back) so that the tool accepts either a layer or a feature class.  When a feature class is passed to the tool, it dynamically creates a feature layer, and that layer is what the tool returns.

>>> import arcpy
>>> fc =  # path to feature class
>>>
>>> sel_lyr = arcpy.management.SelectLayerByAttribute(fc, where_clause="ObjectID < 5")
>>> sel_lyr.getOutput(0)
<arcpy._mp.Layer object at 0x00000218EB275448>
>>> 

 

*I used Select Layer By Attribute to demonstrate because it was quicker, but both tools behave the same.

I think in this case the performance issue came from creating new layers on every call, rather than creating a layer or two once and reusing them.

DonMorrison1
Occasional Contributor III

I recently stumbled over this too, with SelectLayerByAttribute. I found that if I created a layer using MakeFeatureLayer and passed that into SelectLayerByAttribute instead of passing in the feature class, it ran many times faster.

by Anonymous User
Not applicable

Thank you everyone for your suggestions.
