
The woes of running geoprocessing tasks in a for loop via a .pyt file - Slowdowns no matter what

08-06-2024 03:35 PM
egooodwin_digit
Emerging Contributor

I have been creating a script based on the Production Mapping team's Batch Make Grids and Graticules script, adapted for use in ArcGIS Pro. Looping the geoprocessing has caused processing times to balloon, even though I'm working on a server with plenty of CPU and RAM (100+ GB).

In short, this .pyt script loops through a polygon feature class representing a grid, and for every tile in the grid (i.e., for each polygon in the feature class), it runs arcpy.topographic.MakeGridsAndGraticulesLayer() and then exports the resulting layer using arcpy.management.SaveToLayerFile(). The output is therefore a) the feature dataset containing the grid and graticule features, and b) a folder of .lyrx files, one per tile, that refer to the feature dataset.

The first 15 runs take an average of 10 seconds each to complete. By run 70, we are looking at ~30 seconds per run. This continues until it takes 10+ minutes per run after a few hundred iterations. I track the memory usage and it is increasing slowly: about 150 MB per 100 loops. The problem is, I need to run this process about 4000 times.

Below is the executable function of the .pyt tool. I've also attached the .pyt file in its entirety. You can see I've accounted for removing the layers in memory, garbage collection, turned off log history, etc. Something is still causing a slowdown, and I am at a loss. 

    def execute(self, parameters, messages):
        """The source code of the tool."""
        try:
            arcpy.SetLogHistory(False)
            arcpy.SetLogMetadata(False)
            arcpy.env.addOutputsToMap = False
            Grid_Template_XML_file = parameters[0].valueAsText
            input = parameters[1].valueAsText
            Input_Feature_Dataset = parameters[2].valueAsText
            Name_Field = parameters[3].valueAsText
            Rotation_Field = parameters[4].valueAsText
            Output_Folder = parameters[5].valueAsText
            aprx = arcpy.mp.ArcGISProject("CURRENT")
            map_obj = aprx.listMaps()[0]
            
            # Local variables
            inDesc = arcpy.Describe(input)
            FieldName = inDesc.shapeFieldName
            feature_cnt = int(arcpy.management.GetCount(input)[0])


            messages.addMessage("\n*****************************")
            messages.addMessage("No. of Features selected = " + str(feature_cnt))
            messages.addMessage("*****************************")

            String_Array = os.path.splitext(os.path.basename(Grid_Template_XML_file))[0]
            failed_features = []

            feature_cntr = 0
            
            # Check that the input has features
            if feature_cnt > 0:
                # Open a Search Cursor to iterate through the AOI feature class/layer and create grids
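                # Include the rotation field only when one was supplied; otherwise row[2] below would raise an IndexError
                with arcpy.da.SearchCursor(input, ["SHAPE@", Name_Field] + ([Rotation_Field] if Rotation_Field else [])) as cursor: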
                    for row in cursor:
                        feature_cntr += 1
                        AOI_Feature = row[0]
                        Grid_Name = row[1]
                        Rotation = row[2] if Rotation_Field else None
                        # Set the step progressor
                        arcpy.SetProgressor("step", f"Creating grids and graticules for feature with MSNAME = {Grid_Name}", 0, feature_cnt, 1)
                        arcpy.SetProgressorPosition(feature_cntr)
                        start_time = time.time()
                        # Concatenate the grid's name to the output layer's name for unique identification
                        Output_Layer = f"memory\\{String_Array}_{Grid_Name}"
                        try:
                            if Rotation_Field and Rotation is not None:
                                arcpy.topographic.MakeGridsAndGraticulesLayer(in_grid_xml=Grid_Template_XML_file, area_of_interest=AOI_Feature, target_feature_dataset=Input_Feature_Dataset, out_layer_name=Output_Layer, grid_name=Grid_Name, Rotation=Rotation)
                            else:
                                arcpy.topographic.MakeGridsAndGraticulesLayer(in_grid_xml=Grid_Template_XML_file, area_of_interest=AOI_Feature, target_feature_dataset=Input_Feature_Dataset, out_layer_name=Output_Layer, grid_name=Grid_Name)

                            if Output_Folder:
                                arcpy.management.SaveToLayerFile(Output_Layer, Output_Folder + "\\" + f"{String_Array}_{Grid_Name}", "ABSOLUTE")

                            arcpy.management.Delete(Output_Layer)
                            arcpy.management.Delete("memory")
                            layers = map_obj.listLayers()
                            for layer in layers:
                                map_obj.removeLayer(layer)
                            gc.collect()
                        except Exception as e:
                            messages.addMessage("There was an error with this tile")
                            messages.addErrorMessage(str(e))
                            failed_features.append(Grid_Name)
                        # Log memory usage
                        process = psutil.Process(os.getpid())
                        mem_info = process.memory_info()
                        messages.addMessage(f"Memory usage after processing feature {Grid_Name}: {mem_info.rss / (1024 * 1024):.2f} MB")
                        end_time = time.time()
                        elapsed_time = end_time - start_time
                        messages.addMessage(f"Time taken for feature {Grid_Name}: {elapsed_time:.2f} seconds")
                messages.addMessage(f"\nTotal Warnings: {len(failed_features)}")
                messages.addMessage(f"Grid creation failed for features: {failed_features}")
            else:
                raise No_Features(feature_cnt)
        except No_Features as e:
            messages.addMessage("The input has no features")

        arcpy.ResetProgressor()
        if not failed_features:
            if Output_Folder:
                messages.addMessage(f"Grid creation successful for all features. Layers can be readded from : {Output_Folder}")
            else:
                messages.addMessage("Grid creation successful for all features. Please use the Add Grid Data tool in Production Mapping to add the outputs to your map")
        messages.addMessage("Script Complete")

 

0 Kudos
5 Replies
egooodwin_digit
Emerging Contributor

Didn't attach the .pyt file but if anyone would like to see the entire file, I am happy to send it to someone. 

0 Kudos
TonyAlmeida
Frequent Contributor

A few suggestions.

1. Clean up after each iteration:

    arcpy.management.Delete(Output_Layer)
    arcpy.management.Delete("memory\\*")  # Ensure memory is cleared

2. Instead of removing each layer, you could replace or overwrite them.

3. After a set number of iterations (100 or whatever), release all objects, reset the aprx, and clear memory.

4. Use arcpy.env.overwriteOutput = True to reduce unnecessary file handling.
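
Suggestion 3 could be sketched roughly as follows. This is only a sketch: `chunked`, `process_in_batches`, `process_tile`, `cleanup`, and `default_cleanup` are hypothetical names, and what the cleanup hook should actually do inside ArcGIS Pro is an assumption that has not been shown to fix the slowdown.

```python
import gc
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def process_in_batches(
    items: Sequence[T],
    process_tile: Callable[[T], None],
    cleanup: Callable[[], None],
    batch_size: int = 100,
) -> List[T]:
    """Run `process_tile` on every item and call `cleanup` after each batch.

    Returns the items for which `process_tile` raised an exception.
    """
    failed: List[T] = []
    for batch in chunked(items, batch_size):
        for item in batch:
            try:
                process_tile(item)
            except Exception:
                failed.append(item)
        # Release whatever accumulated during this batch before starting the next one.
        cleanup()
    return failed


def default_cleanup() -> None:
    # Hypothetical hook. In the .pyt tool this is where calls such as
    # arcpy.management.Delete("memory") could go (an assumption, not tested here).
    gc.collect()
```

In the tool, `process_tile` would wrap the MakeGridsAndGraticulesLayer and SaveToLayerFile calls for one tile, so the loop body stays unchanged and only the cleanup cadence moves out of it.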

0 Kudos
egooodwin_digit
Emerging Contributor

Thanks for the suggestions, Tony. I identified the bottleneck as the step in the geoprocessing tool that appends new features to the feature classes in the feature dataset. I did this by running a few hundred iterations on one feature dataset, and then running a few hundred iterations on a new, empty feature dataset. Switching to an empty one brought the processing times back down. When I then reset the kernel and project and tried to run another set of iterations on a feature dataset that already had data in it, each run immediately took a long time. So, I believe it has to do with there already being hundreds or thousands of features inside the feature dataset. Because the tool is proprietary to ESRI, I cannot diagnose the slowdown occurring inside the tool's code.
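
A comparison like this can be summarised from the per-tile timings the tool already logs. The helper below is a minimal sketch with no arcpy calls; `slowdown_ratio` and its `window` parameter are hypothetical names, and the timings in the usage lines are made-up values chosen only to illustrate the calculation.

```python
from statistics import mean
from typing import Sequence


def slowdown_ratio(elapsed: Sequence[float], window: int = 15) -> float:
    """Return the mean of the last `window` timings divided by the mean of the first `window`.

    A value near 1.0 means the runs are not getting slower; a value well above 1.0
    means later runs take longer than earlier ones.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    if len(elapsed) < 2 * window:
        raise ValueError("need at least 2 * window timings")
    return mean(elapsed[-window:]) / mean(elapsed[:window])


# Illustrative (made-up) timings in seconds for two runs of 30 tiles each.
populated_dataset = [10.0] * 15 + [30.0] * 15
empty_dataset = [10.0] * 30

print(slowdown_ratio(populated_dataset))  # 3.0
print(slowdown_ratio(empty_dataset))      # 1.0
```

Running the same number of tiles against a populated and an empty feature dataset and comparing the two ratios gives a single number per run instead of a visual read of the log.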

0 Kudos
HaydenWelch
Frequent Contributor

I tried to reduce your code down as much as possible to make it easier to see where the memory usage is coming from. I also moved the operation out of the SearchCursor and instead build a list of records to iterate over. It's generally bad practice to do anything in a cursor loop beyond collecting, inserting, updating, or deleting records. I also tried using `del` on the result object of the arcpy function calls. I'm not sure how long a reference to them is kept (if running in Pro, there could be a reference stored in some memory object, but I'm not sure). Calling `del` should at least drop the script's own reference, though.

import arcpy
import os

class NoFeatures(Exception):
    pass

def execute(self, parameters, messages):
    """The source code of the tool."""
    Grid_Template_XML_file = parameters[0].valueAsText
    input = parameters[1].valueAsText
    Input_Feature_Dataset = parameters[2].valueAsText
    Name_Field = parameters[3].valueAsText
    Rotation_Field = parameters[4].valueAsText
    Output_Folder = parameters[5].valueAsText
    
    fields = ["SHAPE@", Name_Field, Rotation_Field] if Rotation_Field else ["SHAPE@", Name_Field]
    
    # Local variables
    feature_cnt = int(arcpy.management.GetCount(input)[0])
    String_Array = os.path.splitext(os.path.basename(Grid_Template_XML_file))[0]
    failed_features = []
        
    # Check that the input has features
    if int(feature_cnt) == 0:
        arcpy.AddError("The input has no features")
        raise NoFeatures("The input has no features")
    
    # Read all records into a list up front so no geoprocessing runs inside the cursor loop
    feature_grids: list[tuple] = \
        [
            (
                row[0],                             # AOI Feature                    
                row[1],                             # Grid Name           
                row[2] if Rotation_Field else None, # Rotation
            )
            for row in arcpy.da.SearchCursor(input, fields)
        ]
    
        
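    # Configure the step progressor once; calling SetProgressor inside the loop would reset it on every iteration
    arcpy.SetProgressor("step", "Creating grids and graticules", 0, feature_cnt, 1)

    # Use enumerate to track the position instead of a manual counter
    for idx, (AOI_Feature, Grid_Name, Rotation) in enumerate(feature_grids, start=1):
        arcpy.SetProgressorLabel(f"Creating grids and graticules for feature with MSNAME = {Grid_Name}")
        arcpy.SetProgressorPosition(idx)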
        Output_Layer = f"memory\\{String_Array}_{Grid_Name}"
        
        try:
            # Use EnvManager to set overwriteOutput to True and addOutputsToMap to False    
            with arcpy.EnvManager(overwriteOutput=True, addOutputsToMap=False):
                
                # Capture the result object so the script's reference to it can be deleted explicitly afterwards
                result = arcpy.topographic.MakeGridsAndGraticulesLayer(
                    in_grid_xml=Grid_Template_XML_file, 
                    area_of_interest=AOI_Feature, 
                    target_feature_dataset=Input_Feature_Dataset, 
                    out_layer_name=Output_Layer, 
                    grid_name=Grid_Name, 
                    Rotation=Rotation) # You can pass None to rotation, so you don't need a conditional statement for the call
            
        except Exception as e:
            failed_features.append(Grid_Name)
            arcpy.AddWarning(f"Failed to create grids and graticules for feature with MSNAME = {Grid_Name}")
            arcpy.AddWarning(str(e))  # AddWarning expects a string message
            continue
        
        if Output_Folder:
            result = arcpy.management.SaveToLayerFile(
                Output_Layer, 
                os.path.join(Output_Folder, f"{String_Array}_{Grid_Name}"), 
                "ABSOLUTE")
            
        del result
        arcpy.management.Delete(Output_Layer)

0 Kudos
egooodwin_digit
Emerging Contributor

Thanks for the help, Hayden. The issue turned out to be related to running the tool against a feature dataset that already contains features: as the dataset grows, the tool slows down, regardless of what is in memory. You can read my response to Tony to see the specific process I used to diagnose this.
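
One workaround consistent with this finding, though not something tested in this thread, would be to spread the tiles across several smaller feature datasets so that none of them grows large. The sketch below only plans the assignment in plain Python. `dataset_name_for_tile`, `plan_datasets`, the `Grids` prefix, and the 200-tile limit are all hypothetical, and each planned dataset would still have to be created beforehand (for example with arcpy.management.CreateFeatureDataset) and passed as `target_feature_dataset`.

```python
from typing import Dict, Iterable, List


def dataset_name_for_tile(tile_index: int, tiles_per_dataset: int = 200, prefix: str = "Grids") -> str:
    """Return the name of the feature dataset that the tile at `tile_index` (0-based) is assigned to."""
    if tile_index < 0:
        raise ValueError("tile_index must not be negative")
    if tiles_per_dataset < 1:
        raise ValueError("tiles_per_dataset must be at least 1")
    return f"{prefix}_{tile_index // tiles_per_dataset:03d}"


def plan_datasets(tile_names: Iterable[str], tiles_per_dataset: int = 200, prefix: str = "Grids") -> Dict[str, List[str]]:
    """Group tile names by the feature dataset they would be written to."""
    plan: Dict[str, List[str]] = {}
    for index, name in enumerate(tile_names):
        target = dataset_name_for_tile(index, tiles_per_dataset, prefix)
        plan.setdefault(target, []).append(name)
    return plan


# With about 4000 tiles and 200 tiles per dataset, the plan contains 20 datasets.
plan = plan_datasets([f"tile_{i}" for i in range(4000)])
print(len(plan))  # 20
```

Whether the .lyrx files and any downstream Production Mapping steps cope with grids split across several feature datasets would need to be checked before relying on this.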

0 Kudos