CalculateGeometryAttributes produces None values in multiprocessing

HieuTran12 · 06-18-2024

I was trying to perform a geo analysis and the function CalculateGeometryAttributes produced None values when I wrapped the script in multiprocessing, but it worked correctly in a for loop.

import arcpy
import datetime

def process(fc):
    arcpy.env.workspace = "G:/usgrids2020/test1/"
    arcpy.env.overwriteOutput = True
    state = fc[34:-7]
    print(f"Processing for {state.upper()}")
    readTime = datetime.datetime.now()
    # arcpy.env.workspace = os.path.join(gdb_path, f"usa{state}.gdb")
    # Import the features for boundary and water
    # fc = f"usa{state[3:5]}_joined"
    fcInMem = arcpy.management.CopyFeatures(fc, f"in_memory/usa{state}_fcInMem")
    water_fc = f"{fc[:-7]}_water_layer"
    waterInMem = arcpy.management.CopyFeatures(water_fc, f"in_memory/usa{state}_waterInMem")
    # Create the temporary ID for joining fields later
    arcpy.AddField_management(fcInMem, 'tempid', 'LONG')
    arcpy.CalculateField_management(fcInMem, "tempid", '!OBJECTID!', 'PYTHON')
    # Calculate the area (land + water) by the GEODESIC method => Kytt recommended
    arcpy.management.AddField(fcInMem, "GEODESIC_AREA", "FLOAT")
    arcpy.management.CalculateGeometryAttributes(fcInMem, "GEODESIC_AREA AREA_GEODESIC", "", "SQUARE_METERS")

This is what it generated:

(None, 'USA_100030117003015')
(None, 'USA_100030127001011')
(None, 'USA_100030151002006')
(None, 'USA_100030121001003')
(None, 'USA_100030112051009')
(None, 'USA_100030148102015')
(None, 'USA_100030112062022')
(None, 'USA_100030168063009')
(None, 'USA_100030166141010')
(None, 'USA_100030164041013')
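
For anyone trying to reproduce the difference, the two ways of driving the worker can be sketched roughly as below. This is a minimal sketch, not the original script: `process_stub`, `run_serial`, and `run_parallel` are hypothetical names, and the stub only echoes its input so the example runs without ArcGIS. In the real script the worker would be the arcpy-based `process(fc)`.

```python
import multiprocessing


def process_stub(fc):
    """Hypothetical stand-in for process(fc); returns a label instead of running arcpy tools."""
    return f"processed {fc}"


def run_serial(feature_classes):
    """Run the worker once per feature class in a plain for loop."""
    results = []
    for fc in feature_classes:
        results.append(process_stub(fc))
    return results


def run_parallel(feature_classes, workers=2):
    """Run the same worker in a pool of separate processes."""
    with multiprocessing.Pool(processes=workers) as pool:
        # Pool.map returns results in the same order as the inputs.
        return pool.map(process_stub, feature_classes)
```

On Windows, `run_parallel` should only be called from under an `if __name__ == "__main__":` guard, because each worker process re-imports the script. With a well-behaved worker both functions return the same list for the same input, which makes it easier to spot a tool that behaves differently inside a worker process.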

DanPatterson · 06-18-2024

Does the newer "memory" workspace behave differently from the "in_memory" workspace with multiprocessing?

In any event, I suspect a for loop would be the better choice when using memory workspaces.

HieuTran12 · 06-18-2024

I tried without the "in_memory" as well but the result remains the same.

JoshuaBixby · 06-19-2024

As I commented at "python - ArcPy function produced None values in multiprocessing" on GIS SE, it looks to be a defect with CalculateGeometryAttributes. I was able to create a minimal reproducible example and have submitted it to Esri Support to log a defect.

HieuTran12 · 07-01-2024

Hi Joshua, have you heard anything from them yet?

I need that function to run in order to get some data from it. There is an alternative solution that Hayden posted below, but I couldn't use it for my case.

JoshuaBixby · 07-01-2024

I spent time working through the issue with Esri Support, and it is complex. The issue has nothing to do with any specific type of workspace, nor with CalculateGeometryAttributes itself; it has to do with how Esri has structured the code for hundreds of Python-based tools like CalculateGeometryAttributes. There are literally hundreds of tools that will all fail to run with Python multiprocessing. The crazy part is that, for most of the tools, only a single line of code needs to be changed to make it work.

The question then becomes whether these are issues with the documentation or the software, whether they are defects or enhancement requests, and so on. I haven't closed out the Esri Support case yet, but I suspect a mixture of documentation defects and software enhancements will come out of it.

In the meantime, one can at least roll their own code with ArcPy DA cursors.
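
As a rough illustration of the cursor-based approach, the sketch below separates the row logic from ArcPy so it can be exercised without ArcGIS. The names `fill_geodesic_area` and `fill_feature_class` are hypothetical, as are the defaults; the only ArcPy calls assumed are `arcpy.da.UpdateCursor` and the geometry's `getArea` method. It has not been checked inside a multiprocessing worker.

```python
def fill_geodesic_area(cursor, shape_index=0, area_index=1, units="SquareMeters"):
    """Write each row's geodesic area into the area column; return the number of rows updated.

    `cursor` is anything that yields rows and has updateRow(), such as an
    arcpy.da.UpdateCursor opened on ["SHAPE@", "<area field>"].
    """
    updated = 0
    for row in cursor:
        row = list(row)  # updateRow expects a list or tuple
        geometry = row[shape_index]
        if geometry is None:  # skip features with null geometry
            continue
        row[area_index] = geometry.getArea("GEODESIC", units)
        cursor.updateRow(row)
        updated += 1
    return updated


def fill_feature_class(feature_class, area_field="GEODESIC_AREA"):
    """Open an update cursor on the feature class and fill the area field."""
    import arcpy  # imported here so the helper above can be used without ArcGIS

    with arcpy.da.UpdateCursor(feature_class, ["SHAPE@", area_field]) as cursor:
        return fill_geodesic_area(cursor)
```

Inside a worker like `process(fc)`, a call such as `fill_feature_class(fcInMem)` would take the place of the CalculateGeometryAttributes line, assuming the GEODESIC_AREA field has already been added.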

HaydenWelch · 06-20-2024

Since this seems to be a bug as per @JoshuaBixby's reply, you could try using an UpdateCursor to get the area for now:

import arcpy
import datetime

def process(fc):
    arcpy.env.workspace = "G:/usgrids2020/test1/"
    arcpy.env.overwriteOutput = True
    state = fc[34:-7]
    print(f"Processing for {state.upper()}")
    readTime = datetime.datetime.now()
    # arcpy.env.workspace = os.path.join(gdb_path, f"usa{state}.gdb")
    # Import the features for boundary and water
    # fc = f"usa{state[3:5]}_joined"
    fcInMem = arcpy.management.CopyFeatures(fc, f"in_memory/usa{state}_fcInMem")
    water_fc = f"{fc[:-7]}_water_layer"
    waterInMem = arcpy.management.CopyFeatures(water_fc, f"in_memory/usa{state}_waterInMem")
    # Create the temporary ID for joining fields later
    arcpy.AddField_management(fcInMem, 'tempid', 'LONG')
    arcpy.CalculateField_management(fcInMem, "tempid", '!OBJECTID!', 'PYTHON')
    # Calculate the area (land + water) by the GEODESIC method => Kytt recommended
    arcpy.management.AddField(fcInMem, "GEODESIC_AREA", "FLOAT")
    with arcpy.da.UpdateCursor(fcInMem, ["SHAPE@AREA", "GEODESIC_AREA"]) as cursor:
        for row in cursor:
            # Map field names to values so the row can be edited by name
            row = dict(zip(cursor.fields, row))
            # SHAPE@AREA is the planar area in feature units, so you will likely need the conversion factor from arcpy.LinearUnitConversionFactor
            row['GEODESIC_AREA'] = row['SHAPE@AREA']
            # updateRow expects a list or tuple, so convert the dict values
            cursor.updateRow(list(row.values()))
    # arcpy.management.CalculateGeometryAttributes(fcInMem, "GEODESIC_AREA AREA_GEODESIC", "", "SQUARE_METERS")
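
One detail about the unit conversion mentioned in the code comment: a linear conversion factor has to be squared before it is applied to an area. The sketch below illustrates this with a hard-coded feet-to-meters factor. `linear_to_area_factor` and `convert_area` are hypothetical helper names, and the constant is only an example; in ArcPy the linear factor would come from the function named in the comment. The conversion only changes units, so a planar SHAPE@AREA value can still differ from a geodesic area afterwards.

```python
FEET_TO_METERS = 0.3048  # international foot, exact by definition


def linear_to_area_factor(linear_factor):
    """Turn a length conversion factor into the matching area conversion factor."""
    return linear_factor ** 2


def convert_area(area, linear_factor):
    """Convert an area using the factor for the corresponding length units."""
    return area * linear_to_area_factor(linear_factor)


# 10,000 square feet expressed in square meters
area_m2 = convert_area(10_000.0, FEET_TO_METERS)
print(round(area_m2, 2))  # 929.03
```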

HieuTran12 · 06-20-2024

I tried your solution, but the results are very far off from the Geodesic method for area calculations.

HaydenWelch · 06-20-2024

Did you make sure that the feature units are meters? If so, you can try this:

import arcpy
import datetime

def process(fc):
    arcpy.env.workspace = "G:/usgrids2020/test1/"
    arcpy.env.overwriteOutput = True
    state = fc[34:-7]
    print(f"Processing for {state.upper()}")
    readTime = datetime.datetime.now()
    # arcpy.env.workspace = os.path.join(gdb_path, f"usa{state}.gdb")
    # Import the features for boundary and water
    # fc = f"usa{state[3:5]}_joined"
    fcInMem = arcpy.management.CopyFeatures(fc, f"in_memory/usa{state}_fcInMem")
    water_fc = f"{fc[:-7]}_water_layer"
    waterInMem = arcpy.management.CopyFeatures(water_fc, f"in_memory/usa{state}_waterInMem")
    # Create the temporary ID for joining fields later
    arcpy.AddField_management(fcInMem, 'tempid', 'LONG')
    arcpy.CalculateField_management(fcInMem, "tempid", '!OBJECTID!', 'PYTHON')
    # Calculate the area (land + water) by the GEODESIC method => Kytt recommended
    arcpy.management.AddField(fcInMem, "GEODESIC_AREA", "FLOAT")
    with arcpy.da.UpdateCursor(fcInMem, ["SHAPE@", "GEODESIC_AREA"]) as cursor:
        for row in cursor:
            # Map field names to values so the row can be edited by name
            row = dict(zip(cursor.fields, row))
            # Geodesic area of the full geometry, in square meters
            row['GEODESIC_AREA'] = row['SHAPE@'].getArea('GEODESIC', 'SquareMeters')
            # updateRow expects a list or tuple, so convert the dict values
            cursor.updateRow(list(row.values()))
    # arcpy.management.CalculateGeometryAttributes(fcInMem, "GEODESIC_AREA AREA_GEODESIC", "", "SQUARE_METERS")

This uses the Polygon.getArea() method. It pulls the whole shape object, though, so if you're doing this for millions of rows it can slow down a bit, but you'll get an accurate geodesic area.

HieuTran12 · 06-20-2024

Wow, this could be an alternative to CalculateGeometryAttributes. Does it have any tradeoffs?