I'm writing a python toolbox implementing several different areal interpolation methods, from the most basic areal weighting up to methods that use some statistical techniques.
I'm planning on running some monte carlo simulations to assess the accuracy of each method with a particular set of input data, so I need to make the execution as quick as possible.
My current implementation for areal weighting is below. I've used a searchcursor nested within an updatecursor, which I'm led to believe is not the most efficient way.
Basically the task is to interpolate values from source zones (census polygons) to arbitrary target zone polygons, in this case based simply on the ratio of the area of each source zone that intersects a target zone to the total area of that source zone.
To do this, I loop through the target zones, select just the source zones that intersect the target zone, then loop through those source zones. Where the source zone is entirely contained within the target zone, I just take the total value from that source zone (I'm not sure if this optimization is worth it as all I save is the ratio calculation...), otherwise, I calculate the ratio and multiply it by the value to be interpolated.
I've been trying to think of a way to avoid this inner searchcursor by creating a dictionary outside the updatecursor. The difficulty is that this dictionary does not need to include all the source zones, just those that intersect the target zone. Ideally I'd have a dictionary with the ID of each target zone as the key, and the value would have to be another dictionary for each intersecting source zone with the interpolation and geometry fields as keys and the values of those fields as values. But it seems like to build this dictionary would require running an intersect or spatial join first, or another nested loop, so I'm unsure whether I would really gain much speed this way.
I know I can use cProfile to see exactly how long each part is taking, but it's not clear how best to set it up within a python toolbox - can anyone provide some examples of how best to do this?
Anyway, here is the execute code:
def execute(self, parameters, messages): """The source code of the tool.""" # get parameters inSourceFeatures = parameters.valueAsText inInterpolationFields = parameters.valueAsText.split(";") inTargetFeatures = parameters.valueAsText outTargetFeatures = parameters.valueAsText # use in_memory workspace for quicker results arcpy.env.workspace = 'in_memory' # make feature layers of the source zone feature class arcpy.MakeFeatureLayer_management(inSourceFeatures, "source_zones") # take a copy of the target zones arcpy.CopyFeatures_management(inTargetFeatures, "target_zones") # add fields for the interpolated variables for field in inInterpolationFields: arcpy.AddField_management("target_zones", field, "DOUBLE") # get spatial reference of zones sourceSR = arcpy.Describe("source_zones").spatialReference targetSR = arcpy.Describe("target_zones").spatialReference # for each target zone... with arcpy.da.UpdateCursor("target_zones", inInterpolationFields + ['SHAPE@'], spatial_reference=targetSR) as targetCursor: for targetRow in targetCursor: # create a dictionary for the row for easier field access targetRowDict = dict(zip(inInterpolationFields + ['SHAPE@'], targetRow)) # get target geometry targetGeom = targetRowDict['SHAPE@'] # select source zones that intersect current target zone arcpy.SelectLayerByLocation_management("source_zones", "INTERSECT", targetGeom) # initialise output interpolated value fields for field in inInterpolationFields: targetRowDict[field] = 0.0 # for each intersecting source zone... with arcpy.da.SearchCursor("source_zones", inInterpolationFields + ['SHAPE@'], spatial_reference=sourceSR) as sourceCursor: for sourceRow in sourceCursor: # create a dictionary for the row for easier field access sourceRowDict = dict(zip(inInterpolationFields + ['SHAPE@'], sourceRow)) # get source geometry sourceGeom = sourceRowDict['SHAPE@'] # if source zone is entirely within the target zone, add the full value if sourceGeom.within(targetGeom): for field in inInterpolationFields: if sourceRowDict[field]: targetRowDict[field] += sourceRowDict[field] else: # calculate the ratio of the intersection area to the source zone area ratio = sourceGeom.intersect(targetGeom, 4).area / sourceGeom.area for field in inInterpolationFields: if sourceRowDict[field]: # add the areally weighted values to the output fields targetRowDict[field] += ratio * sourceRowDict[field] targetCursor.updateRow([targetRowDict[field] for field in inInterpolationFields + ['SHAPE@']]) # copy output target zones to output arcpy.CopyFeatures_management("target_zones", outTargetFeatures) return