Optimizing SelectLayerByLocation_management?

JamesCrandall · 06-27-2016

I'm looking for ways to speed up an existing select-by-location process. The best performer so far has been to simply pass a feature layer and a feature class, with the INTERSECT overlap type, to arcpy.SelectLayerByLocation_management.

1 polygon in the selection feature class (an in_memory FC).

More than 10 million points in the target point feature layer (a FC in a FGDB on a network location registered with ArcGIS Server).

The extent of the polygon feature encapsulates about 2.5 million point features, and the selection executes in about 100 seconds. Not bad, I guess, if we're running on ArcGIS Desktop. But this is part of a script that is published as a Geoprocessing Service, and I'm looking to optimize as much as possible so the GP service can stay synchronous rather than having to run as an asynchronous job.

Thanks for any input!

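import arcpy

# Polygon (selecting) and point (target) feature classes
ply_fc = r'H:\MyFolder\gpdata.gdb\WM_InputPly_0624'
pnt_fc = r'H:\MyFolder\gpdata.gdb\WM_InputPts_0624'
# Selections run against a layer, so wrap the point feature class in one
arcpy.MakeFeatureLayer_management(pnt_fc, "PointFeatures")

# Perform the selection
arcpy.SelectLayerByLocation_management("PointFeatures", "INTERSECT", ply_fc)
# With a selection on the layer, GetCount returns the number of selected points
rowcount = int(arcpy.GetCount_management("PointFeatures").getOutput(0))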

JoshuaBixby · 06-28-2016

What type of geographic extent are we talking about, i.e., state, region, nation, or hemisphere? What do the points represent? Is there potentially a better way to model the data than using geometric/geographic points?

All things considered, the times you state don't seem terrible, although they don't seem great either. You mention the FGDB is stored on a network location. Just for experimentation's sake, if you move the points to local disk(s), does it reduce the time? If the points are in-memory, does it reduce the time? It would be good to sort out how the different storage types (network disk, local disk, in-memory) affect the selection times.
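To make that comparison concrete, here is a minimal timing sketch. The helper names (time_selection, compare_storage) are hypothetical and not part of arcpy; each candidate is just a callable that runs the selection against one copy of the points and returns the selected count. The commented usage assumes copies of the point feature class already exist at placeholder paths.

```python
from timeit import default_timer


def time_selection(label, run_selection):
    """Run one selection callable; return (label, selected_count, seconds)."""
    start = default_timer()
    count = run_selection()
    return label, count, default_timer() - start


def compare_storage(candidates):
    """Time each (label, callable) pair and return the results, fastest first."""
    results = [time_selection(label, fn) for label, fn in candidates]
    return sorted(results, key=lambda r: r[2])


# Hypothetical usage with arcpy (paths are placeholders, not from this thread):
#
# def selection_on(pnt_fc, lyr_name):
#     def run():
#         arcpy.MakeFeatureLayer_management(pnt_fc, lyr_name)
#         arcpy.SelectLayerByLocation_management(lyr_name, "INTERSECT", ply_fc)
#         return int(arcpy.GetCount_management(lyr_name).getOutput(0))
#     return run
#
# for label, count, secs in compare_storage([
#         ("network", selection_on(network_pnt_fc, "pts_network")),
#         ("local", selection_on(local_pnt_fc, "pts_local"))]):
#     print(label, count, round(secs, 1))
```

Running each candidate more than once would help separate first-read caching effects from the steady-state cost of each storage type.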

JamesCrandall · 06-28-2016

Max extent (in Web Mercator Aux Sphere spatial ref): 40,000 acres

My initial recommendation was to re-work the point feature class for those larger scales so as to reduce the density, and ultimately the processing times for the Selection. But I hoped to fully vet this out before going down that road.
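For what it's worth, one way to picture the density reduction is a simple grid thinning that keeps a single point per cell. This is only a sketch under my own assumptions: thin_points and cell_size are hypothetical names, the points are plain (x, y) tuples in projected units, and nothing here has been decided for the actual feature class.

```python
def thin_points(points, cell_size):
    """Keep the first point seen in each square grid cell of side cell_size."""
    kept = {}
    for x, y in points:
        # Floor division maps each coordinate to the index of its grid cell
        cell = (int(x // cell_size), int(y // cell_size))
        kept.setdefault(cell, (x, y))
    return list(kept.values())


# Example: with a cell size of 5, the first two points share a cell
thinned = thin_points([(0, 0), (1, 1), (10, 10)], 5)
```

A larger cell size gives a sparser layer, so a separate thinned copy could be kept for each scale range.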

When I look at the AGS logs, everything seems to process OK (200 status codes) with this published GP service. The browser's developer tools show about 225,000 milliseconds, or about 3.75 minutes, to process the point selection for that 40k-acre polygon.