Select By Location on a large SDE featureclass

3405
1
Jump to solution
05-16-2012 01:57 PM
WesleyMarcell
New Contributor III
I have looked around and have not found any solution to this problem.
I am writing a GP tool that will allow the user to define a point by clicking on a point within a map service.  The user has the ability to pick which feature set they would like to run the tool against.  This tool works in under 10 seconds for every data layer that is in the map service other than the Common Land Unit layer.  This layer is in SDE, has a spatial index, and has 36 million features (rows).

So in arcmap, I can simply use the selection tool > click on a polygon > right click > Selection\Create Layer from selected features and then export this polygon to a new featureclass.  I can do it in about 20 seconds depending on how fast i can click 😛

Using python however this process took 33 minutes to run using the code attached below.  It looks like when I use MakeFeatureLayer_management it creates a copy in memory of the 36 million features.  Then when I do the spatial selection it finds the one record its looking for.  Using print statements, this part of the code does not appear to take very long.  It appears that the part taking a long time is CopyFeatures_management. 

To me it looks like it is using a search cursor to go through the 33million records to find the one that is selected and then exporting it to the scratch shapefile.  This makes no sense to me.  If the script can make a feature layer and do the selection in seconds then why does it take 30 minutes to export a single record?  Are there any environment settings that I need to use?  Is there a way to just use the features of this featureLayer that are in the current extent of the map service?  Is there a better approach than the code below?  Is there something that our SDE Manager should do to the data to make it faster to query on location?

I am thinking if I can do it manually in arcmap in a few seconds then there is something that is completely wrong with my script (or perhaps arcpy).

Any help or feedback would be greatly appreciated!!!


import arcpy, sys from arcpy import env  layerSelection = sys.argv[1] featureSet = sys.argv[2] scratch_ws = arcpy.env.scratchWorkspace arcpy.AddMessage(scratch_ws)  try:     arcpy.env.overwriteOutput = True     arcpy.MakeFeatureLayer_management(layerSelection,'lyr')     arcpy.AddMessage('Running Spatial Intersect...')     arcpy.SelectLayerByLocation_management('lyr', 'INTERSECT', featureSet, 0, r'NEW_SELECTION')         arcpy.AddMessage('Copying Selected Features to Output...')     arcpy.CopyFeatures_management('lyr', scratch_ws + r'\selection.shp')     arcpy.Delete_Management('lyr') except:     print arcpy.GetMessages()
Tags (2)
0 Kudos
1 Solution

Accepted Solutions
WesleyMarcell
New Contributor III
I was able to figure out the solution on my own.  It was quite a wrestling match between myself, arcpy.env, and the scratch workspace but I got a script that will run in 3 seconds pointing at my 36 million feature dataset.  I buffered the point out 100ft and then used a copy features.  Copy features actually respects your environment settings unlike make feature layer.  Basically only features within 100 feet of the point are selected and copied to the temp directory.  This works for my script since I am only looking for one feature to be returned.  If you were looking to do more you could always use an if layerSelection == "CLU" to only do this process on the giant layer...  here is some code goodness (even has metadata!!!)  I hope it helps someone out.

import arcpy, sys from arcpy import env  ##Input Variables and setting scratch workspace variable layerSelection = sys.argv[1] featureSet = sys.argv[2] scratch_ws = arcpy.env.scratchWorkspace arcpy.env.overwriteOutput = True  ##Creating Variables for model derived  Buffer = scratch_ws + r'\scratch.gdb\buffer' TempLayer = scratch_ws + r'\scratch.gdb\layer' selection = scratch_ws + r'\scratch.gdb\selection'  try:     ##Create a buffer of input feature set.  This will be used to set the extent of the model.     ##If this is not used the script will try to process the entire 36 million feature CLU dataset.     ##This works because we are only interested in returning one feature to the geoprocessing service.     arcpy.Buffer_analysis(featureSet, Buffer, "100 feet")     desc = arcpy.Describe(Buffer)     arcpy.env.extent = Buffer      ##Copy the selected features to a temporary layer.  This will respect your environment setting,     ##so only records within 100ft of your point will be written to disk.     arcpy.CopyFeatures_management(layerSelection, TempLayer)      ##Create a temporary feature layer for use in select by location     arcpy.MakeFeatureLayer_management(TempLayer, 'lyr')     arcpy.AddMessage('Running Spatial Intersect...')      ##Select features that intersect the user input point.      arcpy.SelectLayerByLocation_management('lyr', 'INTERSECT', featureSet, 0, r'NEW_SELECTION')      arcpy.AddMessage('Copying Selected Features to Output...')      ##Copy the selected features to the arcgisjobs scratch directory     arcpy.CopyFeatures_management('lyr', selection)      ##Delete Temporary Data     arcpy.Delete_management(Buffer)     arcpy.Delete_management(TempLayer)      ##Will count the number of features in output featureclass     result = arcpy.GetCount_management(selection)      ##If there are no records in the output featureclass a custom error will be returned to the user         if int(result.getOutput(0)) == 0:         arcpy.AddError ("No Features in " + layerSelection + " at selected location")         ##Delete empty featureclass so that the server will retun an empty result         arcpy.Delete_management(selection)     else:         pass  except:     print arcpy.GetMessages()

View solution in original post

1 Reply
WesleyMarcell
New Contributor III
I was able to figure out the solution on my own.  It was quite a wrestling match between myself, arcpy.env, and the scratch workspace but I got a script that will run in 3 seconds pointing at my 36 million feature dataset.  I buffered the point out 100ft and then used a copy features.  Copy features actually respects your environment settings unlike make feature layer.  Basically only features within 100 feet of the point are selected and copied to the temp directory.  This works for my script since I am only looking for one feature to be returned.  If you were looking to do more you could always use an if layerSelection == "CLU" to only do this process on the giant layer...  here is some code goodness (even has metadata!!!)  I hope it helps someone out.

import arcpy, sys from arcpy import env  ##Input Variables and setting scratch workspace variable layerSelection = sys.argv[1] featureSet = sys.argv[2] scratch_ws = arcpy.env.scratchWorkspace arcpy.env.overwriteOutput = True  ##Creating Variables for model derived  Buffer = scratch_ws + r'\scratch.gdb\buffer' TempLayer = scratch_ws + r'\scratch.gdb\layer' selection = scratch_ws + r'\scratch.gdb\selection'  try:     ##Create a buffer of input feature set.  This will be used to set the extent of the model.     ##If this is not used the script will try to process the entire 36 million feature CLU dataset.     ##This works because we are only interested in returning one feature to the geoprocessing service.     arcpy.Buffer_analysis(featureSet, Buffer, "100 feet")     desc = arcpy.Describe(Buffer)     arcpy.env.extent = Buffer      ##Copy the selected features to a temporary layer.  This will respect your environment setting,     ##so only records within 100ft of your point will be written to disk.     arcpy.CopyFeatures_management(layerSelection, TempLayer)      ##Create a temporary feature layer for use in select by location     arcpy.MakeFeatureLayer_management(TempLayer, 'lyr')     arcpy.AddMessage('Running Spatial Intersect...')      ##Select features that intersect the user input point.      arcpy.SelectLayerByLocation_management('lyr', 'INTERSECT', featureSet, 0, r'NEW_SELECTION')      arcpy.AddMessage('Copying Selected Features to Output...')      ##Copy the selected features to the arcgisjobs scratch directory     arcpy.CopyFeatures_management('lyr', selection)      ##Delete Temporary Data     arcpy.Delete_management(Buffer)     arcpy.Delete_management(TempLayer)      ##Will count the number of features in output featureclass     result = arcpy.GetCount_management(selection)      ##If there are no records in the output featureclass a custom error will be returned to the user         if int(result.getOutput(0)) == 0:         arcpy.AddError ("No Features in " + layerSelection + " at selected location")         ##Delete empty featureclass so that the server will retun an empty result         arcpy.Delete_management(selection)     else:         pass  except:     print arcpy.GetMessages()