TIP: Simple code to make IDENTITY or ERASE or Other Overlay Functions much faster

Discussion created by bmillerdot on Jan 15, 2014
Latest reply on Jan 16, 2014 by csny490
I'm not sure if this is a good idea to post but maybe it will help someone.
Sometimes ArcMap takes a very long time to complete an Identity. In my case the inputs were:
  Streams - about five million lines (All lines are single part and reasonable length)
  Lakes - about ten thousand polygons (All polygons are single part and reasonable area)

This Identity had been running for over a day when I stopped it:
arcpy.Identity_analysis("Streams", "Lakes", "StreamWithLakeId")

Using the following code reduced this time to about ten minutes.
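import arcpy

def FastIdentity(inFL, idFL, outFC):  # Inputs must be feature layers, not feature classes
    # Select only the input features that intersect an identity feature
    arcpy.SelectLayerByLocation_management(inFL, "INTERSECT", idFL)
    # Run the overlay on the selected subset only (use Erase_analysis here for ERASE)
    arcpy.Identity_analysis(inFL, idFL, "in_memory/Flow0")
    # OPTIONAL: split multipart results; this step also writes the subset to outFC
    arcpy.MultipartToSinglepart_management("in_memory/Flow0", outFC)
    arcpy.Delete_management("in_memory/Flow0")
    # Switch to the features that intersect nothing and append them unchanged
    arcpy.SelectLayerByAttribute_management(inFL, "SWITCH_SELECTION")
    arcpy.Append_management(inFL, outFC, "NO_TEST")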

arcpy.env.workspace = "C:/somepath.gdb"
arcpy.MakeFeatureLayer_management("Lakes",   "LakeLayer")
arcpy.MakeFeatureLayer_management("Streams", "StreamLayer")

FastIdentity("StreamLayer", "LakeLayer", "C:/out.gdb/OutStreams")
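
For anyone wondering why this helps, here is a toy sketch of the same pattern in plain Python, with no arcpy involved. Everything in it is made up for illustration: bounding boxes stand in for real geometry, tag_with_zone stands in for Identity, and fast_overlay is a hypothetical name. It only shows the logic (overlay the intersecting subset, then append the rest untouched), not how ArcGIS works internally.

```python
def boxes_intersect(a, b):
    """Return True if two (xmin, ymin, xmax, ymax) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def tag_with_zone(features, zones):
    """Toy stand-in for Identity: copy each feature and record the first zone it hits."""
    tagged = []
    for feat in features:
        hit = next((z["id"] for z in zones if boxes_intersect(feat["box"], z["box"])), None)
        tagged.append(dict(feat, zone=hit))
    return tagged


def fast_overlay(features, zones, overlay=tag_with_zone):
    """Run `overlay` only on features that intersect a zone, then append the rest."""
    selected, rest = [], []
    for feat in features:
        hit = any(boxes_intersect(feat["box"], z["box"]) for z in zones)
        (selected if hit else rest).append(feat)
    result = overlay(selected, zones)                # expensive step, small input
    result.extend(dict(f, zone=None) for f in rest)  # untouched features get no zone
    return result


# Hypothetical sample data: three "streams" and one "lake"
streams = [
    {"id": 1, "box": (0, 0, 2, 2)},
    {"id": 2, "box": (10, 10, 12, 12)},
    {"id": 3, "box": (20, 0, 21, 1)},
]
lakes = [{"id": "A", "box": (1, 1, 3, 3)}]

result = fast_overlay(streams, lakes)
```

Only stream 1 reaches the overlay step here. The other two are appended with an empty zone value, which is similar to what Append with NO_TEST does for the fields that only exist in the Identity output.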

Check the following before using this method:

1) Set the join_attributes parameter on Identity to control which fields are carried over. This could also be done with a field mapping.
2) Consider passing a cluster tolerance to Identity.
3) Erase and the other Analysis overlay tools can be sped up with the same type of code.
4) Skip MultipartToSinglepart if multipart features are needed, and write the Identity output straight to outFC instead.
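
As a rough sketch of item 3, the same pattern for Erase might look like the following. The name fast_erase is made up for illustration, I have not run it against real data, and it assumes at least one input feature intersects the erase layer. Erase leaves non-intersecting features unchanged, so appending them afterwards should give the same result as erasing everything.

```python
def fast_erase(in_layer, erase_layer, out_fc):
    """Erase only the features that intersect the erase layer, then append the rest.

    in_layer and erase_layer must be feature layers, because the selection
    tools do not work on feature classes.
    """
    # Imported inside the function so this file can be loaded on a machine
    # without ArcGIS; arcpy is only needed when the function actually runs.
    import arcpy

    # 1. Restrict the input to features that intersect an erase polygon.
    arcpy.SelectLayerByLocation_management(in_layer, "INTERSECT", erase_layer)
    # 2. Run the expensive overlay on that subset only.
    arcpy.Erase_analysis(in_layer, erase_layer, out_fc)
    # 3. Flip the selection to the features Erase never saw.
    arcpy.SelectLayerByAttribute_management(in_layer, "SWITCH_SELECTION")
    # 4. Copy those features to the output unchanged.
    arcpy.Append_management(in_layer, out_fc, "NO_TEST")
    # 5. Clear the selection so later tools see the whole layer again.
    arcpy.SelectLayerByAttribute_management(in_layer, "CLEAR_SELECTION")
```

It would be called the same way as FastIdentity, for example fast_erase("StreamLayer", "LakeLayer", "C:/out.gdb/StreamsNoLakes"), where the output name is just a placeholder.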

NOTE: I only have access to 10.1, so this might work better in 10.2, or maybe I'm doing something wrong.
Also, I never tried writing the original Identity output to "in_memory", which might have fixed the problem with less code.