Multi batch spatial joins

2722
14
06-30-2021 03:35 PM
2Quiker
Occasional Contributor II

I have over 20 layers that I need to spatially join, doing separate arcpy.SpatialJoin_analysis for each one would be time consuming.  fc1 is the layer I want to multi batch the other layers in the database to.

I have tried the following but it only spatially joins the first layer in the database. Is it possible to do a multi batch spatial joins? I am going about it the wrong way?

 

 

import arcpy, os  
from arcpy import env
from datetime import datetime as d
startTime = d.now()

arcpy.env.workspace = r"C:\Temp\Default2.gdb"  
fc1 = r"C:\Temp\Default2.gdb\Parcels"

arcpy.env.overwriteOutput = True
fcs = arcpy.ListFeatureClasses() #gets the feature classes
for fc in fcs:
    print(fc)
    arcpy.SpatialJoin_analysis(fc1,fc, "SP_Test", "JOIN_ONE_TO_ONE", "KEEP_ALL", "", "INTERSECT", "", "")
print ('(Elapsed time: ' + str(d.now() - startTime)[:-3] + ')')

 

fc1 

0 Kudos
14 Replies
by Anonymous User
Not applicable
If you just need certain fields from the other featureclasses, I have a few lines of code that I use in a process to field map the desired fields and skip the rest to keep the target fc slim throughout 14 joins in a loop. If you want to see it as an idea I can post it.
0 Kudos
2Quiker
Occasional Contributor II

If you don't mind sharing that would be great. I hate and think the way ESRI's traditional field mapping is so counterintuitive.

0 Kudos
DavidPike
MVP Frequent Contributor

it looks like SP_test is being overwritten each time.  I would just produce all the joins in memory then merge.

import arcpy, os  
from arcpy import env
from datetime import datetime as d
startTime = d.now()

arcpy.env.workspace = r"C:\Temp\Default2.gdb"  
fc1 = r"C:\Temp\Default2.gdb\Parcels"

arcpy.env.overwriteOutput = True
fcs = arcpy.ListFeatureClasses() #gets the feature classes
counter = 1
for fc in fcs:
    print(fc)
    arcpy.SpatialJoin_analysis(fc1,fc, "SP_Test" + str(counter), "JOIN_ONE_TO_ONE", "KEEP_ALL", "", "INTERSECT", "", "")
    counter += 1
print ('(Elapsed time: ' + str(d.now() - startTime)[:-3] + ')')
2Quiker
Occasional Contributor II

When you say 'all joins' do you meaning making 20+ separate spatial joins or

arcpy.SpatialJoin_analysis(fc1,fc, "memory\SP_Test" + str(counter), "JOIN_ONE_TO_ONE", "KEEP_ALL", "", "INTERSECT", "", "")

 

0 Kudos
2Quiker
Occasional Contributor II

ok, I see what the "Counter" is doing (SP_Test1, SP_Test2, SP_Test3). The Fc1 layer has 95K features after merging all there is 361,320k features in the merged layer. I must be missing something?

0 Kudos
DavidPike
MVP Frequent Contributor

Yes sorry, merge actually makes no sense, I'm not sure why I thought it would work.

The below code should keep writing the outputs into your workspace, with the last feature class being the one you need e.g. "SP_Test20".  You may want to adapt it to process these in memory and then return only the final feature.

Also as Blake say's - remove fc1 from your workspace and put it into a separate geodatabase, as it will be iterating over itself (or put some logic i.e. if fc !=... : , but I would say the best thing to do is just take it out of there).

This is very simple and won't account for removing shape_length, shape_area etc. or any field mappings and join rules, it simply returns the first matching feature.

import arcpy, os  
from arcpy import env
from datetime import datetime as d
startTime = d.now()

arcpy.env.workspace = r"C:\ArcGISHome\multipleSpatialJoin\b.gdb" 
fc1 = r"C:\ArcGISHome\multipleSpatialJoin\a.gdb\a"

arcpy.env.overwriteOutput = True
fcs = arcpy.ListFeatureClasses() #gets the feature classes
counter = 1
for fc in fcs:
    print(fc)
    out_fc = "SP_Test" + str(counter)
    arcpy.SpatialJoin_analysis(fc1,fc, out_fc, "JOIN_ONE_TO_ONE", "KEEP_ALL", "", "INTERSECT", "", "")
    counter += 1
    fc1 = out_fc
print ('(Elapsed time: ' + str(d.now() - startTime)[:-3] + ')')
0 Kudos
2Quiker
Occasional Contributor II

I have tried to implement some field mapping to change some field names and it works on the first one but then errors. I am  assuming that it is because it's is trying to change the field again? How can I by pass this on the second go around?

 

Error line 33.

FieldMappings: Error in getting field map from field mapping for GetFieldMap

 

import arcpy, os  
from arcpy import env
from datetime import datetime as d
startTime = d.now()

###Working 7/2/2021
arcpy.env.overwriteOutput = True

arcpy.env.workspace = r"C:\Temp\Default.gdb"  
fc1 = r"C:\Temp\fc1.shp"


fcs = arcpy.ListFeatureClasses() #gets the feature class from database
counter = 1

for fc in fcs:
###Field Mapping
    fieldmappings = arcpy.FieldMappings()

    # Add all fields from inputs.
    fieldmappings.addTable(fc1)
    fieldmappings.addTable(fc)

    # Name fields you want. Could get these names programmatically too.
    keepers = ["Field1", "Field2", "Field3", "Field4","Field5", "Field6"] # etc.

    # Remove all output fields you don't want.
    for field in fieldmappings.fields:
        if field.name not in keepers:
            fieldmappings.removeFieldMap(fieldmappings.findFieldMapIndex(field.name))
    
    # Rename fields
    fieldmap = fieldmappings.getFieldMap(fieldmappings.findFieldMapIndex("DXF_LAYER"))
    field.name = "Field1A"
    field.type = 'Text'
    #field.aliasName = "mean_city_pop"
    
    fieldmap.outputField = field
    #fieldmap.mergeRule = "Count"
    fieldmappings.replaceFieldMap(fieldmappings.findFieldMapIndex("Field1"),fieldmap)
    ####
    ####
    
    #print(fc)
    out_fc = "SP_Test" + str(counter)

    spjoin= arcpy.SpatialJoin_analysis(fc1,fc, out_fc, "JOIN_ONE_TO_ONE", "KEEP_ALL", fieldmappings, "INTERSECT", "", "")
    counter += 1
    fc1 = out_fc
print ('(Elapsed time: ' + str(d.now() - startTime)[:-3] + ')')

 

0 Kudos
BlakeTerhune
MVP Regular Contributor

Spatial join is an inherently intensive operation and there's not much you can do to improve that. Indexes, multithreading, maybe using memory workspace. The looping is how I would go about doing it. You could consider using field mappings when you copy the data to memory and keep only the necessary fields but that's a lot more work because you have to have a list of keep fields for every feature class.

Also, looking at your code, it looks like you'll end up spatial joining fc1 to its self when it gets to that in the fcs list.

2Quiker
Occasional Contributor II

I guess I am confused about the who process. I was thinking./hoping I could do a spatial join all at once and spatial join would add the new fields from every feature class to the end of fc1. But it sounds like it will create a new feature class for every feature class in the database, maybe I am better off just doing individual spatial joins so I can do the field mapping.

 

import arcpy, os  
from arcpy import env
import traceback

from datetime import datetime as d
startTime = d.now()

arcpy.env.workspace = r"C:\Temp\Default2.gdb"  
fc1 = r"C:\Temp\Taxparcels1.shp"

arcpy.env.overwriteOutput = True
fcs = arcpy.ListFeatureClasses() #gets the feature class names 

# Creating a for loop to iterate over featureclasses
for i in range(0,len(fcs)):
    fcs_s = fcs[i]
    print (fcs_s)
    arcpy.SpatialJoin_analysis(fc1,fcs_s, "SP_Test", "JOIN_ONE_TO_ONE", "KEEP_ALL", "", "INTERSECT", "", "")

 

0 Kudos