
Spatial Join script takes hours in Python, 7 minutes in ArcMap

03-27-2012 01:05 PM
TurnerNowak
Deactivated User
My spatial join Python script, run on two polygon features, takes over an hour to complete, yet the same spatial join run in ArcMap takes about 7 minutes.

Any idea how I could modify the script to speed it up?

import arcgisscripting

# Create the geoprocessor object
gp = arcgisscripting.create(9.3)
gp.OverWriteOutput = True

# Set the workspace. List all of the folders within
gp.Workspace = "C:\ZP4"
fcs = gp.ListWorkspaces("*","Folder")

#run spatial join on parcel.shp in each folder, then delete original and replace with "parceljoined"
for fc in fcs:
    print fc
    try:
     gp.SpatialJoin_analysis(fc + "\\Parcels.shp", "C:\ESRI\ESRIDATA\USA\usa_zipcodes.shp", fc + "\\Parcelsjoined.shp",)
     gp.CalculateField_management(fc + "\\Parcelsjoined.shp", "SIT_ZIP", "[POSTAL]", "VB", "")
     gp.CalculateField_management(fc + "\\Parcelsjoined.shp", "SIT_CITY", "[CITYNAME]", "VB", "")
     gp.DeleteField_management(fc + "\\Parcelsjoined.shp", "Join_Count;Join_Cou_1;Join_Cou_2;POSTAL")
     gp.Delete_management(fc + "\\Parcels.shp")
     gp.Rename_management(fc + "\\Parcelsjoined.shp", "Parcels.shp")
    except Exception:
     print 'AddZip Error'
3 Replies
BruceNielsen
Frequent Contributor
This probably won't speed up your process, but you need to check your file name strings. A single backslash in a Python string acts as an escape character. To be consistent with the rest of your script, change:
gp.Workspace = "C:\ZP4"
and
gp.SpatialJoin_analysis(fc + "\\Parcels.shp", "C:\ESRI\ESRIDATA\USA\usa_zipcodes.shp", fc + "\\Parcelsjoined.shp",)

to

gp.Workspace = "C:\\ZP4"
and
gp.SpatialJoin_analysis(fc + "\\Parcels.shp", "C:\\ESRI\\ESRIDATA\\USA\\usa_zipcodes.shp", fc + "\\Parcelsjoined.shp",)


You also have that extra comma at the end of the SpatialJoin. Is that a typo, or by design?
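
To illustrate the escape-character point, here is a minimal sketch using made-up paths (`C:\temp\new_parcels` and the `County1` folder are hypothetical, not from the script). The original paths appear to have worked only because sequences such as `\Z`, `\E` and `\U` are not recognized escapes in a Python 2 byte string; a folder name starting with `t` or `n` would not be so lucky.

```python
import ntpath  # Windows path rules; importable on any platform

# "\t" and "\n" are escape sequences, so this literal silently
# contains a tab and a newline instead of two backslashes.
broken = "C:\temp\new_parcels"

# Two safe spellings of the same path: doubled backslashes or a raw string.
escaped = "C:\\temp\\new_parcels"
raw = r"C:\temp\new_parcels"


def parcel_path(folder, name="Parcels.shp"):
    """Build folder\\name without hand-written separators."""
    return ntpath.join(folder, name)


print(escaped == raw)                    # the two safe spellings are identical
print("\t" in broken)                    # the unsafe one really holds a tab
print(parcel_path(r"C:\ZP4\County1"))    # hypothetical folder name
```

Joining with `ntpath.join` (or `os.path.join` on Windows) also avoids repeating `fc + "\\Parcels.shp"` throughout the loop.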
KimOllivier
Honored Contributor
Have you indexed everything?
Spatial indexes on shapefiles are not created automatically. Besides, you should have moved decisively to file geodatabases for analysis like this.
File geodatabases will be much faster.
Where are your scratch folders? They might be different in ArcMap because the environments are different.
Do you have a scratch file geodatabase defined? Not just a temp folder.
Are your default temp folders on a file server instead of a local drive?

Some tools are slower in Python, who knows why, but there are ways of making it faster in both systems.
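
The file geodatabase suggestion could look roughly like the sketch below. It assumes `gp` is the geoprocessor created with `arcgisscripting.create(9.3)` as in the original script; the function name, the `work.gdb` name and the `Parcels` feature class name are hypothetical, and whether this actually closes the gap with ArcMap would need to be measured.

```python
import os


def stage_parcels_in_gdb(gp, folder, gdb_name="work.gdb"):
    """Copy folder\\Parcels.shp into a file geodatabase and return the new path.

    gp is assumed to be a geoprocessor object from arcgisscripting.create(9.3).
    gdb_name and the "Parcels" feature class name are illustrative choices.
    """
    gdb = os.path.join(folder, gdb_name)
    # Create the file geodatabase only once per folder.
    if not gp.Exists(gdb):
        gp.CreateFileGDB_management(folder, gdb_name)
    # Point intermediate output at the local geodatabase rather than a temp folder.
    gp.ScratchWorkspace = gdb
    target = os.path.join(gdb, "Parcels")
    gp.CopyFeatures_management(os.path.join(folder, "Parcels.shp"), target)
    return target
```

The spatial join would then read from the returned feature class instead of the shapefile. Copying the zip code layer into a geodatabase once, outside the loop, follows the same pattern.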
TurnerNowak
Deactivated User
Well, I took your advice and modified the script. Here is the final result, which may be a tad 'faster':

import arcgisscripting

# Create the geoprocessor object
gp = arcgisscripting.create(9.3)
gp.OverWriteOutput = True

# Set the workspace. List all of the folders within
gp.Workspace = "C:\\ZP4"
gp.ScratchWorkspace = "C:\\ESRI\\temp_output"
fcs = gp.ListWorkspaces("*","Folder")

#run spatial join on parcel.shp in each folder, then delete original and replace with "parceljoined"
for fc in fcs:
    print fc
    gp.AddSpatialIndex_management(fc + "\\Parcels.shp", "0", "0", "0")
    gp.SpatialJoin_analysis(fc + "\\Parcels.shp", 'C:\\ESRI\\ESRIDATA\\USA\\usa_zipcodes.shp', fc + "\\Parcelsjoined.shp", "")
    gp.CalculateField_management(fc + "\\Parcelsjoined.shp", "SIT_ZIP", "[POSTAL]", "VB", "")
    gp.CalculateField_management(fc + "\\Parcelsjoined.shp", "SIT_CITY", "[CITYNAME]", "VB", "")
    gp.DeleteField_management(fc + "\\Parcelsjoined.shp", "Join_Count;Join_Cou_1;Join_Cou_2;POSTAL")
    gp.Delete_management(fc + "\\Parcels.shp")
    gp.Rename_management(fc + "\\Parcelsjoined.shp", "Parcels.shp")
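
To put a number on "a tad faster", each geoprocessing call could be wrapped in a small timer. This is only a sketch using the standard `time` module; `timed` is a hypothetical helper, not part of `arcgisscripting`.

```python
import time


def timed(label, func, *args):
    """Run func(*args), print the elapsed wall-clock time, return (result, seconds)."""
    start = time.time()
    result = func(*args)
    elapsed = time.time() - start
    print("%s: %.1f s" % (label, elapsed))
    return result, elapsed


# Stand-in call so the sketch runs without ArcGIS installed.
total, seconds = timed("sum", sum, [1, 2, 3])
```

Inside the loop this would be used as, for example, `timed("SpatialJoin", gp.SpatialJoin_analysis, fc + "\\Parcels.shp", zips, fc + "\\Parcelsjoined.shp")`, where `zips` stands for the zip code shapefile path. Timing each step separately shows whether the join itself or the field calculations and file operations account for most of the hour.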