Zonal Statistics as Table: Overlapping Polygons: Solution Required

08-09-2011 07:01 AM
PeterWilson
Occasional Contributor III
I'm currently using ArcGIS 10 SP2 on Windows 7 64-bit. I'm working on a very large railway alignment design project and need to generate "Average Catchment Slopes" for each watershed. The problem I have is that the Zonal Statistics as Table tool does not process overlapping polygons. The help mentions that if the polygons overlap, they should be processed iteratively to overcome this, but I haven't been able to find any supporting documentation on how to do that. Just a note: I have over 600 watersheds to process, so a manual process will not be possible if I'm to meet my deadline. Any advice will be appreciated.

Regards
63 Replies
curtvprice
MVP Esteemed Contributor
I got an error at first that I had no FID field; my zonal feature had an OBJECTID field instead of FID,...


In Jamie's defense, this was kind of a one-off deal working with shapefiles. A fancier way to do this (to support tables with different object ID field names) is:

instead of hardcoding:

strFID = '"FID" ='
...
    arcpy.SelectLayerByAttribute_management(zone, newSelection, strFID + str(inRow.FID))


get the tables object id field name:

strFID = arcpy.Describe(inputFeatureZone).OIDFieldName
...
    strQry = '"%s" = %s' % (strFID, inRow.getValue(strFID))
    arcpy.SelectLayerByAttribute_management(zone, newSelection, strQry)
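
Put together, a minimal sketch of the whole per-feature loop might look like this (assuming `inputFeatureZone`, `joinField`, and `valueRaster` are defined as in Jamie's script):

import arcpy
from arcpy.sa import ZonalStatisticsAsTable

arcpy.CheckOutExtension("Spatial")

# works for shapefiles (FID) and geodatabase feature classes (OBJECTID) alike
strFID = arcpy.Describe(inputFeatureZone).OIDFieldName

arcpy.MakeFeatureLayer_management(inputFeatureZone, "zone")
inRows = arcpy.SearchCursor(inputFeatureZone)
for inRow in inRows:
    # select one polygon at a time so overlapping zones never share a zone raster
    strQry = '"%s" = %s' % (strFID, inRow.getValue(strFID))
    arcpy.SelectLayerByAttribute_management("zone", "NEW_SELECTION", strQry)
    uniqueTable = arcpy.CreateUniqueName("ZSasT", arcpy.env.workspace)
    ZonalStatisticsAsTable("zone", joinField, valueRaster, uniqueTable, "NODATA", "ALL")
del inRows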
DanielSheehan
New Contributor II
Okay, thanks!!! I'm batch processing a ton of radial and network buffers. Should I be running this in Python rather than in ModelBuilder?
curtvprice
MVP Esteemed Contributor
@Daniel -

This is just a lot of iterations and will take quite a while. I doubt that wrapping the Python script inside ModelBuilder will buy you any speed, as long as you've checked the box to run the Python script "in process". One approach to consider is grouping your input polys into non-overlapping groups, so you can process each group with a single run of Zonal Statistics; see the sketch below.

I have a script tool that makes an attempt, but it's not good enough for me to feel okay posting it here. If you're interested in helping me with that project by testing it, contact me offline.
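
For the grouping idea, here is a rough greedy sketch (untested; it reads all geometries into memory and assigns each polygon to the first group whose members it doesn't touch):

import arcpy

desc = arcpy.Describe(inputFeatureZone)
oidField = desc.OIDFieldName

# read every (OID, geometry) pair up front
feats = [(row.getValue(oidField), row.getValue(desc.shapeFieldName))
         for row in arcpy.SearchCursor(inputFeatureZone)]

# greedy assignment: the first group whose members are all disjoint wins
groups = []      # list of lists of (oid, geometry)
assignment = {}  # oid -> group index
for oid, geom in feats:
    for i, members in enumerate(groups):
        if all(geom.disjoint(g) for _, g in members):
            members.append((oid, geom))
            assignment[oid] = i
            break
    else:
        groups.append([(oid, geom)])
        assignment[oid] = len(groups) - 1

# write assignment to a field (say GROUP_ID) and run Zonal Statistics
# as Table once per group -- polygons within a group never overlap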
DanielSheehan
New Contributor II
Curtis, I think I'm using Jamie's script, which I benchmarked in ModelBuilder; it would take many days to run (if my computer could even chug along for that long). I was just wondering if it would be significantly faster to run things outside of ModelBuilder as a .py file vs. running them in ModelBuilder?

The first workaround I came up with was to run Zonal Stats on each feature split into its own feature class (so whereas one buffer-distance feature class had 1,541 records, I now have 1,541 feature classes). Then I set up an iterator in ModelBuilder to create a raster for each FC (n = 1,541); I did this because I gather Zonal Stats converts the vector to a raster temporarily in order to run. I then ran Zonal Stats on each of the rasters individually using the iterator. This worked until I ran into a few rasters that didn't produce any stats (why, I don't know) and caused the program to crash. I'm curious whether iterators in ModelBuilder have a theoretical limit.

So now I'm thinking of just converting my kernel density surface (the file I want Zonal Stats for) to a point file and using Summary Statistics after intersecting the polygons with the points. I'm just exhausted with Zonal Stats for two reasons: it keeps crashing with iterators, and the 13-character limit on file names (I had to reprocess work to create unique IDs, which took a whole day because my unique IDs were 15 characters long).

Do you see any reason not to do this vector version of Zonal Stats? And how unstable is your tool, that you don't feel comfortable putting it up on the forum? I may not be much help testing because I'm not much of a Python programmer (I'm self-taught in ModelBuilder and know very little Python and even less VBA).
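
For reference, a sketch of that vector workflow (`kernelDensityRaster`, `bufferPolygons`, and the `BUFF_ID` case field are placeholder names, not tested code):

import arcpy

# one point per cell center, carrying the density value in grid_code
arcpy.RasterToPoint_conversion(kernelDensityRaster, "kde_points.shp", "VALUE")

# the intersect duplicates each point for every buffer it falls inside,
# which is exactly the overlap behavior Zonal Statistics lacks
arcpy.Intersect_analysis(["kde_points.shp", bufferPolygons], "kde_in_buf.shp")

# per-buffer statistics on the cell values
arcpy.Statistics_analysis("kde_in_buf.shp", "buf_stats.dbf",
                          [["GRID_CODE", "MEAN"]], "BUFF_ID")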

Danny
SebastianRoberts
Occasional Contributor III
Just found this post. Jamie's script worked great. I am using this for zonal stats in a parcel model, so I only have a small subset of overlapping tax parcels (condos only). In my case, I do an intersect with just one input (the parcel layer) to extract only the overlapping condo tax parcels. I added this to the beginning of Jamie's script, then process only these parcels with the script. In a second model, I do a selection to subtract these condo tax parcels from the county-wide parcel feature layer. I can then do a regular zonal statistics on the feature layer for all the remaining non-overlapping polygons. Just thought I would share this as a way to speed up the process if your dataset has a small subset of overlapping polygons.

#create overlapping polygons with intersect tool (intersect with self to get overlapping)
arcpy.Intersect_analysis([inputParcelLayer],FeatureZone,"NO_FID","","")
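
Fleshed out a little, the recombine step might look like the sketch below; the APN key follows my snippets above, but the IS NULL anti-join selection and the output names are just one way to do the "subtract" step:

import arcpy
from arcpy.sa import ZonalStatisticsAsTable

# overlapping pieces only: intersecting the parcel layer with itself
# creates features only where parcels overlap (the condo stacks)
arcpy.Intersect_analysis([inputParcelLayer], FeatureIntersect, "NO_FID", "", "")

# ...run the per-feature iterative script on those overlapping parcels...

# non-overlapping remainder: outer-join on APN and keep the parcels
# with no match, then run ordinary zonal statistics once on the rest
arcpy.MakeFeatureLayer_management(inputParcelLayer, "allparcels")
arcpy.AddJoin_management("allparcels", "APN", FeatureIntersect, "APN", "KEEP_ALL")
# after a join, field names are qualified; adjust the "intersect" prefix to your data
arcpy.SelectLayerByAttribute_management("allparcels", "NEW_SELECTION",
                                        '"intersect.APN" IS NULL')
# the selection persists on the layer after the join is removed
arcpy.RemoveJoin_management("allparcels", "intersect")
ZonalStatisticsAsTable("allparcels", "APN", valueRaster,
                       "nonoverlap_stats", "NODATA", "ALL")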
jillianthonell
New Contributor
Hi Curtis,

It is the garbage collection module. We were running into some memory issues after several iterations of the tool. The garbage collection module cleans up unreferenced objects in memory. Take a look at this link:

http://docs.python.org/library/gc.html
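
For example, a minimal sketch of forcing collection inside a long loop (`features` and `process()` are hypothetical stand-ins for the per-feature work):

import gc

gc.enable()  # make sure automatic collection is on

for feature in features:       # hypothetical iterable of work items
    result = process(feature)  # hypothetical per-feature geoprocessing
    del result                 # drop the reference...
    gc.collect()               # ...and force a sweep of unreachable objects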


Hi Jamie,

I was interested to see this thread, as I have written a script to handle overlapping polygons for Zonal Statistics and have been experiencing similar memory issues with iterations and subsequent crashes. I enabled garbage collection as in your script, but am still having issues. Can you suggest any further solutions?

Thanks
Jill
SebastianRoberts
Occasional Contributor III
Jill,
  Did you find any solution? As we have added more overlapping parcels to our feature class (we now have 860), I too seem to be running into memory issues. My script was crashing, so I added another loop that uses a smaller cursor and repeatedly deletes the search cursor. It no longer crashes, but interestingly enough, a "zone" is processed in about 20 seconds at the outset, and then towards the end of the 800 features it slows down to about 5 minutes per zone. Below is the Python code that I am using.


import arcpy, os, gc
from arcpy.sa import *

#create overlapping polygons with intersect tool (intersect with self, only creates partial parcels where overlapping)
arcpy.Intersect_analysis([inputParcelLayer],FeatureIntersect,"NO_FID","","")

#make feature layer, join with self intersected parcels, use result as overlapping parcels for zonal stats
arcpy.MakeFeatureLayer_management(inputParcelLayer,"parcelsoverlap")
arcpy.AddJoin_management("parcelsoverlap","APN",FeatureIntersect,"APN","KEEP_COMMON")
arcpy.CopyFeatures_management("parcelsoverlap", FeatureZone)


tempDIR = arcpy.GetSystemEnvironment("TEMP")
#create scratch workspace name
tempWorkspace = "tempWorkspaceForZonalStats"

if not arcpy.Exists(tempDIR + os.sep + tempWorkspace):
    arcpy.CreateFolder_management(tempDIR, tempWorkspace)
arcpy.env.workspace = tempDIR + os.sep + tempWorkspace

#input variables
strFIDname = arcpy.Describe(FeatureZone).OIDFieldName
zone = "zone"

arcpy.env.cellSize = 2 #needed to reduce cell size because zonal stats crashes if the polygon is smaller than raster resolution
gc.enable()  #enable garbage collection
arcpy.CheckOutExtension("Spatial")
cellAssign = "CELL_CENTER"
priority = "NONE"
newSelection = "NEW_SELECTION"

# Make a layer from the feature class
arcpy.MakeFeatureLayer_management(FeatureZone, zone) 
#Loop through features
numFeats = int(arcpy.GetCount_management(FeatureZone).getOutput(0))
groups = []
bracket = 30
while bracket < numFeats:
    groups.append(bracket)
    bracket = bracket + 30
groups.append(numFeats + 1)
for x in groups:
    whileClause = '"%s" >= %s and "%s" < %s' % (strFIDname, x - 30, strFIDname, x)
    print whileClause
    inRows = arcpy.SearchCursor(FeatureZone,whileClause)
    #inRows = arcpy.SearchCursor(FeatureZone)
    inRow = inRows.next()
    while inRow:
        #select the FID to process
        strID = inRow.getValue(strFIDname)
        print ' processing objID: ' + str(strID) + ' of ' + str(numFeats) + ' total features.'
        strQry = '"%s" = %s' % (strFIDname, strID)
        arcpy.SelectLayerByAttribute_management(zone, newSelection, strQry)
        #create a unique name for the zone
        uniqueZone = arcpy.CreateUniqueName("zone.shp", arcpy.env.workspace)
        uniqueTable = arcpy.CreateUniqueName("ZSasT", arcpy.env.workspace)
        #create a temporary feature to use for zonal stats as table
        arcpy.CopyFeatures_management(zone, uniqueZone)
        # joinField and valueRaster are assumed to be defined earlier in the script
        outZSaT = ZonalStatisticsAsTable(uniqueZone, joinField, valueRaster, uniqueTable, "NODATA", "ALL")

        #move to next record.
        inRow = inRows.next()
    del inRow
    del inRows
curtvprice
MVP Esteemed Contributor
Interestingly enough, a "zone" is processed in about 20 seconds at the outset, and then towards the end of the 800 features, it slows down to about 5 minutes to process one zone.


My experience is that the gp/arcpy scratch-name methods can be very slow when the workspace in which you are generating scratch names fills up with many files. This can slow down all tools!

You may get better performance if you just create an empty folder/workspace and generate names using sequence numbers. I have gotten in the habit, with my general-purpose script tools (like the [post=124423]NAWQA toolbox[/post]), of creating a temporary folder and workspace so I don't need to worry about file name collisions and can avoid waiting for the scratchname method!
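
A minimal sketch of that pattern (names are arbitrary; the key point is sequence-numbered output names in a fresh folder instead of CreateUniqueName calls):

import arcpy, os

tempDir = arcpy.GetSystemEnvironment("TEMP")

# one fresh, empty folder per run: no name collisions, and no
# scratch-name scans over an ever-growing workspace
arcpy.CreateFolder_management(tempDir, "zs_run")
arcpy.env.workspace = os.path.join(tempDir, "zs_run")

for i, oid in enumerate(zoneIDs):        # zoneIDs as built in the earlier loop
    uniqueZone = "zone_%04d.shp" % i     # sequence number, not a scratch name
    uniqueTable = "zsast_%04d.dbf" % i
    # ...CopyFeatures / ZonalStatisticsAsTable exactly as before...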
SebastianRoberts
Occasional Contributor III
Hi Curtis,
  Thanks for your reply. That improved performance greatly. I just moved my temp directory creation inside the loop so that a new directory is created for every loop iteration (every 30 feature classes), and now it is on target to finish in about 3 hours rather than 30.

Sebastian

tempWorkspacePrefix = "tempWorkspaceForZonalStats"  # any base name works
tempWorkspaceSuffix = 0
for x in groups:
    #create the temp workspace for this group of features
    tempWS = tempWorkspacePrefix + str(tempWorkspaceSuffix)
    if not arcpy.Exists(tempDIR + os.sep + tempWS):
        arcpy.CreateFolder_management(tempDIR, tempWS)
    arcpy.env.workspace = tempDIR + os.sep + tempWS
    while......................(rest of the loop as in my previous post)
    tempWorkspaceSuffix = tempWorkspaceSuffix + 1
JoannaWhittier
New Contributor III
I am running into the memory issues described in this thread. I see where Curtis and Sebastian recommend and provide code to create a temporary folder/workspace, but I am not sure how to work that into the Python code I am using. To use ListRasters, I had the impression that you needed to set env.workspace to the folder containing the raster files. My Python code is below; any suggestions on how to modify it?

I should add that sometimes this script runs and sometimes it doesn't. I think the issue is memory, but I just saw something about a search cursor. Not sure what that is.

Thank you,
Jodi

import arcpy, os, tempfile, shutil, sys, time, traceback
from arcpy import env
from arcpy.sa import *

# Check out any necessary licenses
arcpy.CheckOutExtension("spatial")

# Set the current workspace; this is the folder that contains the .tif files for the climate variables
arcpy.env.workspace = "F:\\Data\\Reg09\\tifs"

# Set Geoprocessing environments
arcpy.env.overwriteOutput = True

#Set Local variables
#Set local variables
region = "F:\\Data\\NHD\\grids\\reg09_lambi"  # the folder with the NHD catchment grid (raster)

out = "F:\\Data\\Reg09\\update\\"

rasterList = arcpy.ListRasters("*", "TIF")
for raster in rasterList:
    # os.path.splitext removes .tif safely; strip(".tif") would also eat
    # leading/trailing t, i, and f characters from the name
    name = os.path.splitext(os.path.basename(raster))[0]
    arcpy.env.cellSize = "MINOF"
    arcpy.gp.ZonalStatisticsAsTable_sa(region, "VALUE", raster, out + name + ".dbf", "DATA", "ALL")
    print " Raster " + str(raster) + " processing completed at " + time.asctime(time.localtime(time.time()))