Zonal Statistics as Table: Overlapping Polygons: Solution Required

24280
63
08-09-2011 07:01 AM
PeterWilson
Occasional Contributor III
I'm currently using ArcGIS 10 SP2 on Windows 7 64bit. I'm working on a very large railway alignment design project and need to generate "Average Catchment Slopes" for each watershed. The problem that I have is that the "Zonal Statistics as Table" tool does not process overlapping polygons. It's mentions in the help that if the polygons overlap, that should be processed interactively to overcome this. I haven't been able to find any supporting documentation to achieve this. Just a note, I have over 600 watersheds that I need to process and a manual process will not be possible if I'm to meet my deadline. Any advice will be appreciated.

Regards
63 Replies
SebastianRoberts
Occasional Contributor III
Jodi,
  I found that it was not a memory leak issue, but as Curtis said:
gp/arcpy scratch name methods can be very slow when the workspace in which you are generating scratchnames gets full of many files

Creating a new workspace for OUTPUT every 30 or so feature classes greatly increased the speed of my script.  You are using listrasters to create the list before you enter the loop. In your loop, you could just change the name of the output workspace after a certain number of iterations. Below is some sample code.  I didn't test it, so may have some errors, but should get you started.
Sebastian


#create scratch workspace name
tempDIR = "F:\\Data\\Reg09"
tempWorkspacePrefix = "update"
tempWorkspaceSuffix = 0

count = 0;
for raster in rasterList:
   if count%30 == 0:     # this is modulus operator, so should be true the first iteration and every 30 iterations thereafter.
      tempWorkspaceSuffix = tempWorkspaceSuffix  + 1
      tempWS = tempWorkspacePrefix + str(tempWorkspaceSuffix)
      if not arcpy.Exists(tempDIR + os.sep + tempWS):
          arcpy.CreateFolder_management(tempDIR, tempWS)
      out = tempDIR + os.sep + tempWS + os.sep
   name = os.path.basename(raster).strip(".tif")  # removes .tif from the file name
   arcpy.env.cellSize = "MINOF"
   arcpy.gp.ZonalStatisticsAsTable_sa(region, "VALUE", raster, out+name+".dbf","DATA", "ALL")
   print" Raster " + str(raster) + " processing completed at " + time.asctime( time.localtime(time.time()) ).
   count = count + 1
0 Kudos
curtvprice
MVP Esteemed Contributor
My python code is below....any suggestions on how to modify this code?
I should add that sometimes this script runs and sometimes not.


What does "sometimes not" mean?  Is there any kind of error message? If ArcMap is crashing (this would be a memory leak), an approach that may help make your script more stable is to use the Python garbage collector:

import gc
gc.enable()


While we're talking code, your use of "strip" may not do exactly what you want. The best way to remove file extensions is the os.path.splitext() function:
os.path.basename(os.path.splitext(path)[0])
0 Kudos
JoannaWhittier
New Contributor III
Thanks to both of you for the quick response.  I incorporated the code provided by Sebastian and I think it is running correctly but I get the same error message as before (see below).  What made me think it could be a memory issue is that the python code would run for a few files and then throw the below error message.  If I delete the partially created table, and close the python shell (am running IDLE, but error also happens when just using command prompt), and the python script - I can reopen the script and run it using the very same file that threw the error.  However, the number of files that were processed continued to decline and now it will not run at all.  It is true that the files are read only but the script was working previously with read only files.

It may help to know that I am calculating zonal stats for NHD catchments by NHD region using the NHD regional catchment grids and a climate raster for the conterminous US - so fairly large spatial summaries.

I'll try the splittext option because you are correct Curtis, the strip option does not always work.

  File "F:\Data\Reg09\attributetifsReg09.py", line 56, in <module>
    arcpy.gp.ZonalStatisticsAsTable_sa(region, "VALUE", raster, out+name+".dbf","DATA", "ALL")

<class 'arcgisscripting.ExecuteError'>: ERROR 999999: Error executing function.
Workspace or data source is read only.
Workspace or data source is read only. [The F:\Data\Reg09\update1\ workspace is read only.]
ERROR 010067: Error in executing grid expression. Zonal statistics program failed
Failed to execute (ZonalStatisticsAsTable).

ERROR 999999: Error executing function.
Workspace or data source is read only.
Workspace or data source is read only. [The F:\Data\Reg09\update1\ workspace is read only.]
ERROR 010067: Error in executing grid expression. Zonal statistics program failed
Failed to execute (ZonalStatisticsAsTable).
0 Kudos
curtvprice
MVP Esteemed Contributor
[The F:\Data\Reg09\update1\ workspace is read only.]
ERROR 010067: Error in executing grid expression.
Failed to execute (ZonalStatisticsAsTable).


It looks to me like you are running into file-locking issues. If you are viewing the workspace in another application (say, you're running the tool in IDLE and viewing the workspace in ArcCatalog or ArcMap), and have a table open, if you try to overwrite it using your script, Windows will not give you security to do so. ArcMap for example has set a lock, and IDLE cannot overwrite the table. 

If you have Windows Explorer open in the folder, you can often see the lock files that Windows writes to keep track of those locks.

These KB articles have some useful information about file locks.

Why am I getting schema locks on shapefiles?
FAQ:  Why do file geodatabase .lock files remain after a process is finished
0 Kudos
JoannaWhittier
New Contributor III
I am still not quite understanding why there appears to be a lock that did not used to happen but have found a clumsy work around.  For whatever reason, my script will run two files and then throw the read-only error.  When I changed Sebastian's code from:

   if count%30 == 0:     # this is modulus operator, so should be true the first iteration and every 30 iterations thereafter.
     

to  if count%2 == 0:

python will continue to run and place 2 zone stats tables in sequentially numbered new folders.  Not a great solution but at least the bleeping thing is running.

Thank you again!  I'll check back in case you think of something else that could lead to the lock issue I have encountered.

Follow up: my 'solution' only worked for about 10 files and the read only error returned.
0 Kudos
HeatherWhitcomb
New Contributor
Hello, I'm a fairly new ArcMap 10 user.  I tried using the StatisticsforOverlappingZones tool that was posted above and I got the following error.  Any thoughts on what I might be doing wrong?  The script ran for more than 4 hours before it returned the error message.  I was hoping the fact that it was running for so long meant that it was actually executing.  Does this look like a memory issue?  I have 448 polygons in my shape file. 

Also, how would I create an index for the join field (I'm assuming that is the FID field) in the join table?

Here was my error message:
Executing: StatisticsForOverlappingZones C:\HeatherData\CHP_inMA.shp FID C:\HeatherData\ndvi_20050512.img C:\HeatherData\Greenness_CHP_inMA.shp
Start Time: Tue May 15 12:10:10 2012
Running script StatisticsForOverlappingZones...
<class 'arcgisscripting.ExecuteError'>: Failed to execute. Parameters are not valid.
WARNING 000970: The join field FID in the join table CHP_inMA is not indexed. To improve performance, we recommend that an index be created for the join field in the join table.
ERROR 000728: Field FID does not exist within table
Failed to execute (AddJoin).

Failed to execute (StatisticsForOverlappingZones).
Failed at Tue May 15 16:34:22 2012 (Elapsed Time: 4 hours 24 minutes 12 seconds)
0 Kudos
HeatherWhitcomb
New Contributor
We recently published a toolbox (first prize at the ESRI UC 2011 AppFair!) that includes tools that do overlap processing for statistics and area tabulations, among other things.

USGS Open-File Report 2010�??1268
National Water-Quality Assessment (NAWQA) Area-Characterization Toolbox
By Curtis V. Price, Naomi Nakagaki, and Kerie J. Hitt

http://pubs.usgs.gov/of/2010/1268/

The published version has only limited support for 10.x, but I have an updated version that will be posted there soon. If you are interested, let me know and I can send it your way.


Hi Curtis,

I am interested in checking out the tool for use with 10.  Is it available on the web?

Heather
0 Kudos
curtvprice
MVP Esteemed Contributor

> I am interested in checking out the tool for use with 10.  Is it available on the web?

A version of the NAWQA toolbox that runs under 10.x is available in ArcGIS Online here:

http://www.arcgis.com/home/item.html?id=cbb59504f59f4e18b23817fb0ef40e56 

0 Kudos
LuciaBarbato
New Contributor III
Hi Curtis,

We've been following this thread. We too, are having issues with zonal statistics as table with overlapping polygons. We have a very large dataset of over 80,000 land use features with about 2,500 overlaps. While the process runs, it does not output results for the overlapping features.  We are using 10.0 SP3. Is there a link to an updated tool that we could download?

Thanks,

Lucia
0 Kudos
curtvprice
MVP Esteemed Contributor
  Is there a link to an updated tool that we could download?


I've been posting updated versions here for review, please use at your own risk and send comments:

ftp://ftpext.usgs.gov/pub/nonvisible/cr/nact
0 Kudos