Sum 6,000+ rasters - Python script

11148
28
02-02-2016 02:05 AM
MathieuLacorde1
New Contributor

Hi there,

I am trying to sum close to 6,000 ESRI grids and found a piece of script on this forum that I have tried to adapt (attached). The script runs fine and prompts every grid but eventually crashes when it reaches the last raster. No output is created.

Would you have any idea as to why it crashes? The script seems to work fine when I test it on 5-10 grids. Could the reason be the 5,000 ESRI grid limit  that can be stored in a single workspace directory (Esri Grid format—Help | ArcGIS for Desktop) ? I did try on less than 5,000 grids but had the same outcome.

I am using ArcGIS 10.2.2 and Python 2.7.5.

Thanks for your help!

Mathieu

0 Kudos
28 Replies
MathieuLacorde1
New Contributor

Hi,

Sorry to only get back to you now,  I just had a chance to look at the script today... I removed the dashes and tried to use shorter pathnames. I ran the script and got an error message:

File "E:/Software/new_script_geonet.py", line 44, in <module>

    main()

  File "E:/Software/new_script_geonet.py", line 26, in main

    cellstat.save(outname)

RuntimeError: ERROR 010240: Could not save raster dataset to E:\BOM\daily_temp\daily_max_temp\MaxTemp.gdb\cellstat1 with output format FGDBR.

Any ideas on what I could modify in the script to get it right?

Thanks!

Mathieu

0 Kudos
DanPatterson_Retired
MVP Emeritus

try a different format... just add .tif to the output name

MathieuLacorde1
New Contributor

Hi there,

Struggling a bit as to where I should add the extension ".tif". I would say:

          outname = os.path.join(res_ws, "cellstat{0}".format(i).tif)

However, I still get an error message, different one though:

File "E:\Software\new_script_geonet_revB.py", line 45, in <module>

    main()

  File "E:\Software\new_script_geonet_revB.py", line 26, in main

    outname = os.path.join(res_ws, "cellstat{0}".format(i).tif)

AttributeError: 'str' object has no attribute 'tif'

0 Kudos
DanPatterson_Retired
MVP Emeritus

Well according to this Cell Statistics—Help | ArcGIS for Desktop

you should be able to save to an Esri grid format (with no file extension given)

But the only example that they give there is one with an img or grid format, but saved to a folder rather than a geodatabase.  So, I don't know why it won't save to the geodatabase but I do know that esri grids don't like anything in their paths other than letters, the underscore and maybe the odd number (not beginning any folder segment), although Xander's example suggests otherwise.  So you can try saving to a folder until Xander Bakker​ can comment

0 Kudos
XanderBakker
Esri Esteemed Contributor

Is it possible to share a small part of the data? It is a bit difficult what is going wrong without being able to see the data.

0 Kudos
XanderBakker
Esri Esteemed Contributor

The line you included doesn't add the ".tif" correctly. Change:

outname = os.path.join(res_ws, "cellstat{0}".format(i).tif)

into

outname = os.path.join(res_ws, "cellstat{0}.tif".format(i))

and be sure that the res_ws points to a folder and not a File Geodatabase.

XanderBakker
Esri Esteemed Contributor

The error refers to not being able to write to the output File Geodatabase. The Help states:

Check that a raster with the same name and format does not already exist in the output location. Also, check the Help for the technical specifications of raster dataset formats to make sure that the expected range of values in the output is compatible with the specified format

Just to be sure "MaxTemp.gdb" is a file geodatabase and you have privileges to write to that location?

And at what moment does the error occur,when writing the first raster to that location or when part of the data has been processed?

You could include "arcpy.env.overwriteOutput = True" at the beging of the scipt (after "import arcpy")

JamesCrandall
MVP Frequent Contributor

res_ws = r'E:\BOM\daily-temperature\daily-maximum-temperature\yourGDB.gdb' # I prefer gdb over folder

I agree.  Not sure how many times it has occurred but simply moving things into a FGDB resolves various issues I've run into.  Pretty much I've just accepted the fact that there's probably a darn good reason why ESRI engineers have fully integrated the fgdb (Default.gdb) into the ArcGIS product, so why not follow?

curtvprice
MVP Esteemed Contributor

Hey James.

The fgdb is a great place to store rasters, but no so efficient for processing.  Many raster tools are behind the scenes writing Esri grids behind the scenes anyway. For heavy raster processing like this I recommend setting the scratch and current workspace to a folder (the same folder) and then writing the output (using .save() or copy raster) to file geodatabase if you want to store it there.

I agree that fgdb is the format of choice for features and tables.

curtvprice
MVP Esteemed Contributor

> The thing is I have to run this script on another 10,000 files and I was hoping to get it to work on the whole workspace.

There is a hard limit of about 3000 to 5000 Esri grids in a folder (See help)​ you linked it but the number is less than 5000 if there are integer grids.

I took a closer look at the script and I see you are adding raster objects together and so on. This can lead to many temporary rasters being written to the scratch workspace. While you're script runs you can probably watch them being created. Given the number of rasters you are throwing at this, I can see you hitting that 10,000 limit in your scratch workspace. Deleting raster objects with del may delete some of those rasters in your loop, but using the CellStatistics command with DATA as suggested by Xander  is a much better option and undoubtedly much more efficient than your original approach.

In general this is a lot of files to put in one folder regardless of format. If the folder must be navigated, you will spend a lot of time waiting in Catalog or Explorer for the folder to be listed. It's really better practice to organize your data so you have fewer files split into multiple folders (or workspaces). You can then generate a really long list using arcpy.da.walk() and split the list into chunks for your Cell Statistics operation.