Arcpy NetCDF cache problems

666
3
Jump to solution
07-11-2012 07:56 AM
ChristopherStrother
New Contributor II
I am processing 9 tiles (.nc files) with 27 years of records each. This modified script runs fine for a while and then gives a Failure to Execute. I assume that the arcpy cache is reaching a limit of some sort. I have searched the forums and most point me to numpy or netcdf4 modules that I can't understand. Can someone tell me a simpler way to modify this code to address the problem?

#Convert NetCDF files to Imagine img files #Jason Geck jgeck@alaskapacific.edu #Make a separate raster *.img file for each dimension value (time) in a netcdf file # 2012.07.03 Modified by Sergio Bernardes/Chris Strother  import arcpy, os, time, datetime, calendar  # Set local variables arcpy.env.overwriteOutput = True arcpy.env.scratchWorkspace = "C:\\Scratch\\"   path = "C:\\PRCP\\" path_rasters = "C:\\PRCPRasters\\"  extension = ".nc"  dirList=os.listdir(path)  #Loops thru all days of year (including leap days) #Year info Yr = range (1982,2008) #Month info allmnths = range(1,13)  for fname in dirList:     if fname.lower().endswith(extension):          tile, Yr, datatype = fname.split("_")         intYr = int(Yr)                  #print tile, yr, datatype              for mnths in allmnths:             Lastday = calendar.monthrange(intYr, mnths)[1]             MRange = range(1,Lastday+1)             for dyy in MRange:                 dyys =int(dyy)                 mnthsss = int(mnths)                 yrs = int(Yr)                 stringMonth = str(mnthsss)                 stringDays = str(dyys)                                  a = stringMonth.zfill(2) + "/" + stringDays.zfill(3)+"/"+Yr                 b =  stringMonth.zfill(2) + "_" + stringDays.zfill(3) +"_"+Yr                                  print a,b                                  inDate = "time " + a                 prcp_nc = path + fname                  prcp_Layer = "prcp_Layer" + b                 test_img = path_rasters + tile + "_" + b + ".img"                  # Process: Make NetCDF Raster Layer                 arcpy.MakeNetCDFRasterLayer_md(prcp_nc, "prcp", "x", "y", prcp_Layer, "", inDate, "BY_VALUE")                 print "Created NetCDF Layer for " + prcp_Layer                                              # Process: Copy Raster                 arcpy.CopyRaster_management(prcp_Layer, test_img, "", "", "", "NONE", "NONE", "")                 print "Created Raster for " + prcp_Layer                   print "------------- finished"
Tags (2)
0 Kudos
1 Solution

Accepted Solutions
PhilMorefield
Occasional Contributor III
Chris-

After looking over your script and the Daymet files, I'm going back on what I said. Going outside of ArcGIS really is the best solution (IMHO) for bulk processing netCDF data, but would require more background and a more complex script to handle all of the calendar/date information.

To start with, keep an eye on the syntax for the arcpy.MakeNetCDFRasterLayer_md. I've highlighted a change in blue that makes the input match the documentation. But when you export to Python snippet, it shows the syntax as you already had it. I'm not sure which one is 'correct'.

Here's what I think the problem is: you're creating a new Raster Layer ('prcp_Layer') at each iteration, but never deleting them. I think ArcMap and/or your RAM are getting clogged up with files in memory, which is all a Raster Layer is. So your original hunch may be correct!

I've made some changes in red to your original script which tidies things up. I've also put everything in side of a try/except clause, so that we can get more information if you continue to have problems.

#Convert NetCDF files to Imagine img files #Jason Geck jgeck@alaskapacific.edu #Make a separate raster *.img file for each dimension value (time) in a netcdf file # 2012.07.03 Modified by Sergio Bernardes/Chris Strother  import arcpy, os, time, datetime, calendar, traceback, sys  # Set local variables arcpy.env.overwriteOutput = True arcpy.env.scratchWorkspace = "C:\\Scratch\\"   path = "C:\\PRCP\\" path_rasters = "C:\\PRCPRasters\\"  extension = ".nc"  dirList=os.listdir(path)  #Loops thru all days of year (including leap days) #Year info Yr = range (1982,2008) #Month info allmnths = range(1,13)  try:     for fname in dirList:         if fname.lower().endswith(extension):              tile, Yr, datatype = fname.split("_")             intYr = int(Yr)              #print tile, yr, datatype              for mnths in allmnths:                 Lastday = calendar.monthrange(intYr, mnths)[1]                 MRange = range(1,Lastday+1)                 for dyy in MRange:                     dyys =int(dyy)                     mnthsss = int(mnths)                     yrs = int(Yr)                     stringMonth = str(mnthsss)                     stringDays = str(dyys)                      a = stringMonth.zfill(2) + "/" + stringDays.zfill(3)+"/"+Yr                     b =  stringMonth.zfill(2) + "_" + stringDays.zfill(3) +"_"+Yr                      print a,b                      inDate = "time " + a                     prcp_nc = path + fname                     prcp_Layer = "prcp_Layer" + b                     test_img = path_rasters + tile + "_" + b + ".img"                      # Process: Make NetCDF Raster Layer                     arcpy.MakeNetCDFRasterLayer_md(prcp_nc, "prcp", "x", "y", prcp_Layer, "", [["time", inDate]], "BY_VALUE")                     print "Created NetCDF Layer for " + prcp_Layer                      temp = arcpy.Raster(prcp_Layer)  temp.save(test_img)  arcpy.Delete_management(prcp_Layer)  del temp                      # Process: Copy Raster                     # arcpy.CopyRaster_management(prcp_Layer, test_img, "", "", "", "NONE", "NONE", "")                     print "Created Raster for " + prcp_Layer      print "------------- finished" except:  tbinfo = traceback.format_tb(sys.exc_info()[2])  print "Traceback Info:\n"  for item in tbinfo:  print item + "\n"  print "Error Info:\n{0}: {1}\n".format(sys.exc_type, sys.exc_value)

View solution in original post

0 Kudos
3 Replies
PhilMorefield
Occasional Contributor III
Chris-

First off, if netCDF data are something you'll be working with regularly, I would strongly urge you to spend some time learning how to use a netCDF module in Python. There are several, some of which work fine outside of ArcGIS but cannot be used in script tools and such. Crunching many/large netCDF files with arcpy is clunky, at best (sorry ESRI!). Let me know if you decide to go that route and I can save you a lot of time figuring out where to start.

More than likely, your problem is occurring when you make your calls to arcpy. I would use print statements to check each and every one of the input parameters to those tools and make sure your variables are correct.

How large are your netCDF files? Unless they are extremely resolved spatially, I doubt it's a memory problem.

If you don't find anything there, pin down what line is throwing the error. Are any output files being produced and, if so, are they correct?
0 Kudos
ChristopherStrother
New Contributor II


How large are your netCDF files? Unless they are extremely resolved spatially, I doubt it's a memory problem.

If you don't find anything there, pin down what line is throwing the error. Are any output files being produced and, if so, are they correct?


Phil,
Each file is around 75MB. Output files are being produced and they are correct (.img rasters). The reason we think it's a memory problem is that the process dies each time at a different file. The initial code included the creation of .lyr files as well. Once we deleted that process, it ran longer before terminating.

Our goal is to extract the days with precip >= 25 mm from 1982 to 2008 for 9 different spatial tiles (from Daymet data source). I have 243 .nc files that have to be queried and I was trying to get the rasters created so that I could do something in Model Builder to query them.

If you think a NetCDF module would work best for this and can point me in the direction of one that could be used by someone with limited Python experience, I would appreciate it!
0 Kudos
PhilMorefield
Occasional Contributor III
Chris-

After looking over your script and the Daymet files, I'm going back on what I said. Going outside of ArcGIS really is the best solution (IMHO) for bulk processing netCDF data, but would require more background and a more complex script to handle all of the calendar/date information.

To start with, keep an eye on the syntax for the arcpy.MakeNetCDFRasterLayer_md. I've highlighted a change in blue that makes the input match the documentation. But when you export to Python snippet, it shows the syntax as you already had it. I'm not sure which one is 'correct'.

Here's what I think the problem is: you're creating a new Raster Layer ('prcp_Layer') at each iteration, but never deleting them. I think ArcMap and/or your RAM are getting clogged up with files in memory, which is all a Raster Layer is. So your original hunch may be correct!

I've made some changes in red to your original script which tidies things up. I've also put everything in side of a try/except clause, so that we can get more information if you continue to have problems.

#Convert NetCDF files to Imagine img files #Jason Geck jgeck@alaskapacific.edu #Make a separate raster *.img file for each dimension value (time) in a netcdf file # 2012.07.03 Modified by Sergio Bernardes/Chris Strother  import arcpy, os, time, datetime, calendar, traceback, sys  # Set local variables arcpy.env.overwriteOutput = True arcpy.env.scratchWorkspace = "C:\\Scratch\\"   path = "C:\\PRCP\\" path_rasters = "C:\\PRCPRasters\\"  extension = ".nc"  dirList=os.listdir(path)  #Loops thru all days of year (including leap days) #Year info Yr = range (1982,2008) #Month info allmnths = range(1,13)  try:     for fname in dirList:         if fname.lower().endswith(extension):              tile, Yr, datatype = fname.split("_")             intYr = int(Yr)              #print tile, yr, datatype              for mnths in allmnths:                 Lastday = calendar.monthrange(intYr, mnths)[1]                 MRange = range(1,Lastday+1)                 for dyy in MRange:                     dyys =int(dyy)                     mnthsss = int(mnths)                     yrs = int(Yr)                     stringMonth = str(mnthsss)                     stringDays = str(dyys)                      a = stringMonth.zfill(2) + "/" + stringDays.zfill(3)+"/"+Yr                     b =  stringMonth.zfill(2) + "_" + stringDays.zfill(3) +"_"+Yr                      print a,b                      inDate = "time " + a                     prcp_nc = path + fname                     prcp_Layer = "prcp_Layer" + b                     test_img = path_rasters + tile + "_" + b + ".img"                      # Process: Make NetCDF Raster Layer                     arcpy.MakeNetCDFRasterLayer_md(prcp_nc, "prcp", "x", "y", prcp_Layer, "", [["time", inDate]], "BY_VALUE")                     print "Created NetCDF Layer for " + prcp_Layer                      temp = arcpy.Raster(prcp_Layer)  temp.save(test_img)  arcpy.Delete_management(prcp_Layer)  del temp                      # Process: Copy Raster                     # arcpy.CopyRaster_management(prcp_Layer, test_img, "", "", "", "NONE", "NONE", "")                     print "Created Raster for " + prcp_Layer      print "------------- finished" except:  tbinfo = traceback.format_tb(sys.exc_info()[2])  print "Traceback Info:\n"  for item in tbinfo:  print item + "\n"  print "Error Info:\n{0}: {1}\n".format(sys.exc_type, sys.exc_value)
0 Kudos