
Multiprocessing with Spatial Analyst

Question asked by dwightlanier on May 22, 2013
Latest reply on Jun 2, 2016 by Hornbydd
Hi all,

A few questions on using multiprocessing with SA tools...

I am using ArcGIS 10.0, Windows 7, SP3, and I am fine running the Python script standalone outside of ArcGIS; there is no need to make a tool out of it.

First, when setting up the basic structure of the script, where should I make the import calls for SA, specifically from arcpy.sa import *? Is this done once alongside the other imports (sys, os, etc.), or does it need to be done within the worker function? And what about checking out the extension [arcpy.CheckOutExtension("Spatial")]: should that be done in main or within the worker function?

Will any of this even work in the multiprocessing environment if many processes try to check out the SA extension at the same time?

If this is possible, my simple understanding of the setup would look something like this:

# Import modules...
import arcpy
from arcpy.sa import *
import multiprocessing
import os

# Define the worker function...
def calcVS(vsFC, elevationRas, outputWS, count):

    arcpy.CheckOutExtension("Spatial")

    scratchWS = os.path.join(outputWS, "VS_%s" % (count))
    os.mkdir(scratchWS)
    arcpy.env.scratchWorkspace = scratchWS

    outVS = arcpy.sa.Viewshed(elevationRas, vsFC)
    outVS.save("vs_%s.img" % (count))

    return "Completed: vs_%s" % (count)

if __name__ == "__main__":

    # list of shapefiles to be used as observers,
    # pretend this is populated...
    fcs = []
    elevRas = r"C:\Working01\DEM.img"
    outWS = r"C:\Working01\Test01"

    cores = multiprocessing.cpu_count()

    # Start pool
    pool = multiprocessing.Pool(cores - 1)
    jobs = []

    counter = 1
    for fc in fcs:
        jobs.append(pool.apply_async(calcVS, (fc, elevRas, outWS, counter)))
        counter += 1  # give each job its own number

    # Clean up pool...
    pool.close()
    pool.join()

    # Print results
    results = [job.get() for job in jobs]
    print results



In this example I have taken into account something I saw in another post about needing to create a separate scratch workspace for ArcGIS to use on each run: [http://gis.stackexchange.com/questions/17869/multiprocessing-errors-arcgis-implementation]
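The per-job scratch folder idea can be sketched on its own, without arcpy. This is only an illustration: make_job_scratch is a hypothetical helper name, a temporary folder stands in for outWS, and the assignment to arcpy.env.scratchWorkspace is shown only as a comment. The one change from the worker above is that the folder is created only if it is missing, since os.mkdir raises an error when the folder is left over from an earlier run.

```python
import os
import tempfile


def make_job_scratch(output_ws, count):
    """Create (if needed) and return a scratch folder unique to one job."""
    scratch_ws = os.path.join(output_ws, "VS_%s" % count)
    # os.mkdir raises if the folder already exists (for example on a rerun),
    # so check first. Each job uses its own name, so jobs do not collide.
    if not os.path.isdir(scratch_ws):
        os.makedirs(scratch_ws)
    return scratch_ws


if __name__ == "__main__":
    # Stand-in for outWS; a temporary folder keeps the sketch self-contained.
    out_ws = tempfile.mkdtemp()
    for count in (1, 2, 3):
        print(make_job_scratch(out_ws, count))
    # Inside the real worker the returned path would then be assigned:
    #     arcpy.env.scratchWorkspace = scratch_ws
```

This only works if every job receives a different count; if two jobs share a number they end up sharing a scratch folder again.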

So my problem thus far is that when I have everything set up this way, the multiprocessing kicks in and all the cores fire up and start working, but when I monitor the output directory, nothing is written. I would expect at least the first step, creating a new directory for the temporary scratch workspace, to execute, but it doesn't. I have commented out the CheckOutExtension call just to see whether each job will at least perform that part, but no luck.

The one error I do get doesn't help much, at least not for me but maybe it will turn on a light for someone else:

Traceback (most recent call last):
  File "C:/Working01/VS_MP.py", line xx, in <module>
    results = [job.get() for job in jobs]
  File "C:\Python26\ArcGIS10.0\lib\multiprocessing\pool.py", line 422, in get
    raise self._value
RuntimeError: <unprintable RuntimeError object>


I suspect that either it doesn't like me calling get() on whatever I return from the worker function, or the error raised inside the function cannot be printed. I have tried several things to troubleshoot this. I commented out the return from the worker, so that it only performs the viewshed calculation and saves the raster, and then commented out the "results" call. Am I running into problems because of the way I'm using apply_async to add jobs? Do I need to do something with them that I'm not doing? I use this method successfully for lots of other multiprocessing of vector data. I think I have already tried the pool.map function, but I will try it again...
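One way to narrow this down is to catch the error inside the worker and send the traceback back as plain text, instead of letting job.get() re-raise the exception object in the parent. The sketch below assumes, without confirmation, that the "unprintable" message appears because the re-raised exception cannot be converted to a string. The names risky_worker and safe_worker are hypothetical, and risky_worker is only a stand-in for calcVS so the example runs without arcpy. It is written to run under both Python 2.6 and Python 3.

```python
import multiprocessing
import traceback


def risky_worker(count):
    """Stand-in for calcVS; fails on even counts to simulate a tool error."""
    if count % 2 == 0:
        raise RuntimeError("simulated failure in job %s" % count)
    return "Completed: vs_%s" % count


def safe_worker(count):
    """Run risky_worker and always return a plain (ok, message) tuple."""
    try:
        return (True, risky_worker(count))
    except Exception:
        # format_exc() returns the full traceback as a string, which can
        # be sent back to the parent process like any other return value.
        return (False, traceback.format_exc())


if __name__ == "__main__":
    pool = multiprocessing.Pool(2)
    jobs = [pool.apply_async(safe_worker, (n,)) for n in range(1, 5)]
    pool.close()
    pool.join()
    for ok, message in [job.get() for job in jobs]:
        print("%s: %s" % ("OK" if ok else "FAILED", message))
```

With this pattern job.get() no longer raises for errors caught in the worker, so the main block has to check the ok flag itself. The traceback text should show which line of the worker actually failed.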

I know that's a lot to ask, but haven't seen any examples yet on using SA within python multiprocessing. Thanks in advance for any guidance...

Dwight
