Arcpy Multiprocessor issues

Discussion created by stacyrendall on Jun 27, 2011
Latest reply on Apr 5, 2013 by philthiem

I am currently writing an arcpy script that churns through some large datasets and extracts data; to make this quicker I figured it might be worth using something like Parallel Python or the inbuilt Multiprocessing library. Ideally I require the script to be run from within Arc (well... there seems to be an inescapable Schema Lock problem using direct iterators within the script, which can be circumvented by using modelbuilder iterators that run the script once per iteration, but that is another story).

I can get a script using either method to run fine outside of Arc (still calling arcpy), but when it is run from within Arc the processing job gets part way through before failing:

  • With Parallel Python it gets to the point where job_server = pp.Server() is defined, then opens another instance of ArcMap if the script is being run from ArcMap, or Catalog if running from Catalog. From Catalog the script sits waiting (progress bar doing its swishy thing, but no processing occurring) until this new instance is closed, at which point the script fails with the following:
    <class 'struct.error'>: unpack requires a string argument of length 8
    Failed to execute (scriptPP).
    When running from ArcMap, the new instance initially shows an error dialog stating "Could not open the specified file"; after clicking OK it then proceeds as above (same error as from Catalog, only occurs after closing ArcMap). This usually causes AppROT to fail as well.

  • With Multiprocessing it gets as far as actually running the module, then again attempts to open a new instance of whichever program it was started from. If running from ArcMap it opens an instance for each process initiated, each of which starts with an error dialog stating "Could not find file: from multiprocessing.forking import main; main().mxd"; upon clicking OK the instance opens as normal. Upon closing all new instances, the initial ArcMap becomes unresponsive (sometimes the processing dialog is still there, swishing away, sometimes it has disappeared), although there is no error message.

  • Running from Catalog, it brings up an error dialog stating that "C:\Program Files (x86)\ArcGIS\Desktop10.0\Bin\from multiprocessing.forking import main; main() was not found.", once for each instance. Then the progress dialog and ArcCatalog both become unresponsive...

I have created a simple python script for Multiprocessing which can reproduce the error; running fine if run directly, but causing errors as stated above when running from Arc, see below. Is anyone else familiar with these issues? Can they be confirmed on other setups? It appears that Arc is somehow trying to open parts of the modules that the processes are using, but I am not sure why. Any help would be greatly appreciated!

import arcpy
import multiprocessing


def simpleFunction(arg):
    # Trivial worker standing in for the real data extraction
    return arg * 3


if __name__ == "__main__":
    arcpy.AddMessage(" Multiprocessing test...")
    jobs = []
    pool = multiprocessing.Pool()
    for i in range(10):
        job = pool.apply_async(simpleFunction, (i,))
        jobs.append(job)  # keep each handle so its result can be collected

    for job in jobs:  # collect results from the job server
        print job.get()

    del job, jobs
    arcpy.AddMessage(" Complete")
    print "Complete"
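
The "... main().mxd" dialogs look as though each worker process is being started with the host application (ArcMap or ArcCatalog) instead of a Python interpreter. On Windows, multiprocessing launches children using sys.executable unless multiprocessing.set_executable() is called first. The following is only a rough diagnostic sketch of that idea, not a confirmed fix: pick_worker_executable is a hypothetical helper, and the location of python.exe directly under sys.exec_prefix is an assumption about the install layout.

```python
import multiprocessing
import os
import sys


def pick_worker_executable(host_executable, prefix):
    """Return the interpreter path to use for child processes.

    If the hosting executable already looks like a Python interpreter it is
    returned unchanged; otherwise python.exe under `prefix` is suggested.
    """
    name = os.path.basename(host_executable).lower()
    if name.startswith("python"):
        return host_executable
    # Assumption: python.exe sits directly under the install prefix.
    return os.path.join(prefix, "python.exe")


if __name__ == "__main__":
    worker = pick_worker_executable(sys.executable, sys.exec_prefix)
    print("host executable:   %s" % sys.executable)
    print("worker executable: %s" % worker)
    # Only redirect child processes if the suggested interpreter really exists.
    if worker != sys.executable and os.path.isfile(worker):
        multiprocessing.set_executable(worker)
```

If the first line printed inside Arc is ArcMap.exe or ArcCatalog.exe, that would be consistent with the dialogs described above. In that case the set_executable() call would need to run before multiprocessing.Pool() is created in the test script.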

ArcGIS Desktop 10.0 sp2, ArcInfo Licence running on Windows 7 with Core i7 and 8GB RAM.