JamesRamm

Issues with multiprocessing and spatial analyst

Discussion created by JamesRamm on Feb 27, 2014
Latest reply on Sep 26, 2014 by JamesRamm
I have read every blog post and thread I can find on multiprocessing with arcpy and none of the fixes in them have fully addressed my problem.

I'm trying to do a relatively simple watershed calculation using multiprocessing.
The 'worker' function looks like this:
import os
import tempfile

import arcpy
from arcpy import sa

def multi_watershed(pnts, branchID, flowdir, flowacc, scratchWks):

    # If called in a parallel process, each call needs to write to a separate directory
    direc = tempfile.mkdtemp(dir=scratchWks)
    arcpy.env.scratchWorkspace = direc

    polylist = []
    for i, p in enumerate(pnts):
        pnt = arcpy.PointGeometry(arcpy.Point(p.x, p.y, ID=i))  # Convert the shapely point to an arcpy point
        pourpt = sa.SnapPourPoint(pnt, flowacc, 1000)  # Snap to high flow accumulation
        ws = sa.Watershed(flowdir, pourpt)  # Watershed raster for this pour point
        out = os.path.join(direc, "pol_%i" % i)  # Generate a filename for the output polygon
        arcpy.RasterToPolygon_conversion(ws, out)
        polylist.append(out)  # Append the output file to the list to be returned
    res = (branchID, polylist)
    return res


Given a list of points, it snaps each point to the cell of highest flow accumulation within the snap distance (1000 here), calculates the watershed for that pour point, and converts the watershed raster to a polygon.
Using one process, this works fine.
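
The directory-isolation step in the worker can be checked without arcpy. This is a minimal sketch using only the standard library; `make_worker_dir` and `output_path` are hypothetical helpers that mirror the `mkdtemp` and `os.path.join` calls in `multi_watershed`. Each call to `tempfile.mkdtemp` creates a new, uniquely named folder under the shared scratch workspace, so the same file name (`pol_0`) from two different calls should not collide:

```python
import os
import shutil
import tempfile


def make_worker_dir(scratch_root):
    """Create a unique sub-directory of scratch_root for one worker call.

    Hypothetical helper, not part of arcpy; it mirrors the mkdtemp call
    in multi_watershed.
    """
    return tempfile.mkdtemp(dir=scratch_root)


def output_path(worker_dir, index):
    """Build the output name the same way the worker does."""
    return os.path.join(worker_dir, "pol_%i" % index)


if __name__ == "__main__":
    root = tempfile.mkdtemp()  # stands in for scratchWks
    try:
        a = make_worker_dir(root)
        b = make_worker_dir(root)
        print(output_path(a, 0))
        print(output_path(b, 0))  # same file name, different folder
    finally:
        shutil.rmtree(root)
```

This only shows that the output paths are distinct; it says nothing about other locations arcpy may write to while the tools run.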

I have a dictionary where each value is a list of points and I am trying to do the multiprocessing over this dictionary. The multiprocessing function looks like this:

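from multiprocessing import Pool

def watershed_pll(data, flowdir, flowacc, tempfolder, proc=4):
    """ Calculate the watershed for each station point using parallel processing """
    pool = Pool(processes=proc)
    jobs = []
    for key, val in data.iteritems():
        # One job per dictionary entry; tempfolder is the shared scratch workspace
        jobs.append(pool.apply_async(multi_watershed, (val, key, flowdir, flowacc, tempfolder)))
    pool.close()  # No more jobs will be submitted
    pool.join()   # Wait for all workers to finish
    return jobs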


It is as simple as can be and just returns the list of ApplyResult objects, one per dictionary entry. I then run this function from a script.
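
Because `apply_async` returns immediately, an exception raised inside a worker only surfaces when `get()` is called on the corresponding result object. Here is a minimal sketch of collecting the results per branch, with a stand-in worker in place of `multi_watershed` so that it runs without arcpy; `fake_watershed`, `run_jobs`, and the sample data are hypothetical names, not part of the original script. On Windows the pool also has to be created under an `if __name__ == "__main__":` guard:

```python
from multiprocessing import Pool


def fake_watershed(pnts, branch_id):
    """Stand-in for multi_watershed: returns (branch_id, output names)."""
    if not pnts:
        raise ValueError("branch %r has no points" % (branch_id,))
    return branch_id, ["pol_%i" % i for i, _ in enumerate(pnts)]


def run_jobs(pool, worker, data):
    """Submit one job per dictionary entry and gather results and errors."""
    jobs = {}
    for key, val in data.items():
        jobs[key] = pool.apply_async(worker, (val, key))
    results, errors = {}, {}
    for key, job in jobs.items():
        try:
            # get() blocks until the job finishes and re-raises any
            # exception that occurred in the worker.
            branch_id, polys = job.get()
            results[branch_id] = polys
        except Exception as exc:
            errors[key] = exc
    return results, errors


if __name__ == "__main__":
    data = {"a": [(0, 0), (1, 1)], "b": []}
    pool = Pool(processes=2)
    try:
        results, errors = run_jobs(pool, fake_watershed, data)
    finally:
        pool.close()
        pool.join()
    print(results)
    print(errors)
```

Keeping the errors keyed by branch makes it possible to see which list of points failed and with which message, instead of only seeing the failure in the console output.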
When using multiprocessing, sometimes it works, but more often than not I get one of these errors:

ERROR 010088: Invalid input geodataset (Layer, Tin, etc.).

or

Unable to remove directory.  Possible causes:
1- Not owner of the directory
2- Another person or application is accessing this directory

or even

FATAL ERROR(INFADI)
MISSING FILE OR DIRECTORY

There seems to be no pattern as to if/when these errors will occur and which one it will be...

Any ideas?
