Multiprocessing: Arguments and Errors

Discussion created by dwightlanier on Apr 30, 2013
Latest reply on Apr 30, 2013 by mahunter243
Hi all,

Using ArcGIS 10.0 SP3 on Windows 7.

This is my first time working with multiprocessing.  What I have right now works up until everything goes into multiprocessing (the CPU use spikes for all processors, etc.), but then nothing is accomplished by any of the subprocesses.  One of the first statements inside the function being called prints a simple line that should ID the data being worked on, or at least let me know that it's alive.  After that are some simple folder creation steps using os, but no statement is printed, no folders are ever created, and no errors are thrown.  The CPU use drops off to 1% after the first couple of seconds and it just sits there, never erroring out or seeming to do anything.

I realize I probably have some fundamental misunderstanding about how processing works with this module, and would greatly appreciate any help with a few questions.  It may even be something related to this version of ArcMap that has been addressed in newer versions.

First would be how best to pass arguments along to the function when called using multiprocessing pools.

All of the examples I've seen have something like the following (not quite pseudocode :p):

import os
import multiprocessing
import arcpy

# Define function to do work...
def someFunction(shp):
    pass  # do this with shp

# Data to do work on...
fcs = [shp1, shp2, shp3]

# Call multiprocessing module...
pool = multiprocessing.Pool()
pool.map(someFunction, fcs)

In the examples of this type that I have seen, the function that will do the work is identified, and only a list of the items to be worked on (each assigned to its own process) is passed along.  My question is, what if I have a function that takes multiple arguments?  Should it work if I submit something like the following?

fcs = [[shp1, 500], [shp2, 350], [shp3, 200]]

So that pool.map() passes a list into the function for each item, containing the item and its arguments?

In case that was not correct, I also tried it this way, so that only one string is passed per item and then parsed inside the function:

fcs = ["shp1,500", "shp2,350", "shp3,200"]

def someFunction(item):
    # Split the combined "shp,dist" string once, before either name is reassigned
    shp, dist = item.split(",")

Neither approach is working, so I'm hoping someone can shed some light on this for me.
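For reference, here is a minimal sketch of the list-per-item approach described above. pool.map() hands the function exactly one item per call, so the function has to unpack that item itself. The names some_function and main, the string inputs, and the returned message are hypothetical placeholders and not arcpy calls; the real work would go where the return statement is.

```python
import multiprocessing


def some_function(args):
    # pool.map() passes one item per call, so unpack the pair here.
    shp, dist = args
    # Placeholder for the real work on `shp` using `dist`.
    return "%s with distance %d" % (shp, dist)


def main():
    # Hypothetical inputs: each inner list holds one task's arguments.
    fcs = [["shp1", 500], ["shp2", 350], ["shp3", 200]]
    pool = multiprocessing.Pool()
    try:
        results = pool.map(some_function, fcs)
    finally:
        pool.close()
        pool.join()
    for line in results:
        print(line)


# On Windows, child processes re-import this module, so the pool
# must only be created under this guard.
if __name__ == "__main__":
    main()
```

The sketch assumes it is run as a standalone script from the command line. Returning a value from the function, rather than printing inside it, brings the output back to the parent process where it can be displayed.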

The second part is a request for clarification on what to expect in terms of error messages, print statements, etc. from inside multiprocessing calls.  If I want to see where things may be going wrong, how do I do this?  I have tried using print statements inside the function being called, but nothing prints.  I have tried try/except blocks that use "except Exception, e" and "print e" (and also tried writing to log files), but no luck.
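One pattern that might help with the debugging question is sketched below: catch the exception inside the worker and return the formatted traceback as ordinary data, so the parent process can print or log it. Output printed by a child process may not show up in the parent's window, depending on where the script is launched from. Here do_work and safe_worker are hypothetical names, and the ValueError only simulates a failure.

```python
import multiprocessing
import traceback


def do_work(shp):
    # Hypothetical stand-in for the real geoprocessing step.
    if not shp.endswith(".shp"):
        raise ValueError("not a shapefile: %s" % shp)
    return "processed %s" % shp


def safe_worker(shp):
    # Catch errors in the child and hand them back as data,
    # so the parent decides how to report them.
    try:
        return (shp, True, do_work(shp))
    except Exception:
        return (shp, False, traceback.format_exc())


def main():
    items = ["a.shp", "b.txt"]  # hypothetical inputs; the second one fails
    pool = multiprocessing.Pool(2)
    try:
        results = pool.map(safe_worker, items)
    finally:
        pool.close()
        pool.join()
    for name, ok, detail in results:
        print("%s: %s" % (name, "ok" if ok else "FAILED"))
        print(detail)


# Required on Windows, where child processes re-import this module.
if __name__ == "__main__":
    main()
```

Because each worker returns a tuple instead of raising, one failing item does not abort the whole pool.map() call, and every traceback ends up in the parent where it is visible.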

Thanks for any info,