I created a Python script that uses the multiprocessing module to take advantage of a multi-core computer. It was written for ArcMap 10.3. It ran fine in IDLE, but when I tried to wire it into a Script Tool interface so I could expose it as a tool in ArcToolbox, I started to have problems...
With some great help from the community on GIS SE I was finally able to get it working. The solution was rather obscure, so I am documenting it here for others. As you will see, the script does nothing amazing, but it provides a template to get you up and running, and hopefully you won't waste time as I did trying to work out why it was not working...
I have a polyline dataset that I want to clip to a polygon dataset. Due to the nature of the polygon dataset the clip tool was taking a long time to process. So I decided to use multiprocessing to speed things up.
The script that is wired up to the Script Tool interface is very simple and is shown below. Note that it imports another module, which I called multicode; that Python file sits in the same folder as this script.
'''
Title:
    multiexample

Description:
    This simple script is the script that is wired up into toolbox
'''
import arcpy
import multicode

# Get parameters (These are Layer objects)
clipper = arcpy.GetParameter(0)
tobeclipped = arcpy.GetParameter(1)

def main():
    arcpy.AddMessage("Calling multiprocessing code...")
    multicode.multi(clipper,tobeclipped)

if __name__ == '__main__':
    main()
Important: When wiring up this script to an interface make sure Run Python script in process is UNTICKED under the source tab!
My main multiprocessing code is shown below. Take note of the limitations. I've tried to document it so that it is understandable.
"""
Title:
    multicode

Description:
    The module that does multicore clipping. It will take as input a polygon layer and another layer.
    For each polygon it will clip the dataset into a new separate shapefile.

Limitations:
    This code expects the folder c:\temp\tc to exist, this is where the output ends up. As geoprocessing
    objects cannot be "pickled" the full path to the dataset is passed to the worker function. This means
    that any selection on the input clipper layer is ignored.

Author:
    Duncan Hornby (email@example.com)

Created:
    2/4/15
"""
import os,sys
import arcpy
import multiprocessing
from functools import partial

def doWork(clipper,tobeclipped,oid):
    """
    Title:
        doWork

    Description:
        This is the function that gets called and does the work. The parameter oid comes from the idList
        when the function is mapped by pool.map(func,idList) in the multi function.

        Note that this function does not try to write to arcpy.AddMessage() as nothing is ever displayed.

        If the clip succeeds then it returns TRUE else FALSE.
    """
    try:
        # Each clipper layer needs a unique name, so use oid
        arcpy.MakeFeatureLayer_management(clipper,"clipper_" + str(oid))

        # Select the polygon in the layer, this means the clip tool will use only that polygon
        descObj = arcpy.Describe(clipper)
        field = descObj.OIDFieldName
        df = arcpy.AddFieldDelimiters(clipper,field)
        query = df + " = " + str(oid)
        arcpy.SelectLayerByAttribute_management("clipper_" + str(oid),"NEW_SELECTION",query)

        # Do the clip
        outFC = r"c:\temp\tc\clip_" + str(oid) + ".shp"
        arcpy.Clip_analysis(tobeclipped,"clipper_" + str(oid),outFC)
        return True
    except:
        # Some error occurred so return False
        return False

def multi(clipper,tobeclipped):
    try:
        arcpy.env.overwriteOutput = True

        # Create a list of object IDs for clipper polygons
        arcpy.AddMessage("Creating Polygon OID list...")
        descObj = arcpy.Describe(clipper)
        field = descObj.OIDFieldName
        idList = []
        with arcpy.da.SearchCursor(clipper,[field]) as cursor:
            for row in cursor:
                # row is a tuple holding one value, the object ID
                idList.append(row[0])
        arcpy.AddMessage("There are " + str(len(idList)) + " object IDs (polygons) to process.")

        # Call doWork function, this function is called once for each OID in idList
        # This line creates a "pointer" to the real function but it's a nifty way of declaring parameters.
        # Note the layer objects are passing their full path as layer objects cannot be pickled
        func = partial(doWork,clipper.dataSource,tobeclipped.dataSource)

        arcpy.AddMessage("Sending to pool")
        # declare number of cores to use, use 1 less than the max
        cpuNum = multiprocessing.cpu_count() - 1

        # Create the pool object
        pool = multiprocessing.Pool(processes=cpuNum)

        # Fire off list to worker function.
        # res is a list that is created with whatever the worker function is returning
        res = pool.map(func,idList)
        pool.close()
        pool.join()

        # If an error has occurred report it
        if False in res:
            arcpy.AddError("A worker failed!")
        arcpy.AddMessage("Finished multiprocessing!")
    except arcpy.ExecuteError:
        # Geoprocessor threw an error
        arcpy.AddError(arcpy.GetMessages(2))
    except Exception as e:
        # Capture all other errors
        arcpy.AddError(str(e))
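Stripped of the arcpy calls, the partial plus pool.map pattern used above can be sketched as follows. This is only an illustration, not the tool itself: do_work is a stand-in that pretends negative OIDs fail, the two dataset paths are made up, and worker_count and failed_oids are hypothetical helpers. It returns the OID alongside the success flag so you can tell which polygon failed, and it guards against cpu_count() - 1 coming out as zero on a single-core machine, which Pool would reject.

```python
import multiprocessing
from functools import partial


def do_work(clipper_path, target_path, oid):
    """Stand-in worker: a real one would run the clip for this OID."""
    try:
        if oid < 0:
            # Simulated failure, purely for the demo
            raise ValueError("negative OID")
        return oid, True
    except Exception:
        return oid, False


def worker_count():
    """Use one less than the number of cores, but never fewer than one."""
    return max(1, multiprocessing.cpu_count() - 1)


def failed_oids(results):
    """Return the OIDs whose worker reported failure."""
    return [oid for oid, ok in results if not ok]


if __name__ == "__main__":
    # Hypothetical dataset paths; plain strings pickle fine, layer objects do not
    func = partial(do_work, r"c:\data\polygons.shp", r"c:\data\lines.shp")
    pool = multiprocessing.Pool(processes=worker_count())
    results = pool.map(func, [1, 2, -3])
    pool.close()
    pool.join()
    print("Failed OIDs: {0}".format(failed_oids(results)))
```

The `if __name__ == "__main__":` guard matters on Windows, where each worker process re-imports the main module.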
When I ran the tool from ArcToolbox it immediately bombed out with a 000714 error, far too quickly for it to be a silly syntax error of mine. Again GIS SE came to the rescue: it turned out to be an issue with which version of Python was being used. If you open the Python command line window in ArcMap, import sys and type sys.version:
You will see '2.7.8 (default, Jun 30 2014, 16:03:49) [MSC v.1500 32 bit (Intel)]', which tells you that ArcMap is using the 32 bit version. Running my script from ArcToolbox, however, turned out to launch the 64 bit version, which I assume was installed when I installed the 64 bit background geoprocessing. This was upsetting ArcToolbox.
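To see which interpreter a script is really being launched with, a check along these lines can be dropped into it. This is a minimal sketch using only the standard library; interpreter_info is a hypothetical helper name, and inside a Script Tool you would pass the text to arcpy.AddMessage rather than print it.

```python
import struct
import sys


def interpreter_info():
    """Return (executable path, pointer size in bits, version string)."""
    # A pointer is 4 bytes on 32 bit Python and 8 bytes on 64 bit Python
    bits = struct.calcsize("P") * 8
    return sys.executable, bits, sys.version


if __name__ == "__main__":
    exe, bits, version = interpreter_info()
    print("{0} ({1} bit): {2}".format(exe, bits, version))
```

Running it once from the ArcMap Python window and once from the tool should show whether the two are using different installations.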
So how did I solve this? I went to my .py file in File Explorer, right-clicked it, went to Open with > Choose default programs and browsed to C:\Python27\ArcGIS10.3\pythonw.exe. Once the default application for opening Python files was reset to the 32 bit version, the Script Tool ran without error and I could see all the cores on my machine max out.