
Using env.parallelProcessingFactor in Python

Blog Post created by curtvprice Champion on Apr 7, 2016

I have been attempting to get into Python multiprocessing. Just a warning, it's not for the faint of heart. It's great when it works, but there are many gotchas! Here's some background on that if you want to read up:

If your Python script tool uses parallel processing, you may want it to honor an environment setting recently added to ArcGIS, parallelProcessingFactor:

Parallel Processing Factor (Environment setting)—Help | ArcGIS for Desktop

 

From the help article:

The value of this environment determines the number of processes across which a tool spreads its operation. Those processes will be divided between hardware cores (processors) built into the machine. The number of hardware cores does not change based on this setting.

Here's a function that converts this setting to the number of worker processes to pass as the processes argument of multiprocessing.Pool():

UPDATE: the function now supports 0, which you can use to signal your script that no multiprocessing should take place. I also added better exception handling.

 

def arcpy_cpus(default=None):
    """Determine number of CPUs to use with Python multiprocessing.Pool()
    based on arcpy.env.parallelProcessingFactor.

    Returns 0 to signal that no multiprocessing should take place.
    """
    import multiprocessing
    import arcpy
    ppf = arcpy.env.parallelProcessingFactor
    if ppf in ["", None]:
        # environment not set: use the default, else all cores but one
        if default in ["", None]:
            ppf = multiprocessing.cpu_count() - 1
        else:
            ppf = default
    try:
        if '%' in str(ppf):
            # percentage of the available cores, e.g. "50%"
            pct = float(str(ppf).replace("%", ""))
            cpus = int(multiprocessing.cpu_count() * (pct / 100.0))
        else:
            # explicit process count; 0 means no multiprocessing
            cpus = int(ppf)
        if cpus < 0:
            raise ValueError("negative process count")
    except (TypeError, ValueError):
        raise ValueError("Invalid parallelProcessingFactor \"{}\"".format(ppf))
    return cpus
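
To show how the returned value might be consumed, here is a minimal sketch. The helper name run_tasks is hypothetical (it is not part of arcpy or of the function above), and it assumes that 0 means "run everything in the current process", as described in the update note.

```python
import multiprocessing


def run_tasks(func, items, cpus):
    """Apply func to every item, using a process pool only if cpus > 0.

    Hypothetical helper for illustration; cpus is expected to be the
    value returned by arcpy_cpus().
    """
    if not cpus:
        # 0 signals "no multiprocessing": run serially in this process
        return [func(item) for item in items]
    pool = multiprocessing.Pool(processes=cpus)
    try:
        # func must be picklable, i.e. defined at module level
        return pool.map(func, items)
    finally:
        # always release the worker processes
        pool.close()
        pool.join()
```

In a script tool you would call something like run_tasks(my_worker, my_items, arcpy_cpus()), where my_worker and my_items are your own names. On Windows, the code that ends up creating the Pool needs to sit under an if __name__ == "__main__": guard, which is one of the gotchas mentioned above.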

 

Here are some test runs of this function on a machine where multiprocessing.cpu_count() returns 8, so you can see what it returns with different settings:

 

import arcpy

print("{:8} {:5}  {:5}".format("env", "def=\"\"", "def=2"))
for k in ["0", "1", "", None, "1%", "2%", "50%", "100%", "200%"]:
    arcpy.env.parallelProcessingFactor = k
    print("{!r:8} {:5}  {:5}".format(k, arcpy_cpus(), arcpy_cpus(2)))


env      def=""  def=2
'0'          0      0
'1'          1      1
''           7      2
None         7      2
'1%'         0      0
'2%'         0      0
'50%'        4      4
'100%'       8      8
'200%'      16     16
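
The percentage rows in the table come from integer truncation, which is why '1%' and '2%' both give 0. The sketch below reproduces just that arithmetic without arcpy. The helper percent_to_cpus is hypothetical, and the core count of 8 is an assumption inferred from the '100%' row.

```python
def percent_to_cpus(pct_string, cpu_count):
    """Convert a percentage string such as "50%" to a process count.

    Hypothetical helper; cpu_count is passed in so the result does
    not depend on the machine running the example.
    """
    pct = float(pct_string.replace("%", ""))
    # int() truncates toward zero, so small percentages become 0
    return int(cpu_count * (pct / 100.0))


# assumed 8 cores, matching the '100%' row of the table
for setting in ["1%", "2%", "50%", "100%", "200%"]:
    print("{!r:8} {:5}".format(setting, percent_to_cpus(setting, 8)))
```

If you would rather have any nonzero percentage yield at least one process, wrapping the result in max(1, ...) is one option, but that would change the meaning of 0 as the "no multiprocessing" signal for small percentages.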
