I am trying to create a script tool in ArcGIS Pro 2.3, which calls existing python modules where multiprocessing is taking place.
When the parallel processes are started, instead of new python processes, several new instances of ArcGISPro are opened.
A very simplified script tool that illustrates this:
import multiprocessing, arcpy, sys
from multiprocessing import Pool

text = arcpy.GetParameterAsText(0)

def testFunc(param):
    return "Processed: " + param

if __name__ == "__main__":
    cpuNum = multiprocessing.cpu_count() - 1
    pool = Pool(processes=cpuNum)
    # pool.map iterates over the string, so each character is one task
    newText = pool.map(testFunc, text)
    pool.close()  # the parentheses are required, otherwise nothing is called
    pool.join()
    print("Output :" + str(newText))
When running the same code from the command prompt, it works fine.
How can this code be executed successfully from a script tool in ArcGIS Pro?
Multiprocessing is probably trying to use ArcGISPro.exe instead of python.exe to run the child processes. Either disable running the script 'in process' or use multiprocessing.set_executable(os.path.join(sys.exec_prefix, 'python.exe'))
Note: don't use sys.executable to get the path to python.exe if running in-process, as it will be pointing to ArcGISPro.exe. Use sys.exec_prefix instead:
>>> import os, sys
>>> print(sys.executable)
C:\Program Files\ArcGIS\Pro\bin\ArcGISPro.exe
>>> print(os.path.join(sys.exec_prefix, 'python.exe'))
C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\python.exe
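For anyone who wants to wrap the suggestion above in something reusable, here is a minimal sketch. The helper names resolve_python and use_real_interpreter are made up for illustration, and the file-existence check is my own assumption about what is sensible; only multiprocessing.set_executable and sys.exec_prefix come from the answer itself.

```python
import multiprocessing
import os
import sys


def resolve_python(exec_prefix=None, windowed=False):
    """Build the interpreter path under exec_prefix (hypothetical helper).

    Uses sys.exec_prefix rather than sys.executable, because in-process
    sys.executable points at ArcGISPro.exe.
    """
    prefix = exec_prefix if exec_prefix is not None else sys.exec_prefix
    name = "pythonw.exe" if windowed else "python.exe"
    return os.path.join(prefix, name)


def use_real_interpreter(exec_prefix=None, windowed=False):
    """Point multiprocessing at the interpreter if the file exists.

    Returns True when set_executable was called, False otherwise.
    """
    candidate = resolve_python(exec_prefix, windowed)
    if os.path.isfile(candidate):
        multiprocessing.set_executable(candidate)
        return True
    return False
```

Calling use_real_interpreter() once near the top of the script tool, before any Pool is created, should be enough. If it returns False, the path did not exist and the multiprocessing default is left untouched.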
This has never worked for me in ArcMap either. To get multiprocessing to work I have needed to run a standalone script (running out-of-process [script tool setting] would work too, if you are careful not to use sys.executable, as you suggest).
The option to disable running the script "in process" was an option in ArcMap, I have not seen this in ArcGIS Pro.
I have also tried to use multiprocessing.set_executable(os.path.join(sys.exec_prefix, 'python.exe'))
The code executed without errors, and produced the expected results.
BUT, I ended up having several python windows open, and they did not disappear until I closed ArcGIS Pro.
My solution:
As I mentioned earlier, the multiprocessing in my case takes place in existing python modules.
I ended up calling these from the code behind my script tool using subprocess:
import os, subprocess, sys
cmd = '"' + os.path.join(sys.exec_prefix, 'python.exe') + r'" "C:\temp\ScriptUsingMultiprocessing.py"'
completed = subprocess.run(cmd, shell=True)
This starts the code with multiprocessing, and waits until it is completed before moving on.
Works perfectly!
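A slightly more defensive variant of the subprocess approach could look like the sketch below. This is not the poster's code: the function name run_worker_script is hypothetical, the command is passed as a list so no manual quoting is needed, and the CREATE_NO_WINDOW flag is an assumption on my part (it only exists on Windows with Python 3.7 or later, so the code falls back to 0 elsewhere). I have not verified how it behaves inside ArcGIS Pro.

```python
import subprocess
import sys


def run_worker_script(python_exe, script_path, *script_args):
    """Run a standalone script and return its stdout (hypothetical helper).

    Raises RuntimeError if the script exits with a non-zero return code.
    """
    command = [python_exe, script_path] + list(script_args)
    # CREATE_NO_WINDOW is Windows-only (Python 3.7+); 0 means "no flags".
    flags = getattr(subprocess, "CREATE_NO_WINDOW", 0)
    completed = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        creationflags=flags,
    )
    if completed.returncode != 0:
        raise RuntimeError(
            "Worker failed with code %d: %s"
            % (completed.returncode, completed.stderr)
        )
    return completed.stdout
```

In a script tool, python_exe would be os.path.join(sys.exec_prefix, 'python.exe') and script_path the module that does the multiprocessing. Capturing stderr means a failure in the worker shows up as an exception in the tool instead of passing silently.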
I'd love to get some more chatter on this topic...I have been having some great success with getting multiprocessing to work within ArcGIS Pro script tools, but I am still having trouble with Python running in the background, i.e. running without opening a console window.
I implemented the following line of code in my script, just as Kjetil Trengereid suggested, and it works well, except for opening the Python consoles for each pooled process, which I prefer to be hidden.
multiprocessing.set_executable(os.path.join(sys.exec_prefix, 'python.exe'))
I learned that 'pythonw.exe' is meant for GUI python and will not open Python consoles:
multiprocessing.set_executable(os.path.join(sys.exec_prefix, 'pythonw.exe'))
I got this to work very well initially, but now out of the blue, it has started to lock up and stop processing on the first of many multiprocessing functions with no known reason that I can find. It has been stopping during a multiprocess function that has always worked otherwise with python.exe.
In checking the Task Manager, there are some differences in how the multiprocessing programs run on the different executables. When using python.exe, the parallel processes open the Python consoles separately from ArcGIS Pro as completely separate tasks, and when all processes are done, all console windows close. If a new multiprocess function is run later in the code, all new console windows are spawned.
With pythonw.exe, instances show up in Task Manager under the ArcGIS Pro app (you have to click the drop down to see them). What has been happening when things lock up is the CPU column in Task Manager for each instance drops to 0%, the memory shows a value, but goes stagnant. All other columns drop to zero also. Then, the script tool just hangs in ArcGIS Pro and will not progress; the hanging has persisted for hours in one case (overnight...).
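This does not explain the hang, but one way to keep a stalled pool from blocking the tool overnight is to put a timeout on the results. A minimal sketch, assuming the work is submitted with map_async; the name map_with_timeout and the pool_factory parameter are hypothetical and only there so the same function can be tried with a thread pool.

```python
import multiprocessing
import multiprocessing.pool


def map_with_timeout(func, items, timeout_seconds,
                     pool_factory=multiprocessing.Pool, processes=2):
    """Map func over items, giving up after timeout_seconds.

    Raises multiprocessing.TimeoutError if the results are not ready in
    time, after terminating the pool so no workers are left behind.
    """
    pool = pool_factory(processes=processes)
    try:
        results = pool.map_async(func, items).get(timeout=timeout_seconds)
    except Exception:
        # Covers both a timeout and an error raised inside a worker.
        pool.terminate()
        pool.join()
        raise
    pool.close()
    pool.join()
    return results
```

With this in place a stall surfaces as a multiprocessing.TimeoutError in the tool messages, which at least tells you which multiprocessing step stopped responding.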
I've tried rebooting ArcGIS Pro and my computer to no avail. I also tried using one less CPU than is available and that didn't seem to change anything either.
Is this a memory issue? Task Manager doesn't seem to report anything out of the ordinary and other programs will still run. I am working with lidar LAS files; they are not super big by themselves (125 MB or less) but I am doing 40+ files for each multiprocessing function.
Is it an ArcGIS Pro issue? Or a Python code issue? I can share code if needed, but I will need some time to create generic examples of the code, as the code is proprietary (sorry!!).
I also still have some more experimenting to do. For example, it is a very long, robust script that I could break up into more manageable scripts if needed; not preferable, but doable. I have also not yet tried including if __name__ == '__main__': in my scripts and adding separate functions for the multiprocessing parts, which has also been suggested. I have created a module for the multiprocessing functions. See Parallel Processing with Python Toolboxes in ArcGIS Pro and arcgis 10.3 - Can multiprocessing with arcpy be run in a script tool? - Geographic Information Syste... for what I mean here.
Any suggestions will be greatly appreciated. Thanks.
A quick follow-up to my previous post... Now pythonw.exe is not hanging anymore and everything is working as expected; not sure what the deal was. I did reorder a few processes in the python script, and I had a Windows update and a few other updates come through since my previous post, so maybe that fixed the issue. I also cleaned up my geoprocessing history in the ArcGIS Pro Project file. Not sure what did it, but the issue described previously is no longer present.
I am having this same issue, but the fix that Kjetil Sverdrup-Thygeson has provided is not working for me. The Python windows do not close at all.
Which version of Pro?
I'm interested in this as well. I work a lot with spatial dataframes in memory, and it would be nice to be able to break a dataframe up into chunks, process each chunk with its own process, then concatenate the results back into a single dataframe and move on to the next steps.
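The split, process, recombine idea could be sketched roughly as below. This uses plain lists of rows rather than a spatial dataframe, so it only shows the shape of the approach; with pandas the recombine step would presumably be pandas.concat instead of flattening lists. The names split_into_chunks, process_in_chunks and pool_factory are hypothetical.

```python
import multiprocessing


def split_into_chunks(rows, chunk_count):
    """Split rows into chunk_count contiguous, nearly equal chunks."""
    chunk_count = max(1, min(chunk_count, len(rows)))
    size, extra = divmod(len(rows), chunk_count)
    chunks, start = [], 0
    for index in range(chunk_count):
        # The first `extra` chunks take one additional row each.
        end = start + size + (1 if index < extra else 0)
        chunks.append(rows[start:end])
        start = end
    return chunks


def process_in_chunks(rows, chunk_func, chunk_count,
                      pool_factory=multiprocessing.Pool):
    """Apply chunk_func to each chunk in a pool and flatten the results.

    chunk_func receives one chunk (a list) and must return a list.
    """
    chunks = split_into_chunks(rows, chunk_count)
    pool = pool_factory(processes=len(chunks))
    try:
        processed = pool.map(chunk_func, chunks)
    finally:
        pool.close()
        pool.join()
    return [row for chunk in processed for row in chunk]
```

With the default multiprocessing.Pool, chunk_func has to be a module-level function so it can be pickled, and inside a Pro script tool the interpreter issue discussed earlier in this thread still applies.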