
Using multiprocessing in a GP service

3 weeks ago
ModyBuchbinder
Esri Regular Contributor

I would like to use Python's multiprocessing module to take advantage of more cores in a process that I can split into independent parts.

I created a small script. Running it in Pro works (although it opens a CMD window), and I could publish it to the server.

It does not work on the server and gives no error message.

For a second test, I wrote a Python script that uses multiprocessing, then created a short bat file that calls this script.

My main script in Pro then just calls os.system(bat file).

Again, it runs in Pro and publishes with no comments, but it does not run on the server.
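
Since os.system only returns an exit status and discards whatever the child prints, one way to get at the missing error text might be to launch the bat file with subprocess and capture its output. The sketch below is only an illustration: run_and_capture is a hypothetical helper name, and the command used in the demo is a stand-in for the real bat file. Inside a GP tool, the captured text could then be passed to arcpy.AddMessage or arcpy.AddError so it shows up in the job messages.

```python
import subprocess
import sys


def run_and_capture(command):
    """Run a command and return (exit code, stdout, stderr) as text."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode, completed.stdout, completed.stderr


if __name__ == "__main__":
    # Stand-in for the bat file: a child Python process that fails on purpose.
    code, out, err = run_and_capture(
        [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(2)"]
    )
    print("exit code:", code)
    print("stderr:", err)
```

A non-zero exit code or any text on stderr would at least show whether the child process started at all on the server.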

Found this from a few years back, with no answer: https://community.esri.com/t5/python-questions/usage-of-multiprocessing-in-a-python-based/m-p/50820#...

Does anybody have an idea?

Thanks

4 Replies
JoshuaBixby
MVP Esteemed Contributor

What Message Level do you have the GP service set to?  If it is set to Info, the client should get all of the messages and errors back from the job.  

JoshuaBixby
MVP Esteemed Contributor

As someone who has designed, configured, and managed many ArcGIS Enterprise deployments supporting upwards of tens of thousands of GIS services, I can say that using multiprocessing in a geoprocessing service comes with enormous risks to the stability of the service and the application as a whole. Depending on how the multiprocessing jobs are set up and how the GP service is configured, it would be very easy for a single GP service to overwhelm all of the resources on an ArcGIS Enterprise deployment. Maybe this has already been thought through and discussed with the OS admins, but I mention it here as a general warning to others who might come across this thread.

ModyBuchbinder
Esri Regular Contributor

Hi all

My message level is set to Info and I get NO messages.

I am aware of the resources problem. If you use multiprocessing wisely you can limit the number of parallel processes.
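
As a rough illustration of limiting the number of parallel processes, a minimal sketch with the standard multiprocessing.Pool might look like the following. The names capped_worker_count, process_part, and run_parts are hypothetical, the reserve of one core is an arbitrary assumption, and process_part is only a placeholder for the real independent unit of work.

```python
import multiprocessing as mp
import os


def capped_worker_count(requested, reserved_cores=1):
    """Return a worker count that never exceeds the cores left after a reserve."""
    available = max(1, (os.cpu_count() or 1) - reserved_cores)
    return max(1, min(requested, available))


def process_part(part_id):
    # Placeholder for the real, independent unit of work.
    return part_id * part_id


def run_parts(part_ids, requested_workers=4):
    workers = capped_worker_count(requested_workers)
    # Pool(processes=N) caps how many worker processes exist at once.
    with mp.Pool(processes=workers) as pool:
        return pool.map(process_part, part_ids)


if __name__ == "__main__":
    # The guard matters on Windows, where child processes re-import this module.
    print(run_parts(range(8)))
```

Note that the cap applies per running instance of the tool, so the total load on the server also depends on how many instances of the GP service are allowed to run.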

If you have a server with many cores, and each GP service does a lot of work but is not called very often, you should be able to get better performance.

Currently the tool just ends without calling the function I need to run in parallel.

Again, it works in Pro, so my code is working. Something in the server environment is stopping it.

Thanks

 

ModyBuchbinder
Esri Regular Contributor

I found a way to do it, but it is still in testing.

First, you create and publish a separate script (the "child") that does the part that should be executed in parallel.

It takes parameters just like any GP service.

Then you write a tool that runs the GP service (use this: https://enterprise.arcgis.com/en/server/10.8/publish-services/windows/using-a-service-in-python-scri...  ).

In such a script you can control the number of parallel instances, give each one a different input value, and then check that all instances have finished before continuing with the father script, which aggregates or appends the results.

The child script should return its own results to the caller somehow (a URL to the results, or a string).
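
The control flow described above could be sketched roughly as follows. This is not the actual tool: run_child_job is a hypothetical stand-in for submitting one job to the published child service and waiting for its result, and it only sleeps here so that the example runs anywhere. Threads are used on the assumption that the father script mostly waits on the server while the child service does the heavy work.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time


def run_child_job(value):
    """Hypothetical stand-in for one call to the child GP service.

    The real version would submit the job, wait for it to finish,
    and return a URL to the results or a string.
    """
    time.sleep(0.1)  # simulates waiting on the remote job
    return f"result-for-{value}"


def run_children(values, max_parallel=3):
    results = {}
    # max_workers caps how many child jobs are in flight at the same time.
    with ThreadPoolExecutor(max_workers=max_parallel) as executor:
        futures = {executor.submit(run_child_job, v): v for v in values}
        for future in as_completed(futures):
            # result() re-raises any exception raised by the child call.
            results[futures[future]] = future.result()
    # Leaving the with-block means every child call has finished.
    return [results[v] for v in values]


if __name__ == "__main__":
    # The father script would aggregate or append these results.
    print(run_children(["a", "b", "c", "d"]))
```

Keeping max_parallel at or below the maximum number of instances configured for the child service would avoid queuing jobs that cannot start.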

Once the father script was running in Pro, I had no problem publishing it as a GP service (the "father" service).

When running this father service, I could see the number of running processes for the "child" service go up as the father script ran a few instances of the child in parallel.
