Usage of multiprocessing in a Python-based geoprocessing service

WhereMatters · 11-28-2017

Hi,

I have used multiprocessing in the past for some intensive geoprocessing and was able to speed it up by using all cores. It worked quite well. These were all script tools used in Desktop. My question is whether multiprocessing can be used in a geoprocessing service, and whether that would actually be sensible.

I am looking at a script that performs a series of geoprocessing steps over separate point locations, so the problem is embarrassingly parallel. Processing takes 3 to 4 minutes per location. However, it also needs to be available as a GP service.
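To make the setup concrete, here is a minimal sketch of the pattern I have in mind. It is not my actual tool: `process_location`, `worker_count`, and `run_job` are placeholder names, the coordinates are made up, and the body of `process_location` only stands in for the real geoprocessing steps.

```python
import multiprocessing as mp


def process_location(point):
    """Hypothetical stand-in for the 3 to 4 minute chain of geoprocessing steps."""
    x, y = point
    # The real tool would run the geoprocessing steps for this location here.
    return {"x": x, "y": y, "status": "done"}


def worker_count(total_cores, fraction):
    """Number of worker processes for a given share of the cores, at least 1."""
    return max(1, int(total_cores * fraction))


def run_job(points, fraction=1.0):
    """Process all locations of one job in parallel, one location per task."""
    workers = worker_count(mp.cpu_count(), fraction)
    pool = mp.Pool(processes=workers)
    try:
        # map() returns the results in the same order as the input points.
        return pool.map(process_location, points)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    sample_points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]  # made-up coordinates
    for result in run_job(sample_points, fraction=0.5):
        print(result)
```

The `fraction` argument is just one way to cap how many cores a single job may take; whether a pool like this behaves well inside a GP service is exactly what I am unsure about.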

So let's say a user submits a job and the server uses all cores to process it. Does that mean that if another user submits a job, they have to wait until the previous job completes and frees up the cores? The second user would then have to wait a long time to get results back.

I could set it up to use only half the cores. Does that mean it would then support two concurrent users, with a third having to wait until one of the first two jobs completes and frees up some cores?

If you don't use multiprocessing, does that mean you can support more concurrent users, but each job will take a very long time to complete since the geoprocessing steps run in series?
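The trade-off I am describing can be put into rough numbers. This is only a back-of-envelope sketch under my own assumptions: perfect scaling, one core per worker process, no overhead from the server itself, and made-up values for the core count and the number of locations per job. The `estimate` function is a hypothetical helper, not part of any library.

```python
import math


def estimate(total_cores, workers_per_job, locations_per_job, minutes_per_location):
    """Rough estimate assuming perfect scaling and one core per worker."""
    # How many jobs fit on the machine at once if each takes workers_per_job cores.
    concurrent_jobs = total_cores // workers_per_job
    # Each worker handles its share of the locations one after another.
    rounds = math.ceil(locations_per_job / workers_per_job)
    job_minutes = rounds * minutes_per_location
    return concurrent_jobs, job_minutes


if __name__ == "__main__":
    # Made-up numbers: 8 cores, 16 locations per job, 3.5 minutes per location.
    for workers in (8, 4, 1):
        jobs, minutes = estimate(8, workers, 16, 3.5)
        print(workers, "workers per job:", jobs, "concurrent jobs,",
              minutes, "minutes per job")
```

Under these assumptions the total throughput stays the same in every case; only the split between the number of concurrent users and the wait per user changes.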

It would be great to hear some thoughts and experiences on this from you guys!

Any other options to improve performance and scale, other than adding servers to the cluster?

Thanks.