Not sure if ArcPy Parallel Processing Environment is working correctly

03-20-2021 02:24 PM
tigerwoulds
Occasional Contributor III

I have a Python 3 script that takes a set of 3,500 rasters and mosaics them using the Mosaic To New Raster tool.

At the start of my script I set the parallel processing factor environment to 100%:

arcpy.env.parallelProcessingFactor = "100%"

The machine I am running it on has 20 virtual processors and 64 GB of RAM.

I'm only getting 6% CPU utilization and 8 GB of RAM usage. My understanding is that setting parallel processing to 100% would use all the cores. Is there something obvious I'm missing, or something I should change to make the script run faster?
[Attached screenshot: CPU.PNG]


Accepted Solutions
JayantaPoddar
MVP Esteemed Contributor

You could try increasing the parallel processing factor.

Take a smaller sample of rasters (say 100). Test with a parallel processing factor of 150% or 200%, and compare it with 100% in terms of time taken and impact on system performance.
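The comparison suggested above could be scripted roughly as follows. This is only a sketch: `time_factors` and `mosaic_sample` are hypothetical helper names, the folder paths, output name, and band count are placeholders, and whether Mosaic To New Raster actually honors this environment should be checked in the tool's documentation.

```python
import time


def time_factors(run_once, factors):
    """Call run_once(factor) for each factor and return {factor: seconds}."""
    timings = {}
    for factor in factors:
        start = time.perf_counter()
        run_once(factor)
        timings[factor] = time.perf_counter() - start
    return timings


def mosaic_sample(factor):
    """Mosaic a small sample of rasters under the given factor.

    All paths, the output name, and the band count are placeholders.
    """
    import arcpy  # imported lazily so the timing helper works without ArcGIS

    arcpy.env.workspace = r"C:\data\sample_rasters"  # hypothetical folder of ~100 rasters
    arcpy.env.overwriteOutput = True
    arcpy.env.parallelProcessingFactor = factor
    arcpy.management.MosaicToNewRaster(
        input_rasters=arcpy.ListRasters(),
        output_location=r"C:\data\out",  # hypothetical output folder
        raster_dataset_name_with_extension="sample_mosaic.tif",
        number_of_bands=1,  # assumption: single-band rasters
    )


# On a machine with ArcGIS Pro installed:
# for factor, seconds in time_factors(mosaic_sample, ["100%", "150%", "200%"]).items():
#     print(f"{factor}: {seconds:.1f} s")
```

Watching CPU and disk activity in Task Manager during each run shows the "impact on system performance" side of the comparison.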

From the Parallel Processing Factor documentation:

Specifying more processes than your machine has cores may incur a performance penalty. This is because multiple processes will compete for resources on one core. To specify the environment in a way that avoids this competition, you can use either a percent value less than 100% or a number of processes less than the number of cores on your machine.
However, for cases in which all your processes are I/O bound to a disk or to an enterprise database connection, you may get better performance by specifying more processes than you have cores. For example, the Add Rasters to Mosaic Dataset tool is I/O bound when the mosaic dataset is stored in an enterprise database. Also, the Build Overviews tool is primarily I/O bound to the disk. You can use more processes than your machine has cores by specifying either a percent value greater than 100% or a number of processes greater than the number of cores on your machine. For example, if you have a 4-core machine, specifying 8 or 200% will spread operations over 8 processes.
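To make the arithmetic in that passage concrete, here is a small pure-Python sketch of how a factor value maps to a process count. `planned_processes` is a hypothetical helper, not part of ArcPy, and rounding fractional results up is an assumption.

```python
import math
import os


def planned_processes(factor, cores=None):
    """Return the process count implied by a factor such as "200%" or "8".

    A percent value is taken relative to the number of cores; a plain
    number is taken as the process count itself.
    """
    if cores is None:
        cores = os.cpu_count() or 1
    text = str(factor).strip()
    if text.endswith("%"):
        percent = float(text[:-1])
        # Assumption: a fractional result is rounded up to a whole process.
        return math.ceil(cores * percent / 100)
    return int(text)


print(planned_processes("200%", cores=4))   # 8, the 4-core example above
print(planned_processes("8", cores=4))      # 8
print(planned_processes("100%", cores=20))  # 20, the machine in the question
```

This only shows how many processes a setting asks for. A tool that does not support the environment, or that is waiting on disk, can still leave most cores idle.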



Think Location
