I have a Python 3 script that grabs a set of 3,500 rasters and mosaics them using the Mosaic To New Raster tool.
At the start of my script I set the parallel processing factor environment setting to 100%:
arcpy.env.parallelProcessingFactor = "100%"
The machine I am running it on has 20 virtual processors and 64 GB of RAM.
I'm only getting 6% CPU utilization and only 8 GB of RAM usage. My understanding is that using parallel processing at 100% would use all the cores. Is there something obvious I'm missing or need to change to make the script run faster?
You could try increasing the parallel processing factor.
Take a smaller sample of rasters (say 100). Test with a parallel processing factor of 150% or 200%, and compare the results against 100%, looking at both the time taken and the impact on system performance.
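A rough way to run that comparison is a small timing harness like the one below. It is only a sketch: SAMPLE_RASTERS, OUT_FOLDER, and the output names are hypothetical placeholders, the single-band setting is an assumption about your data, and mosaic_sample needs arcpy from an ArcGIS Pro Python environment.

```python
import time

# Hypothetical placeholders: fill in ~100 of your raster paths and a scratch folder.
SAMPLE_RASTERS = []
OUT_FOLDER = r"C:\temp\mosaic_test"


def time_factors(run_once, factors=("100%", "150%", "200%")):
    """Call run_once(factor) for each factor and return {factor: seconds}."""
    timings = {}
    for factor in factors:
        start = time.perf_counter()
        run_once(factor)
        timings[factor] = time.perf_counter() - start
    return timings


def mosaic_sample(factor):
    """Mosaic the sample rasters once with the given parallel processing factor."""
    # Imported here so the timing harness itself loads without ArcGIS installed.
    import arcpy

    arcpy.env.parallelProcessingFactor = factor
    arcpy.env.overwriteOutput = True
    # Hypothetical output name, one file per factor tested.
    out_name = "sample_{}.tif".format(factor.rstrip("%"))
    arcpy.management.MosaicToNewRaster(
        input_rasters=SAMPLE_RASTERS,
        output_location=OUT_FOLDER,
        raster_dataset_name_with_extension=out_name,
        number_of_bands=1,  # assumption: single-band inputs
    )
```

Calling time_factors(mosaic_sample) then gives one elapsed time per factor. Watching CPU and disk activity in Task Manager during each run shows whether the extra processes are actually being used.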
Specifying more processes than your machine has cores may incur a performance penalty. This is because multiple processes will compete for resources on one core. To specify the environment in a way that avoids this competition, you can use either a percent value less than 100% or a number of processes less than the number of cores on your machine.
However, for cases in which all your processes are I/O bound to a disk or to an enterprise database connection, you may get better performance by specifying more processes than you have cores. For example, the Add Rasters to Mosaic Dataset tool is I/O bound when the mosaic dataset is stored in an enterprise database. Also, the Build Overviews tool is primarily I/O bound to the disk. You can use more processes than your machine has cores by specifying either a percent value greater than 100% or a number of processes greater than the number of cores on your machine. For example, if you have a 4-core machine, specifying 8 or 200% will spread operations over 8 processes.
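To make the arithmetic concrete, here is a small illustrative helper that turns a factor string into a process count. It is not part of arcpy and does not claim to mirror how ArcGIS rounds internally; process_count is a hypothetical name, and rounding down with a minimum of one process is an assumption.

```python
def process_count(factor, cores):
    """Return the number of processes implied by a factor string.

    "200%" means twice the core count; a bare number such as "8" is used as is.
    """
    text = str(factor).strip()
    if text.endswith("%"):
        percent = float(text[:-1])
        # Assumption: fractional results round down, but never below 1.
        return max(1, int(cores * percent / 100))
    return int(text)


print(process_count("200%", 4))   # the 4-core example from the text
print(process_count("8", 4))      # same result, given as an explicit count
print(process_count("150%", 20))  # the 20-processor machine in the question
```

On the 20-processor machine in the question, 150% would work out to 30 processes and 200% to 40 under these assumptions.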