Train deep learning model on multiple GPUs with the Python API

11-19-2022 03:03 AM
JadedEarth
Occasional Contributor

I'm using ArcGIS Pro v2.9.2 with an Image Analyst license.  I've been trying to train a MaskRCNN model using multiple GPUs on a single machine, but I can't find any sample code.  Most examples are for distributed training (multiple machines with multiple GPUs).

Could someone show me code for training a model on a single machine with multiple GPUs from the Python window in ArcGIS Pro?  I have my own training data, generated with Image Analyst's Export Training Data tool.
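
For reference, a minimal sketch of driving the tool from the Python window might look like the following. It assumes `arcpy.ia.TrainDeepLearningModel` with its `max_epochs`, `model_type` and `batch_size` parameters, plus the `processorType` and `gpuId` environment settings that appear in the script excerpt later in this thread. The function name, default values and folder paths are hypothetical. It only shows where the GPU settings go; it does not by itself make training use more than one GPU.

```python
def train_maskrcnn(in_folder, out_folder, gpu_id=None, epochs=20, batch_size=4):
    """Run Train Deep Learning Model for MaskRCNN with explicit GPU settings.

    Hypothetical helper: the name, defaults and paths are illustrative only.
    """
    import arcpy  # imported here so the file can be loaded outside ArcGIS Pro

    arcpy.CheckOutExtension("ImageAnalyst")
    try:
        arcpy.env.processorType = "GPU"
        # Leaving gpuId unset lets the tool choose a device; according to
        # this thread that may still end up on a single GPU.
        if gpu_id is not None:
            arcpy.env.gpuId = gpu_id
        arcpy.ia.TrainDeepLearningModel(
            in_folder,
            out_folder,
            max_epochs=epochs,
            model_type="MASKRCNN",
            batch_size=batch_size,
        )
    finally:
        # Always return the license, even if training raises an error.
        arcpy.CheckInExtension("ImageAnalyst")


# Example call (placeholder paths):
# train_maskrcnn(r"C:\data\chips", r"C:\data\model", gpu_id=4)
```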

Appreciate any help.


Accepted Solutions
JadedEarth
Occasional Contributor

Finally got through to ESRI Tech Support, after some twists and turns.  It turns out this might be a bug in the system.  They've placed it on their to-do list but couldn't tell me whether a fix will be included in the next patch.  The issue exists in versions 2.9, 2.9.5, and 3.1.

Just so you know.

7 Replies
DanPatterson
MVP Esteemed Contributor

I will move this thread to Imagery and Remote Sensing Questions, since your previous, closed thread went unresolved.  Perhaps someone here from the Imagery team will have a solution.


... sort of retired...
JadedEarth
Occasional Contributor

I can't find where this thread went in Imagery and Remote Sensing Questions.  However, I have a follow-up regarding the use of multiple GPUs on a single machine.

I'm using ArcGIS Pro version 3.1.4 with 7 GPUs.  I have an ArcGIS Pro Online, named-instance license and an Image Analyst license.

The "Train Deep Learning Model" Image Analyst tool (aka the tool.script.execute.py script) has this code at the start:

#--------------------------------------------------------------------------------------------------------------------------------

if arcpy.env.processorType == "GPU" and torch.cuda.is_available() and arcpy.env.gpuId:
    # use specific gpu if gpuId is specified, use all available gpus if no gpuID is specified
    pass  # body of this branch is not included in the excerpt

elif not arcpy.env.processorType:
    # use all available gpus if processor type is not specified(default), gpuID is ignored in this case
    arcgis.env._processorType = "GPU"

else:
    arcgis.env._processorType = arcpy.env.processorType

#--------------------------------------------------------------------------------------------------------------------------------
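
To make the branching in that excerpt easier to follow, here is a small dependency-free restatement of its three conditions. This is only a sketch: the function name and the branch labels are hypothetical, it assumes a blank Processor or GPU ID field reaches the script as an empty value, and it says nothing about what the rest of the script does afterwards.

```python
def select_branch(processor_type, gpu_id, cuda_available):
    """Return which branch of the quoted excerpt would run.

    The labels are hypothetical names for the three branches.
    """
    if processor_type == "GPU" and cuda_available and gpu_id:
        return "specific_gpu"   # first branch: GPU requested and an ID given
    elif not processor_type:
        return "default_gpu"    # second branch: processor type left blank
    else:
        return "passthrough"    # third branch: processor type copied as-is


# The five trial settings from this post as (Processor, GPU ID) pairs,
# assuming CUDA is available on the machine.
trials = {
    1: ("", ""),
    2: ("GPU", ""),
    3: ("GPU", "4"),
    4: ("CPU", ""),
    5: ("CPU", "4"),
}
for number, (processor, gpu_id) in trials.items():
    print(number, select_branch(processor, gpu_id, cuda_available=True))
```

Read this way, Processor = GPU with a blank GPU ID does not enter the first branch at all; it falls through to the final `else`, so the "use all available gpus" comment in the first branch would not apply to that combination. That is only what the excerpt implies, since the rest of the script is not shown here.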

The intent seems to be that if I set Processor to "GPU" and leave GPU ID blank, all the available GPUs in my machine are used.  Here are the results of my trial settings for CPU/GPU on my machine:

Machine 1:  2 CPUs with 28 cores each (56 logical processors); 7 GPUs (1 RTX A6000, ID=4; 6 RTX A4000, IDs=0,1,2,3,5,6)

Settings 1:
    Processor = Blank
    GPU ID = Blank
    Actual CPUs Used = 100% logical processors running (56 cores)
    Actual GPU Used = 1 GPU Used (RTX A6000)

Settings 2:
    Processor = GPU
    GPU ID = Blank
    Actual CPUs Used = 100% logical processors running (56 cores)
    Actual GPU Used = 1 GPU Used (RTX A6000)

Settings 3:
    Processor = GPU
    GPU ID = 4
    Actual CPUs Used = 100% logical processors running (56 cores)
    Actual GPU Used = 1 GPU Used, ID=1 (RTX A4000; not the specified ID)

Settings 4:
    Processor = CPU
    GPU ID = Blank
    Actual CPUs Used = 100% logical processors running (56 cores)
    Actual GPU Used = None

Settings 5:
    Processor = CPU
    GPU ID = 4
    Actual CPUs Used = 100% logical processors running (56 cores)
    Actual GPU Used = None
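
One way to cross-check results like these is to ask PyTorch which devices it sees and in what order. The sketch below is an assumption-laden aid, not a fix: the helper names are hypothetical, and it relies on the standard CUDA environment variables `CUDA_DEVICE_ORDER` and `CUDA_VISIBLE_DEVICES`, which only take effect if they are set before CUDA is initialized in the process. By default CUDA enumerates devices fastest-first, which can differ from the PCI bus order reported by nvidia-smi, so an ID taken from one listing may point at a different card in the other. Whether that explains the mismatch in Settings 3 is not confirmed.

```python
import os


def pin_visible_gpus(gpu_ids):
    """Restrict CUDA to the given GPU IDs, numbered in PCI bus order.

    Must be called before anything initializes CUDA in this process.
    """
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return os.environ["CUDA_VISIBLE_DEVICES"]


def list_cuda_devices():
    """Return (index, name) pairs for the GPUs PyTorch can see.

    Returns an empty list if PyTorch is missing or CUDA is unavailable.
    """
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    return [
        (index, torch.cuda.get_device_name(index))
        for index in range(torch.cuda.device_count())
    ]


# Example: expose only the card that nvidia-smi lists as ID 4, then print
# what PyTorch sees. The pinned card shows up as index 0.
# pin_visible_gpus([4])
# for index, name in list_cuda_devices():
#     print(index, name)
```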

Can someone at ESRI check why this is?  Is this just a bug, or is it the intended behavior for ArcGIS Pro Online licenses?

Appreciate any help.

JadedEarth
Occasional Contributor

Is anybody there?  How can I get any technical help?  I feel like I'm talking to myself here.  Do I have to wait a year before someone from ESRI replies?  And why is there only a reply button instead of a "Post" button?

DanPatterson
MVP Esteemed Contributor

Contact Technical Support.  Esri staff doesn't normally follow Community to answer technical support issues.


JadedEarth
Occasional Contributor

They just don't make it easy.  I submitted a request, which means it goes to HQ in DC and hopefully I get a response after a week or two.  Sigh...

PavanYadav
Esri Contributor

Please see my response at https://community.esri.com/t5/arcgis-pro-questions/train-deep-learning-model-using-multiple-gpus-on/...

Cheers!

Pavan Yadav | Product Engineer - Imagery and AI
Esri | 380 New York | Redlands, 92373 | USA

https://www.linkedin.com/in/pavan-yadav-1846606/ 
