Detect Objects Using Deep Learning Error with new RTX 3060

Hulseyj · ‎04-21-2021

An error appears when I run "Detect Objects Using Deep Learning" in ArcGIS Pro with a new RTX 3060 GPU. The model used is usa_building_footprints.dlpk. I am using ArcGIS Pro 2.7.3. I have no issues running the model with my GTX 1650 Super GPU on the same computer with the same settings. ArcGIS Pro does appear to engage the GPU, since the application is shown running on it.

However, the program shows as "Running" but never progresses past 0%.

After about 30 minutes of "running", the following error message is returned:

I used the Input Parameters and Environment below. Again, it runs fine with the GTX 1650 Super GPU. All drivers are up to date on the new RTX3060.

Thank you for help or troubleshooting ideas.

Samaloysius · ‎04-26-2021

Hi everyone,

Thank you for your comments. Due to potential breaking changes, the versions have not been updated for ArcGIS Pro 2.8. We are looking into upgrading the versions of CUDA, PyTorch, and TensorFlow for the next release.

NathanPamperin1 · ‎04-27-2021

So that would be ~November for Pro v2.9?

Samaloysius · ‎04-27-2021

That is the timeframe for Pro v2.9, but we are looking into ways to possibly provide support for this earlier. Apologies for the inconvenience.

NathanPamperin1 · ‎04-27-2021

Thanks for the info and I appreciate the efforts to try to come up with a solution ahead of v2.9!

_Cartographer_ · ‎05-04-2021

Any possible support before v2.9 would be seriously appreciated. That's a long time to wait -- and many more users will encounter this problem in the meantime. I have already volunteered to be an early adopter to help test if needed -- at least then I could use my GPU as originally intended. I'm sure a number of us would be happy to alpha/beta test if it meant getting this issue moving forward.

NathanPamperin1 · ‎05-04-2021

I’d be happy to test as well!

Samaloysius · ‎05-05-2021

Thank you Nathan!

Samaloysius · ‎05-05-2021

Thank you. We are working on it.

Anonymous User · ‎05-06-2021

Hey, so I was referred here from GitHub. Same story for me with an RTX 3080. Using the Image Analyst DL toolset in ArcGIS Pro usually means waiting 30 minutes for mysterious CUDA errors or false successes. Some of the issues I have encountered may not be directly related to the GPU, but the "CUDA Assert" and "NaN" errors (resulting from training epochs) are big problems.

Just for the record, issues I've run into with my RTX 3080:
- Many "CUDA Assert" errors throughout my attempts to use tools in ArcGIS Pro.
- Create image chips tool refusing to process RCNN files for no apparent reason; PASCALs worked.
- False success with DL tools that claim to have been successful, but nothing whatsoever results from running the final model, even just to do a basic test for any result.
- Certain elements in ArcGIS Notebook not working at all, like show batch samples, which work in external Jupyter Notebook (though this could be unrelated to GPU).
- In external Jupyter Notebook/Training Model: Optimal Learning Rate graph feels buggy.
- In external Jupyter Notebook/Training Model: Training the model in any instance results in "NaN" values per epoch statistics, which is probably linked to previous point where ArcGIS DL Tools were "successful" but nothing meaningful was ever produced.
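For anyone trying to pin down where the "NaN" epoch statistics start, a minimal sketch like the one below can flag the first bad epoch from a list of recorded losses. The function name and the loss values are hypothetical, made up for illustration; they are not output from arcgis.learn or from any run in this thread.

```python
import math


def first_nan_epoch(losses):
    """Return the 0-based index of the first NaN loss, or None if there is none."""
    for epoch, loss in enumerate(losses):
        if math.isnan(loss):
            return epoch
    return None


if __name__ == "__main__":
    # Illustrative values only: a run whose loss turns into NaN at the third epoch.
    recorded = [0.91, 0.74, float("nan"), float("nan")]
    bad = first_nan_epoch(recorded)
    if bad is None:
        print("No NaN losses recorded.")
    else:
        print(f"Loss first became NaN at epoch index {bad}.")
```

If the very first epoch is already NaN, that points more toward the GPU/CUDA mismatch discussed in this thread than toward a learning-rate problem, though that is only a rule of thumb.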

samaloysius4 · ‎05-13-2021

Hi All,

Thank you for providing details on your workflows and your interest in using the deep learning tools on Ampere based GPUs. As you have discovered, Ampere cards are not consistently supported by CUDA 10 based solutions. The deep learning collection includes eight separate GPU backed packages, which are all compiled against the CUDA Toolkit 10.1 release. Unfortunately, some of these dependencies have not been fully updated to support Ampere currently, and CUDA 10.1 does not natively support the binary formats Ampere provides.

Our plan for Pro 2.8 is to continue distributing a collection of installers and a package set targeting CUDA Toolkit 10.1, which works on the last generation of Kepler GPUs (compute capability 3.7, with the Tesla K80) as well as Maxwell, Pascal, Turing, and Volta GPUs. This collection of packages has gone through extensive validation, covers the GPUs used by a broad base of our users, and has upstream support from the packages we are redistributing or building.
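As a rough way to check whether a given card falls inside that range, the sketch below compares a GPU's compute capability against the span described above (3.7 for the last Kepler generation up to 7.5 for Turing). The helper names are hypothetical, and the PyTorch query only runs if PyTorch happens to be installed in the active environment; it is a diagnostic sketch, not part of the ArcGIS deep learning installers.

```python
# Compute-capability range covered by the CUDA 10.1 package set, as described
# in the post: last-generation Kepler (3.7) through Turing (7.5).
# Ampere cards such as the RTX 3060 and RTX 3080 report 8.6 and fall outside it.
MIN_CC = (3, 7)
MAX_CC = (7, 5)


def covered_by_cuda10_packages(major, minor):
    """Return True if the compute capability lies within the covered range."""
    return MIN_CC <= (major, minor) <= MAX_CC


def describe_local_gpu():
    """Describe the first CUDA device PyTorch can see, if any."""
    try:
        import torch  # optional: only needed for the live query
    except ImportError:
        return "PyTorch is not installed in this environment."
    if not torch.cuda.is_available():
        return "PyTorch cannot see a CUDA device."
    major, minor = torch.cuda.get_device_capability(0)
    name = torch.cuda.get_device_name(0)
    status = "covered" if covered_by_cuda10_packages(major, minor) else "not covered"
    return (
        f"{name}: compute capability {major}.{minor}, {status} by the "
        f"CUDA 10.1 package set (PyTorch built against CUDA {torch.version.cuda})"
    )


if __name__ == "__main__":
    print(describe_local_gpu())
```

Run from the Python environment that ArcGIS Pro uses for deep learning, this would report the card name, its compute capability, and the CUDA version PyTorch was built against.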

For Ampere users, we have been working toward a technology preview that will be available by UC 2021, which will have some known limitations but allow key arcgis.learn, Deep Learning based geoprocessing tools and core libraries to function, using CUDA 11 and cubins for the CC 8.6 platform. There won't be an installer for this technology preview, but it will be available as a conda installable metapackage, and once we post this, we will be sure to respond to this issue as well.

For the next release of Pro, which will arrive in Q4 2021, we will support Ampere out of the box as a target for the deep learning installers, and will have a solution based on CUDA Toolkit 11.

Thank you for all your comments, and apologies again for the delay in support.