Viewshed 2 unable to locate GPU though it is available

05-24-2018 10:28 AM
PaulStatham
New Contributor

System has two GPUs in it that are verifiable via DxDiag and the Nvidia Control Panel

GPU 0: WDDM

GPU 1: TCC

System environment variables:

CUDA_VISIBLE_DEVICES=1

Viewshed 2 was running fine yesterday, then suddenly stopped recognizing GPU 1 and kept trying to use GPU 0, which drives the display; this meant any display-related work was not an option while the tool ran.

In the Nvidia Control Panel, we set the GPU 0 usage mode to "Dedicate to graphics tasks", which forces Viewshed 2 to fall back to the CPU.

Is there a setting somewhere in which ArcGIS Pro would store an erroneous CUDA_VISIBLE_DEVICES value?

We cannot figure out why the application reports that no GPU is available in TCC mode when one clearly is available and is explicitly specified.

2 Replies
XuguangWang
Esri Contributor

Could you try removing the CUDA_VISIBLE_DEVICES variable? Viewshed 2 should automatically find the TCC card for computation.

Please also make sure the graphics driver for the TCC GPU is up to date.
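
Before and after removing the variable, it can help to confirm what a newly started process actually sees, since on Windows a changed system environment variable only reaches processes launched after the change. The following is a minimal sketch, not anything Esri ships: the function name cuda_env_report is made up for illustration, and whether Viewshed 2 reads these variables itself or leaves them to the CUDA runtime is an assumption. It also reports CUDA_DEVICE_ORDER because, per the CUDA documentation, the runtime numbers devices fastest-first by default while nvidia-smi lists them in PCI bus order, so "1" may not mean the same card in both places.

```python
import os


def cuda_env_report(env=None):
    """Summarize the CUDA-related variables visible to this process.

    `env` defaults to os.environ; passing a plain dict makes it easy to test.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        # Variable not set: the CUDA runtime applies no device filter.
        devices = None
    else:
        # Comma-separated list of device indices (or UUIDs).
        devices = [tok.strip() for tok in raw.split(",") if tok.strip()]
    return {
        "CUDA_VISIBLE_DEVICES": devices,
        # None here means the runtime default (fastest-first) is in effect.
        "CUDA_DEVICE_ORDER": env.get("CUDA_DEVICE_ORDER"),
    }


if __name__ == "__main__":
    print(cuda_env_report())
```

Running this from a freshly opened command prompt (or from the Python window of a freshly started ArcGIS Pro session) shows whether the old value is still being inherited.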

PaulStatham
New Contributor

I believe that is what we ended up doing. We followed these instructions to begin with:

GPU processing with Spatial Analyst—Help | ArcGIS Desktop 

Then we read your blog and the NVIDIA programming guide:

Are you getting GPU errors while executing the Viewshed 2 tool? 

Programming Guide :: CUDA Toolkit Documentation 

Then we updated the Windows TDR settings by reading:

Timeout Detection and Recovery (TDR) | Microsoft Docs 

TDR Registry Keys | Microsoft Docs

Both said to add CUDA_VISIBLE_DEVICES. However, we later noticed that the first one also states the following:

"In the case of multiple GPUs in your system, the first GPU in the TCC (Tesla Compute Cluster) driver mode will be used by default. If there is no GPU available in the TCC driver mode, the first GPU (with index 0) will be used, unless specified otherwise."
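
Read literally, the quoted rule can be written down as a few lines of Python. This is only my reading of the help text, not Esri's actual implementation; the function name pick_default_gpu and the (index, driver mode) tuple layout are made up for illustration, and the "unless specified otherwise" part is modeled as a simple override argument.

```python
def pick_default_gpu(gpus, override=None):
    """Return the index of the GPU the quoted rule would select.

    gpus:     list of (index, driver_mode) tuples in index order,
              e.g. [(0, "WDDM"), (1, "TCC")]
    override: an explicitly specified index ("unless specified otherwise")
    """
    if override is not None:
        return override
    # The first GPU in TCC driver mode wins by default.
    for index, mode in gpus:
        if mode.upper() == "TCC":
            return index
    # No TCC GPU available: fall back to the GPU with index 0.
    return 0 if gpus else None


print(pick_default_gpu([(0, "WDDM"), (1, "TCC")]))   # TCC card is picked
print(pick_default_gpu([(0, "WDDM"), (1, "WDDM")]))  # falls back to index 0
```

With our setup (GPU 0 in WDDM, GPU 1 in TCC) the rule alone should already pick GPU 1, which is why the extra variable was not needed.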

So we reset the system (put all settings and drivers back to original states), then we changed the mode of the second GPU to TCC, then opened our NVIDIA control panel and went to:

Manage GPU Utilization -> Check "Use for compute needs" on TCC GPU

Manage 3D Settings -> Global Settings -> CUDA - GPUs -> Ensure TCC card was only card selected

What we noticed with ArcGIS Pro 2.0.0 was that even if the NVIDIA control panel Global Settings and the ArcGIS Pro settings (in the program-specific pane) showed the TCC GPU as the available CUDA GPU, ArcGIS Pro would issue a notice that it failed to locate a GPU and would send the work to the primary GPU in WDDM mode, which we were using to display graphics.

To resolve this, we reset the system as noted above and performed these steps in order:

  1. Insert CUDA capable device into hardware slot
  2. Boot system and navigate to NVIDIA control panel
  3. Ensure both cards are visible
  4. Open CMD Prompt and navigate to C:\Program Files\NVIDIA Corporation\NVSMI
  5. Run nvidia-smi.exe
  6. Ensure both cards are currently in WDDM
  7. (Admin req.) Issue command:
    • reg add HKLM\System\CurrentControlSet\Control\GraphicsDrivers /t REG_DWORD /v TdrDelay /d 100 /f
  8. Reboot; Open REGEDIT and navigate to HKLM\System\CurrentControlSet\Control\GraphicsDrivers
  9. Verify that TdrDelay has been added as a value to the key path
  10. Repeat Step 4 and Step 5
  11. (Admin req.) Issue command:
    • nvidia-smi.exe -dm 1 -i ?
    • where ? is the numeric value located in the GPU position of the grid output by nvidia-smi.exe for desired TCC GPU
  12. (Admin req.) Issue command:
    • nvidia-smi.exe -e 0 -i ?
    • where ? should be the same numeric value as in Step 11
  13. Repeat Step 4 and Step 5
  14. Verify that the GPU with index ? now shows TCC in the TCC/WDDM column and the other GPU shows WDDM
    • To see a graphical display, the GPU connected to the monitors must be in WDDM
    • A GPU in TCC mode can only be used for compute, not for graphics tasks
  15. Open NVIDIA Control Panel
  16. On the left hand side navigate to Workstation > Manage GPU Utilization
  17. Ensure the TCC card offers only the option "Use for compute needs" and that this option is selected
  18. On the left hand side navigate to 3D Settings > Manage 3D Settings
  19. Select the Global Settings tab
  20. Select "Base Profile" in drop down menu
  21. Locate the CUDA - GPUs drop down menu
  22. Ensure the TCC GPU is the only one selected
    • If the same GPU model is in the system twice, the cards should be enumerated in Step 17 as "1 of 2" and "2 of 2" next to their names
    • This naming will be visible in these drop down menus
  23. Select the Program Settings tab
  24. Select applicable version of ArcGIS product
  25. Repeat Step 21
  26. Ensure it is set to "Use global setting ([GPU name here])"
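
The checks in Steps 5, 13 and 14 can also be scripted rather than read off the nvidia-smi grid. The sketch below is a rough helper, not part of the procedure we actually ran: it assumes nvidia-smi is on the PATH (otherwise point it at the NVSMI folder from Step 4), it uses the driver_model.current query field, which as far as I know is only reported on Windows, and the function names are made up for illustration.

```python
import csv
import io
import subprocess

# Ask nvidia-smi for one CSV line per GPU: index, name, current driver model.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,driver_model.current",
    "--format=csv,noheader",
]


def parse_driver_modes(text):
    """Parse the CSV text into a list of (index, name, mode) tuples."""
    rows = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [
        (int(row[0]), row[1].strip(), row[2].strip())
        for row in rows
        if len(row) >= 3  # skip blank or malformed lines
    ]


def tcc_indices(text):
    """Return the indices of all GPUs currently in TCC mode."""
    return [idx for idx, _name, mode in parse_driver_modes(text) if mode == "TCC"]


def read_driver_modes():
    """Run nvidia-smi and return the parsed (index, name, mode) tuples."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_driver_modes(result.stdout)
```

Calling read_driver_modes() after the reboot should list exactly one card as TCC and the display card as WDDM; if tcc_indices() comes back empty, the mode change in Step 11 did not take effect.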