I'm running ArcGIS Pro 2.7.1 and having issues with Classify Pixels Using Deep Learning. It works fine when run through the ArcGIS Pro geoprocessing tool. But if I go to the History entry for the job I just ran and click “Send to Python Window”, the Python command checkerboards the output raster (see image below). I think I have tracked it down to a possible GPU memory issue. The geoprocessing tool and the Python window must be using different code paths: in Task Manager the GPU memory signature is completely different, and the Python code doesn't release the dedicated GPU memory when it's finished (see below).
- The model is a unet with a resnet50 backbone, built with fastai (resnet34 doesn't seem to have this issue and works fine in Python)
- The GPU is a GeForce RTX 2080 Ti (so 11 GB)
- Changing the batch size to 1 doesn't make a difference
- I think the Python code might be crashing, but not reporting any issues to ArcGIS.
Has anyone managed to get something similar to the above working with resnet50 in Python?
Does anybody have any ideas on what I could try to get it working?
GPU task manager when running through ArcGIS geoprocess tool
GPU task manager when running through python window in ArcGIS
Yes. Once the output checkerboards after a Python run, the 'History' entry (or the original geoprocessing tool) will not work until I close and restart ArcGIS Pro. So once it checkerboards, the only way to get it working again is to restart ArcGIS Pro.
Thanks for your response. If I change the batch size to 1 and run the Python code, it still checkerboards. It still feels like it is running out of memory. Is there a way to see any Python/ArcGIS logs?
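For anyone wanting numbers rather than the Task Manager graph, here is a rough sketch of how GPU memory could be sampled from Python before and after a run. It assumes the NVIDIA driver's `nvidia-smi` utility is on the PATH; the function names `parse_gpu_memory` and `query_gpu_memory` are made up for this example and are not part of ArcGIS or arcpy.

```python
import shutil
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' MiB pairs (one GPU per line) from nvidia-smi CSV output."""
    readings = []
    for line in csv_text.strip().splitlines():
        fields = [field.strip() for field in line.split(",")]
        try:
            used, total = int(fields[0]), int(fields[1])
        except (ValueError, IndexError):
            continue  # skip lines that are not two integers (e.g. "[N/A]")
        readings.append((used, total))
    return readings


def query_gpu_memory():
    """Return [(used_mib, total_mib), ...], or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []  # assumption: nvidia-smi is on PATH when a driver is installed
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_memory(result.stdout)


if __name__ == "__main__":
    for index, (used, total) in enumerate(query_gpu_memory()):
        print(f"GPU {index}: {used} MiB used of {total} MiB")
```

Calling `query_gpu_memory()` once before the tool runs and once after it finishes should show whether the dedicated memory is actually released, which is what the Task Manager screenshots suggest is not happening in the Python window case.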
Might be a stretch, but this could be resolved with a fix similar to https://community.esri.com/t5/arcgis-spatial-analyst-blog/are-you-getting-gpu-error-while-executing-....
The default value of 2 seconds for the Windows Timeout Detection and Recovery (TDR) delay can cause the OS to reset the GPU driver, which will crash whatever processes are using it.
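To confirm what the machine is currently using, something like the sketch below could read the delay back from the registry. It assumes the value is stored as `TdrDelay` under `HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers`, which is where Microsoft documents the TDR keys. The helper names are hypothetical, and on anything other than Windows the read simply returns `None`.

```python
import sys

# Assumed location of the TDR settings (per Microsoft's TDR registry key docs).
GRAPHICS_DRIVERS_KEY = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
WINDOWS_DEFAULT_TDR_DELAY = 2  # seconds, used when TdrDelay is not set


def read_tdr_delay():
    """Return the configured TdrDelay in seconds, or None if unset or not on Windows."""
    if not sys.platform.startswith("win"):
        return None
    import winreg  # standard library, Windows only
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, GRAPHICS_DRIVERS_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, "TdrDelay")
            return int(value)
    except OSError:
        return None  # key or value missing, so the OS default applies


def effective_tdr_delay(configured):
    """Fall back to the Windows default when no value is configured."""
    return WINDOWS_DEFAULT_TDR_DELAY if configured is None else configured


if __name__ == "__main__":
    print(f"Effective TDR delay: {effective_tdr_delay(read_tdr_delay())} s")
```

If this still reports 2 seconds after editing the registry, the value was probably written to the wrong key, or the machine has not been restarted since the change.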
Thanks for your response. I added the registry setting and the CUDA_VISIBLE_DEVICES environment variable, but it doesn't seem to make a difference; the issue is still happening.