Hi, I've been trying to teach myself the workflow for the deep learning tools in ArcPro 2.8. I have installed the deep learning libraries for 2.8 as instructed on the GitHub site here. My setup is ArcPro 2.8.1 and my PC is a modern i7 with 32 GB of RAM and an NVIDIA RTX 3070 graphics card.
I decided the best way to learn was to replicate a task I had done recently but using the deep learning tools. I have a Sentinel 2 image and I want deep learning to identify bare/ploughed fields in a small area of the UK. Originally I thought this was object detection and followed the workflows described in the ArcPro help file. ArcPro was crashing and reporting errors so I almost gave up. But having read this page I realised object detection was about putting those rectangles around what it thinks it has found. I want the actual boundary of the field identified and have realised it is pixel classification I need to be doing.
Either the process fails inconsistently with an error (something about a bad token) or it simply creates a blank raster when I run the Classify Pixels Using Deep Learning tool. I typically accept all the default settings as I don't know any better! I find that setting the tool to use the GPU runs longer, errors or crashes more often than if I set it to CPU. This thread hints that these deep learning libraries are not compatible with an RTX 3070 GPU?
So let me talk you through my steps; maybe you will spot the many "school boy errors" I am making? I'm a complete novice in this branch of image analysis and fully accept I'm doing something daft!
I run the Train Deep Learning Model tool with the environment settings set to CPU and 100% parallel processing. The main dialog interface is left as is:
The message dialog reports this (I will be honest, this is all meaningless to me as I don't yet fully understand all the nuances of the tool). But I'm OK with that as I am just trying to learn and get anything out of it!
Start Time: 21 July 2021 12:12:34
So I think I'm doing the right sequence (prepare raster > create training samples > export > train > detect) and I think I'm using the correct type of classification (pixel classification, not object detection), but as you can see nothing works!
If anyone has any advice I am desperate to hear from you, even if I have done a dumb thing that any hardened deep learner would intuitively know.
Yes, there are known issues with the RTX 3xxx series of cards, which are most likely the cause of your problems.
On your advice I have added myself to the ever-increasing list of RTX users over on the GitHub issue tracker page.
So... dumb question: if I explicitly use the CPU in the environment settings, should these problems go away, since I'm not using the GPU? Or are the deep learning libraries using CUDA 10.1 anyway, just not making use of the GPU?
Also, having reviewed my question, was I doing the right things in the right order? And if the RTX issue were not a factor, would you have expected to see bare fields being picked out in the classification?
Finally, what's your opinion on #5 about the arguments?
Normally when you train a model via the Python API or a notebook, you first use the prepare_data function. One of its parameters is chip_size. Because the geoprocessing tool covers all the training steps, maybe it is pulling in the chip_size argument as well? There's no real way to tell.
As for the CPU working even if the GPU doesn't for your card: I really don't know, but we should be able to test one way or the other.
Regarding your sequence and inputs, everything generally looks OK. I will post another reply with an example.
I haven't really done much pixel classification, so I decided to build a model similar to yours to try it out. I started with Sentinel-2 imagery and extracted just the RGB bands into a new TIFF file. I created some bare-earth polygons to train with and exported the training data, using class value 1 and class name earth.
I prefer to train using a notebook, because you get a lot more control and can see what is happening throughout the process. In Pro, just go to Insert in the ribbon and choose New Notebook. First step: import the modules and set up the training data. Then use the show_batch function to check your training data. What you want to see here is that your polygons show up on the images, like the one on the right.
Next is to setup the model and find the learning rate.
I just chose 1e-04 (i.e. 0.0001) as the learning rate. Next, start training the model; I chose 10 epochs to start with (ignore the 50 here, I will explain why in a minute).
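The idea behind the learning-rate finder can be shown with a toy sketch (the loss numbers here are invented purely to illustrate the "steepest drop" heuristic; the real lr_find plots loss against rate and you read the value off the curve):

```python
# Toy illustration of picking a learning rate: try increasing rates and
# keep the one the loss was falling fastest towards. Invented numbers.
lrs    = [1e-6, 1e-5, 1e-4, 1e-3, 1e-2]
losses = [0.95, 0.90, 0.60, 0.55, 1.40]   # made-up loss curve

# steepest downward step between consecutive points
drops = [(losses[i] - losses[i + 1], lrs[i + 1]) for i in range(len(lrs) - 1)]
best_drop, best_lr = max(drops)
# best_lr == 1e-4 here: the loss fell fastest approaching that rate
```

Note the loss blowing up again at 1e-2; that is why you pick a rate from the steep part of the curve, not simply the largest one.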
Now use the show_results function to see how the model is training. On the left is the ground truth from your training data; on the right are the current results from the model. You can see the model is starting to recognise the bare earth. After 10 epochs I couldn't see any results, so I trained it for 50 more epochs, with the results below. In a notebook this is easy: you can just go back up to the previous cell, change the number of epochs and run it again (it will continue training from where it left off). Obviously the results are still not great, but you can see it is actually working.
Next step is to save the model to disk.
Now I went ahead and used the Classify Pixels Using Deep Learning tool on the same image. You can see that it is actually detecting the bare earth in purple, with a pixel value of 1. So everything is working; the model just needs to get better.
See if you can follow the above using just your CPU and whether you get any results. You may need to force the notebook to use the CPU, with the instructions from here: How force Pytorch to use CPU instead of GPU?
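A minimal sketch of one common way to do that, assuming the standard CUDA_VISIBLE_DEVICES mechanism (it only works if you set it before torch or arcgis.learn is first imported in the notebook session):

```python
import os

# Hide all CUDA devices *before* torch (or arcgis.learn) is imported,
# so PyTorch silently falls back to the CPU. "-1" means no visible GPUs.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

# import torch                    # imported only after the variable is set
# torch.cuda.is_available()       # would now report False
```

If torch has already been imported in the session, restart the kernel first, otherwise the setting has no effect.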
Really appreciate your time in helping me.
I have been shying away from notebooks, not because I can't use Python but because I am very much a fish out of water with deep learning. So I took the approach that if I could just get it to work in the tools (and ESRI do good tool interfaces) I might have a chance of understanding it. I've spent so long bashing my head against the tools that, unbelievably, some of it has sunk in, so reading your notebook approach I understood! I'm going to go away and have a tinker and will report back.
I recently came across Google Colab, a notebook Python environment. If I can get your notebook working with just a CPU to avoid all the issues with my RTX card, I was wondering if the logic could be migrated into the Colab environment, as I understand they offer a GPU. Anyway, one step at a time, and I hope ESRI resolve the incompatibility with the RTX cards soon.
So I was able to follow your notebook instructions, turn off CUDA to force CPU-only processing, and complete the training. The good news is it did not crash or return an error, but it did take many hours to process and ultimately failed to classify anything. I then clipped the original Sentinel-2 raster so it was not so big, created a load of polygons in this smaller raster, and when I exported the data I reduced the tile size to 128 and the stride to 32. I went through the training as before, and when I finally ran the classification... nothing was identified!
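One side effect of the tile 128 / stride 32 export worth noting: a stride smaller than the tile overlaps the tiles, which multiplies the number of training chips cut from the same raster. A rough sliding-window count (my assumption about how the export tool tiles the image, for illustration only):

```python
def chips_per_axis(image_px, tile_px, stride_px):
    """Tiles exported along one image axis by a sliding window."""
    if image_px < tile_px:
        return 0
    return (image_px - tile_px) // stride_px + 1

# For a 1280-pixel-wide clip with 128-px tiles:
no_overlap = chips_per_axis(1280, 128, 128)  # stride = tile -> 10 tiles
dense      = chips_per_axis(1280, 128, 32)   # 75% overlap  -> 37 tiles
```

So the same clipped raster yields several times more chips per axis, at the cost of longer export and training time.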
So I studied your notebook again and the only obvious difference between your inputs and mine is that you had drawn rectangles within bare fields whilst I had drawn around the edges. So I rebuilt my training samples as rectangles within bits of bare field, went through the whole training section again, bumped the number of epochs from 20 to 30 in the unet.fit() call, and finally, when I ran the Classify Pixels tool, I saw something (the yellow pixels shown below)!
Why would drawing crude rectangles over only parts of fields seemingly work better than defining the actual edges?
Also, what would you now do to improve this workflow so that it better identifies bare fields? The current results are quite poor. Do I need to draw lots more training rectangles, or increase the number of epochs? Are there other tweaks you know of?
That's good news Duncan, you got the easy part out of the way! So it looks like the CPU does work for the RTX 3xxx cards, at least for Unet and pixel classification.
There should not be any difference between rectangles and full polygons; maybe they just needed more training to work?
You have already identified the two most practical ways to improve results: more training samples and more epochs.
One final tip: if you have saved a model and come back later wanting to continue training it, choose the saved model in the Advanced > Pre-Trained Model parameter of the training geoprocessing tool.
If using a notebook, rather than:
unet = UnetClassifier(data)
You would use:
unet = UnetClassifier.from_model(r'<path_to_model_emd_file>', data)
And the training data doesn't need to be the same either, so you can take your current model and just keep training it with other data (so your training time was not wasted).
For pixel classification, when training with sparse data (that is, training data that does not cover the entire image), you can get better results by setting the ignore_class parameter in the train tool to 0. This ignores all the pixels that have not been collected as training samples.
The chip size is the size used for clipping the image into tiles for training. The default is 224, and the Python API uses the same value.
I was quite excited by your additional bit of information, but when I include it in the parameter list in the train-model part of my notebook code, as shown below:
unet = UnetClassifier(data, backbone='resnet34', ignore_classes=[0])
I get this response...
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
In:
Line 1: unet = UnetClassifier(data, backbone='resnet34', ignore_classes=[0])
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\learn\models\_unet.py, in __init__:
Line 124: raise Exception(f"`ignore_classes` parameter can only be used when the dataset has more than 2 classes.")
Exception: `ignore_classes` parameter can only be used when the dataset has more than 2 classes.
---------------------------------------------------------------------------
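Reading that traceback, I think the check boils down to something like this (my rough reconstruction for my own understanding, not the actual arcgis source): my dataset has one foreground class ("earth") plus background, which counts as only two classes, so the guard fires.

```python
def check_ignore_classes(class_values, ignore_classes):
    """Rough re-creation of the guard in UnetClassifier's __init__:
    ignoring a class only makes sense with more than two classes."""
    if ignore_classes and len(class_values) <= 2:
        raise ValueError(
            "`ignore_classes` parameter can only be used when the "
            "dataset has more than 2 classes."
        )

# background 0 + one class ("earth")      -> 2 classes -> raises
# background 0 + two foreground classes   -> 3 classes -> allowed
```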
So I'm guessing your sparse data actually had more than one class?