Help regarding pixel based classification with deep learning

03-15-2021 11:42 AM
GeoprocessamentoCerradinho
New Contributor III

I'm a little (maybe very) confused about the workflow for pixel-based image classification with deep learning.

I tried the following steps:

1 - I generated a 3-band RGB mosaic from Sentinel-2 MSIL2A
2 - I collected the samples
3 - I exported for training

Here came the first point of confusion: I noticed that this way I could not do pixel-based classification, only object detection, so models like U-Net were not available.

I repeated the sampling process (using the samples I had collected), performed image segmentation, and finally applied the SVM classifier. With the classified raster in hand (I think this was what I had been missing), I tried to export the training data in pixel space, but I got a warning saying that this type only allows map space. I continued anyway.

After generating the training data, I went to train the model, and U-Net was now available (hurray!). With the trained model I tried it on the Sentinel-2 RGB composite; I received a warning that pixel space could not be used and map space would be used instead, and after a few minutes I got an empty (frustrating!) raster as a result.

I thought the problem might be the way I had generated the classified raster used as training input (object-based, because only that way could I use segmentation). So I generated a new, pixel-based classified raster and used the Pixel Editor to make some corrections to the result. But when trying to export the training samples, I received an error saying that the raster should be thematic; in its properties it appears as generic.

Can anyone help me understand where I'm going wrong in this whole process?

 

Accepted Solution
AmlanR
New Contributor II

Hi @GeoprocessamentoCerradinho ,

When exporting training samples from the imagery, did you specify a metadata format? 

Assuming you are using the latest version of ArcGIS Pro (2.7.2), the default metadata format is PascalVOC, which is used for object detection. Therefore, if the PascalVOC metadata format is used, the only model types available in the 'Train Deep Learning Model' tool will be the ones that support object detection.

For pixel classification, we need to select 'Classified Tiles', and U-Net should show up in the next step.
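To make the dependency concrete, here is a small sketch of how the metadata format chosen at export time constrains which model types appear later. The mapping below only covers the two formats discussed here and is illustrative, not an exhaustive list of the formats or models the tools support.

```python
# Illustrative mapping: the metadata format chosen in 'Export Training
# Data For Deep Learning' determines the task type, and therefore which
# models 'Train Deep Learning Model' offers. Partial mapping only.
FORMAT_TO_TASK = {
    "PASCAL_VOC_rectangles": "object_detection",   # the default format
    "Classified_Tiles": "pixel_classification",    # needed for U-Net
}

def compatible_with_unet(metadata_format):
    """U-Net is a pixel classifier, so it only appears for formats
    that produce per-pixel labels."""
    return FORMAT_TO_TASK.get(metadata_format) == "pixel_classification"

print(compatible_with_unet("PASCAL_VOC_rectangles"))  # False
print(compatible_with_unet("Classified_Tiles"))       # True
```

This is why U-Net was missing from the list in your first attempt: the default PascalVOC export produces bounding-box labels, not per-pixel labels.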

Details on the different metadata types can be found here:
Export Training Data For Deep Learning (Image Analyst)—ArcGIS Pro | Documentation

Is there a specific reason for picking pixel (image) space over the default map space?

https://pro.arcgis.com/en/pro-app/latest/help/analysis/image-analyst/what-is-image-space-analysis-.h...

Image classification using deep learning is just another way of achieving the same goal (in your case, a classified raster) as the machine learning workflow using SVM, Random Forests, etc.

I am not sure I completely understand the step of creating a classified raster and then feeding it to the deep learning tools as an input. If you use the machine-learning classification tools, they already give you a classified raster as output (which I believe is the end goal here), so what would be the purpose of using the deep learning tools?

When you mention an 'empty' raster, do you mean the output is completely black? Do the individual pixels have any values?

Simply put, here are the steps required for an end-to-end deep learning workflow for pixel-based classification:

1. In the 'Export Training Data For Deep Learning' GP tool, provide the raster from which samples need to be collected. Provide either a feature class that has the class values in its attribute table OR a classified raster, if available. Make sure to select the correct metadata format.

2. Provide the exported training samples to the 'Train Deep Learning Model' GP tool, and U-Net should show up. Train the model.

3. Perform inferencing using the 'Classify Pixels Using Deep Learning' GP tool.
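The three steps above can also be scripted through the corresponding `arcpy.ia` geoprocessing tools. The sketch below only assembles the key parameter choices as plain dictionaries; every path, tile size, and epoch count is a hypothetical placeholder, and the tool names in the comments are the ones to call from an ArcGIS Pro Python environment with the Image Analyst extension.

```python
# Parameter sketch for the three-step pixel-classification workflow.
# Values marked 'placeholder' are hypothetical; replace with your data.

# Step 1: arcpy.ia.ExportTrainingDataForDeepLearning
export_params = {
    "in_raster": "sentinel2_rgb.tif",         # placeholder source mosaic
    "in_class_data": "training_samples.shp",  # feature class with class
                                              # values, or classified raster
    "metadata_format": "Classified_Tiles",    # required for pixel classification
    "tile_size_x": 256,                       # placeholder chip size
    "tile_size_y": 256,
}

# Step 2: arcpy.ia.TrainDeepLearningModel
train_params = {
    "in_folder": "exported_chips",  # the output folder from step 1
    "model_type": "UNET",           # available because of Classified_Tiles
    "max_epochs": 20,               # placeholder
}

# Step 3: arcpy.ia.ClassifyPixelsUsingDeepLearning
inference_params = {
    "in_raster": "sentinel2_rgb.tif",    # same imagery, or a new scene
    "in_model_definition": "model.emd",  # .emd file written by step 2
}

print(export_params["metadata_format"])  # Classified_Tiles
```

The key point the sketch encodes is that the `metadata_format` decision in step 1 is what unlocks U-Net in step 2.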

The accuracy of the end result depends on a multitude of factors, such as the resolution of the data, the number of samples collected, the convergence of the model, etc.

Let me know if this makes sense. If you need further help, you may also create a case with Esri Support, and I will take a look at your data and walk you through the process.


2 Replies

GeoprocessamentoCerradinho
New Contributor III

"I am not sure I completely understand the step of creating a classified raster and then feeding it to the deep learning tools as an input. If you use the machine-learning classification tools, they already give you a classified raster as output (which I believe is the end goal here), so what would be the purpose of using the deep learning tools?"

 

I totally agree, and it seemed completely nonsensical, but since I was working by trial and error, I went that way.

 



When you mention an 'empty' raster, do you mean the output is completely black? Do the individual pixels have any values?

Yes, with no values.

 

I am following the steps as you instructed; training with vector samples is much faster. I'll see whether the generated model still returns an empty raster and will post the result here soon. Thanks for the support.

 
