Blank output - deep learning - detect objects

RyanHemmens · ‎10-17-2024

Hi everyone,

I'm new to deep learning models and a fairly novice GIS user, so apologies if I am missing something obvious.

I am attempting to build a model to detect sick grass trees (Xanthorrhoea). They show up as yellow in the attached image snippet.

I have gone through the workflow and trained a Detect Objects model on around 20 samples. When I run the model I get a blank output. I have tried tweaking the threshold between 0.9 and 0.1.

Is this too fine a scale for these kinds of detection models?

Cheers

PavanYadav · ‎10-31-2024

@RyanHemmens

Model Accuracy: Can you look at or share the model_metric.html report? This report shows the quality of the model.

Model not too generic: Even if your model's accuracy is reasonably good, please note that if your training data and the images used for detecting objects are very different (in terms of resolution, color, or bit depth), the model may not perform well.
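To make that comparison concrete, here is a minimal sketch that checks two sets of image properties against each other. The property values, the dictionary keys, and the `find_mismatches` helper are all hypothetical and only for illustration; the real values would come from the raster properties of your training imagery and of the imagery you run detection on.

```python
def find_mismatches(training, target, resolution_tolerance=0.25):
    """Return a list of readable mismatches between two sets of image properties.

    Both arguments are plain dicts with hypothetical keys:
    cell_size_m, band_count, bit_depth.
    """
    problems = []

    # Band count and bit depth should normally match exactly.
    for key in ("band_count", "bit_depth"):
        if training[key] != target[key]:
            problems.append(f"{key}: training={training[key]}, target={target[key]}")

    # Cell size is compared as a relative difference against the training imagery.
    relative_diff = abs(training["cell_size_m"] - target["cell_size_m"]) / training["cell_size_m"]
    if relative_diff > resolution_tolerance:
        problems.append(
            f"cell_size_m: training={training['cell_size_m']}, target={target['cell_size_m']}"
        )
    return problems


# Hypothetical values, entered by hand from the raster properties.
training_props = {"cell_size_m": 0.05, "band_count": 3, "bit_depth": 8}
target_props = {"cell_size_m": 0.10, "band_count": 4, "bit_depth": 16}

mismatches = find_mismatches(training_props, target_props)
for item in mismatches:
    print("Mismatch ->", item)
```

An empty list means the two images agree on these three properties; it does not guarantee the model will perform well, only that this particular cause is unlikely.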

You can try lowering the threshold in the Detect Objects tool to see if it returns anything. Keep in mind that results at a lower threshold may be of lower quality and may include false detections.
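As a rough illustration of why the output can stay blank across a range of thresholds, the sketch below counts how many detections survive at each threshold. The confidence scores are made up for the example and the `count_detections` helper is hypothetical; it is not part of the Detect Objects tool.

```python
def count_detections(scores, thresholds):
    """Return {threshold: number of detections with score >= threshold}."""
    return {t: sum(1 for s in scores if s >= t) for t in thresholds}


# Hypothetical confidence scores from a weakly trained model.
scores = [0.04, 0.07, 0.08, 0.12, 0.31]
thresholds = [0.9, 0.5, 0.3, 0.1, 0.05]

counts = count_detections(scores, thresholds)
for threshold, kept in counts.items():
    print(f"threshold {threshold:.2f}: {kept} detection(s) kept")
```

If nearly all scores sit below the lowest threshold you tried, the output is empty or close to it at every setting, which points to the model itself (too few samples, mismatched imagery) more than to the threshold.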

In case you would like to refer to another tutorial, here is one - https://learn.arcgis.com/en/projects/use-deep-learning-to-assess-palm-tree-health/.

Pavan Yadav
Product Engineer at Esri
AI for Imagery

ShivaniPathak · ‎11-13-2024

Hi @RyanHemmens, one thing I would like to highlight is that 20 samples are very few for training a deep learning model, because:

One key difference between traditional remote sensing classification approaches and deep learning models is that traditional methods, such as pixel-based classification, often rely on a small number of pure pixel samples (e.g., 20-30) to train the model. These methods typically assume that the spectral signature of each class is distinct and sufficient for classification. In contrast, deep learning models, particularly convolutional neural networks (CNNs), do not learn from spectral information alone but also capture higher-level features such as shape, texture, and contextual relationships within the image. As a result, deep learning models require larger and more diverse labeled datasets, because they need to learn more complex patterns and relationships that go beyond simple pixel-based information.

You can also try training a DetReg model, which can be trained on fewer samples than other deep learning object detection models. You can refer to this notebook, which shows how DetReg can be used to train a palm tree detection model. Also note that palm trees have well-defined boundaries and features that differ from other trees, which is why that model could be trained on a very small number of samples. Grass trees (Xanthorrhoea) don't have well-defined boundaries, so they need more data.

One more thing you can try, if the above approach doesn't work well, is to train a segmentation model instead of an object detection model. You can create masks for the grass trees (Xanthorrhoea) and export the data in Classified Tiles format. Then train a SamLORA model, which can also be trained on a small number of samples and still achieve good accuracy. You can refer to this sample notebook for the SamLORA workflow: https://developers.arcgis.com/python/latest/samples/finetuning-sam-for-flood-inundation-mapping/