Hello Everyone.
I am working on an object detection lab for a GIS class that I teach and am having issues with the process actually detecting objects.
I have followed a number of Esri videos on YouTube and read some of the blogs about the object detection workflow, but it seems that I might be missing something important in my own process, which I have outlined below.
DATA: 2023 USDA NAIP Aerial Imagery
One: Label Objects for Deep Learning (tool)
I originally started with 20 objects and have increased that number to 125.
Two: Export Training Data within the Label Objects for Deep Learning tool
I use the RCNN Masks option for the Metadata Format.
Three: Train Deep Learning Model (tool)
The MaskRCNN model type is automatically inserted and I used 100 Epochs.
Four: Detect Objects Using Deep Learning (tool)
I am including a screenshot of the most recent process.
As you can see from the above screenshot, out of 125 training samples it was only able to correctly identify five trees that I trained it to identify and one that was not part of the training dataset.
Is there a strategy for drawing the polygons around the objects that I might be missing in the training phase? I used this video as a reference:
https://www.youtube.com/watch?v=g0FDARaciiI
In that video, the polygons around the boats seem to encompass both the boats and the surrounding water.
Anyhow, I want to show my class this cool technique, but I would like it to work more robustly than it currently does.
Thanks for any help, and thank you for reading this post.
When labelling the training samples, it is better to zoom in to a small area and comprehensively label every desired object (e.g., individual trees in your case) without missing any, since a tree left unlabelled inside an exported image chip is effectively treated as background during training. Then export only the comprehensively labelled area as training data by setting the extent in the Environments tab. Train your model on just that small area, and then you can detect the trees across the entire image with the trained model. The key is to make sure you mark every desired object within your training area.
You may refer to this blog: https://www.esri.com/arcgis-blog/products/arcgis-pro/geoai/tips-for-labeling-images-for-object-detec...
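If you would rather script the steps above than run them from the tool dialogs, here is a rough arcpy sketch of the same idea: restrict the export to the fully labelled area, train, then detect over the whole image. It is only a sketch under assumptions. All paths, the label extent, the chip size, and the stride are hypothetical placeholders, and it needs ArcGIS Pro with the Image Analyst extension and the deep learning libraries installed. The small extent_string helper is my own, not part of arcpy.

```python
def extent_string(xmin, ymin, xmax, ymax):
    """Return 'xmin ymin xmax ymax', the space-delimited form arcpy.env.extent accepts."""
    if xmin >= xmax or ymin >= ymax:
        raise ValueError("extent must have xmin < xmax and ymin < ymax")
    return f"{xmin} {ymin} {xmax} {ymax}"


def export_train_detect(imagery, labels, chips_dir, model_dir,
                        model_definition, detections, label_extent):
    """Export chips from the fully labelled area only, train, then detect everywhere.

    Every argument is a hypothetical value supplied by the caller:
      imagery          - the NAIP raster
      labels           - the labelled polygons (feature class)
      chips_dir        - folder that receives the exported training chips
      model_dir        - folder that receives the trained model
      model_definition - path to the .emd or .dlpk written by training
                         (assumed; check the training output folder)
      detections       - output feature class for detected objects
      label_extent     - (xmin, ymin, xmax, ymax) of the fully labelled area
    """
    import arcpy  # imported here so extent_string works without ArcGIS

    arcpy.CheckOutExtension("ImageAnalyst")

    # Limit the export to the small, comprehensively labelled area.
    with arcpy.EnvManager(extent=extent_string(*label_extent)):
        arcpy.ia.ExportTrainingDataForDeepLearning(
            in_raster=imagery,
            out_folder=chips_dir,
            in_class_data=labels,
            image_chip_format="TIFF",
            tile_size_x=256, tile_size_y=256,  # assumed chip size
            stride_x=128, stride_y=128,        # assumed 50% overlap
            metadata_format="RCNN_Masks",
        )

    arcpy.ia.TrainDeepLearningModel(
        in_folder=chips_dir,
        out_folder=model_dir,
        max_epochs=100,
        model_type="MASKRCNN",
    )

    # No extent restriction here, so detection covers the entire image.
    arcpy.ia.DetectObjectsUsingDeepLearning(
        in_raster=imagery,
        out_detected_objects=detections,
        in_model_definition=model_definition,
    )
```

You would call export_train_detect(...) from the ArcGIS Pro Python environment with your own paths. The point of the EnvManager block is that only the export step is limited to the labelled area, so no chips are cut from parts of the image where trees exist but were never labelled.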
Hi @GarrettRSmith, I have a few points that may be helpful for you.
Please let me know if you need any other help.
One key difference between traditional remote sensing classification approaches and deep learning models is that in traditional classification methods, we often rely on a small number of pure pixel samples (e.g., 20-30) to train the model. These methods typically assume that the spectral signature of each class is distinct and sufficient for classification. In contrast, deep learning models, particularly convolutional neural networks (CNNs), do not just learn from the spectral information but also capture higher-level features like shape, texture, and contextual relationships within the image. As a result, deep learning models require larger and more diverse datasets with labeled examples, as they need to learn more complex patterns and relationships that go beyond simple pixel-based information.