For research purposes, I aim to detect the number and estimate the shape area of swimming pools on Rhodes Island using the pre-trained deep learning model Pool Segmentation - USA. However, I am currently facing challenges related to both the accuracy of the detection results and the processing time of the input data. Below, I outline the full workflow I’m following:
Step 1: Data Preparation
Due to the lack of high-resolution imagery in a format compatible with the pre-trained model, I am using World Imagery Wayback basemaps to manually export imagery in .tpkx or .tif format for areas where pools are visually identified.
Step 2: Running the Model
Once the raster is ready, I use the Detect Objects Using Deep Learning tool in ArcGIS Pro:
Issues Encountered
Request for Suggestions
Do you have any recommendations to improve either:
Hi @DimitrisPsarologos, I have reported this to my team and hope to have a response soon. Thanks!
Hello @DimitrisPsarologos
Thank you for reaching out! I have a few follow-up questions based on the description you provided:
What is the resolution of the input raster you’re using for inferencing with the pool segmentation model?
Why did you check the Use pixel space option? The Wayback imagery you used should already be geo-referenced, so it can be processed in Map Space without selecting pixel space. Could you confirm if you intentionally enabled this?
You mentioned that the tool errors out in some cases when run on the full image extent. Could you share the error trace for those cases?
In addition, I’d like to suggest a few steps to improve results:
I tried a different approach to the data management and to running the model, including your suggestions.
Let me be more specific.
First of all, I merged all the TIFFs with the Mosaic data management tool instead of Mosaic To New Raster, and created a new merged TIFF.
Here is the new raster's information:
Columns: 177969
Rows: 260958
Number of Bands: 3
Cell Size X: 0.2985821416443992
Cell Size Y: 0.2985821416444002
Uncompressed Size: 129.76 GB
Format: TIFF
Source Type: Generic
Pixel Type: unsigned char
Pixel Depth: 8 Bit
NoData Value: 256, 256, 256
Colormap: absent
Pyramids levels: 8, resampling: Nearest Neighbor
Compression: LZW
Mensuration Capabilities: Basic
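As a quick consistency check on the properties above, the sketch below recomputes the uncompressed size and the footprint from the listed columns, rows, bands, and cell size. It assumes the cell size is in metres of the raster's projected coordinate system and that "GB" means 1024^3 bytes; neither is stated in the listing.

```python
# Values copied from the raster properties listed above.
cols, rows, bands = 177969, 260958, 3
bytes_per_sample = 1  # 8-bit unsigned char
cell_x, cell_y = 0.2985821416443992, 0.2985821416444002

# Uncompressed size: one byte per pixel per band.
size_bytes = cols * rows * bands * bytes_per_sample
size_gib = size_bytes / 1024**3

# Footprint, ASSUMING the cell size is expressed in metres.
width_km = cols * cell_x / 1000
height_km = rows * cell_y / 1000

print(f"uncompressed size: {size_gib:.2f} GiB")
print(f"footprint: {width_km:.1f} km x {height_km:.1f} km")
```

Under those assumptions the size comes out at roughly 129.76, which agrees with the reported value, and the footprint is roughly 53 km by 78 km.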
Secondly, I ran the model on several smaller extents instead of the whole area, and then merged the output layers.
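Splitting the area into smaller extents can be sketched as follows. This is only an illustration, not the exact procedure used here: the function name `split_extent`, the tile size, the overlap, and the example coordinates are all hypothetical. The overlap is there so that a pool lying on a tile boundary appears whole in at least one tile; duplicates in the overlap zones would then need to be removed when merging the outputs.

```python
from typing import List, Tuple

Extent = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def _starts(lo: float, hi: float, tile: float, step: float) -> List[float]:
    """Tile origins along one axis; stops once a tile reaches the far edge."""
    starts = []
    pos = lo
    while True:
        starts.append(pos)
        if pos + tile >= hi:
            break
        pos += step
    return starts


def split_extent(extent: Extent, tile_size: float, overlap: float = 0.0) -> List[Extent]:
    """Split an extent into square tiles of tile_size that overlap by `overlap`."""
    if overlap < 0 or overlap >= tile_size:
        raise ValueError("overlap must be >= 0 and smaller than tile_size")
    xmin, ymin, xmax, ymax = extent
    step = tile_size - overlap
    tiles = []
    for y in _starts(ymin, ymax, tile_size, step):
        for x in _starts(xmin, xmax, tile_size, step):
            # Clip the last row/column of tiles to the outer extent.
            tiles.append((x, y, min(x + tile_size, xmax), min(y + tile_size, ymax)))
    return tiles


# Hypothetical example: a 10 x 10 unit area, 6-unit tiles, 2-unit overlap.
example_tiles = split_extent((0.0, 0.0, 10.0, 10.0), tile_size=6.0, overlap=2.0)
for tile in example_tiles:
    print(tile)
```

Each tuple could then be used as the processing extent for one run of the tool, for example through the geoprocessing extent environment setting.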
For the model settings, I changed:
- The cell size to 0.3, as recommended in the documentation
- The batch size to 4
- Deactivated Use pixel space
- Test time augmentation to false, to reduce the time of each run
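For anyone scripting this instead of using the tool dialog, the settings above might look roughly like the sketch below. It is a sketch under assumptions, not a verified configuration: the helper names `build_arguments` and `run_detection` are hypothetical, the argument names `batch_size` and `test_time_augmentation` and the `use_pixelspace="NO_PIXELSPACE"` keyword should be checked against the tool documentation for your ArcGIS Pro version, and the paths are left for the caller to supply.

```python
def build_arguments(batch_size=4, test_time_augmentation=False, **extra):
    """Build the semicolon-separated "name value" string for model arguments."""
    args = {
        "batch_size": batch_size,
        "test_time_augmentation": test_time_augmentation,
    }
    args.update(extra)
    return ";".join(f"{name} {value}" for name, value in args.items())


def run_detection(in_raster, out_features, model_definition):
    """Run the detection tool with the settings listed above (untested sketch)."""
    import arcpy  # needs ArcGIS Pro with the Image Analyst extension

    arcpy.env.cellSize = 0.3  # cell size recommended for the model
    arcpy.ia.DetectObjectsUsingDeepLearning(
        in_raster,
        out_features,
        model_definition,
        build_arguments(),
        use_pixelspace="NO_PIXELSPACE",  # ASSUMED keyword: process in map space
    )


print(build_arguments())
```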
I manually cleaned the model output by deleting the false positives.
During the cleaning process I also noticed many false negatives.
Specifically, the model found 1300+ pools, of which 1130 were actual pools. The estimated number of pools in the case study area is approximately 2000.
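These counts can be turned into rough precision and recall figures. The sketch below takes 1300 as the number of detections (the post only says "1300+", so the precision is an upper bound) and 2000 as the true number of pools (itself an estimate), so the results are indicative only.

```python
detections = 1300        # ASSUMED: "1300+" reported, so this is a lower bound
true_positives = 1130    # detections confirmed as pools during manual cleaning
estimated_pools = 2000   # approximate number of pools in the study area

false_positives = detections - true_positives
false_negatives = estimated_pools - true_positives

precision = true_positives / detections       # share of detections that are pools
recall = true_positives / estimated_pools     # share of pools that were detected
f1 = 2 * precision * recall / (precision + recall)

print(f"false positives: {false_positives}, false negatives: {false_negatives}")
print(f"precision: {precision:.3f}, recall: {recall:.3f}, F1: {f1:.3f}")
```

Under these assumptions precision is about 0.87 and recall about 0.57, which suggests that missed pools, rather than false detections, are the larger problem.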
As a next step, I'm considering using the false negatives to retrain the model and improve its accuracy from the current 0.59.
I would really appreciate your opinion on the current process.
I would also like to ask whether there is a recommended number of samples to start with. What do you suggest?
Thank you for your kind support.