
Pool Object Detection Using Pre-trained Model

4 weeks ago
DimitrisPsarologos
New Contributor

For research purposes, I aim to detect the number of swimming pools on the island of Rhodes and estimate their surface area using the pre-trained deep learning model Pool Segmentation - USA. However, I am currently facing challenges with both the accuracy of the detection results and the processing time. Below is the full workflow I'm following:

Step 1: Data Preparation

Due to the lack of high-resolution imagery in a format compatible with the pre-trained model, I am using World Imagery Wayback basemaps to manually export imagery in .tpkx or .tif format for areas where pools are visually identified.

  • When exporting in .tpkx, I convert the files to 3-band 8-bit TIFFs using a Python notebook.
  • After collecting all the relevant .tif files, I use the Mosaic to New Raster tool in ArcGIS Pro to merge the inputs into a single raster dataset. This prepares the data for model inference.
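As a sketch of the tile-gathering step (the folder path and output names below are hypothetical, and the arcpy call is shown only as a comment since it needs an ArcGIS Pro session):

```python
from pathlib import Path

def build_mosaic_input(tile_dir: str, pattern: str = "*.tif") -> str:
    """Collect the exported tile TIFFs into the semicolon-separated
    input string that Mosaic To New Raster expects."""
    tiles = sorted(Path(tile_dir).glob(pattern))
    if not tiles:
        raise ValueError(f"no {pattern} files found in {tile_dir}")
    return ";".join(str(t) for t in tiles)

# With arcpy available (ArcGIS Pro notebook), the merge itself would be:
# arcpy.management.MosaicToNewRaster(
#     build_mosaic_input(r"C:\pools\tiles"),   # hypothetical folder
#     r"C:\pools", "merged_pools.tif",
#     pixel_type="8_BIT_UNSIGNED", number_of_bands=3)
```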

Step 2: Running the Model

Once the raster is ready, I use the Detect Objects Using Deep Learning tool in ArcGIS Pro:

  • Input: the merged .tif raster (3-band, 8-bit), which in my case is ~2.3 GB.
  • Model: Pool Segmentation - USA with default parameters.
  • Processor type: GPU
  • Hardware: I run the model on a Virtual Machine with the following specs:
    64 GB RAM, Intel Xeon Gold 5220R CPU, and NVIDIA A10-12Q GPU.
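For a sense of scale, here is a back-of-the-envelope sketch of why a mosaic that size is heavy: the tool slices the raster into overlapping chips and runs each through the model. The 256 px tile size, 50% overlap, and the ~28,000 px estimate below are illustrative assumptions, not the model's actual settings:

```python
import math

def inference_chips(width_px: int, height_px: int,
                    tile_size: int = 256, stride: int = 128) -> int:
    """Rough count of chips the deep learning tool must push
    through the model for a raster of the given pixel size."""
    cols = math.ceil(max(width_px - tile_size, 0) / stride) + 1
    rows = math.ceil(max(height_px - tile_size, 0) / stride) + 1
    return cols * rows

# A 3-band 8-bit mosaic of ~2.3 GB is roughly 28,000 x 28,000 px;
# with 50% overlap that is on the order of:
print(inference_chips(28_000, 28_000))  # -> 47524 forward passes
```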

(Attached screenshots: DimitrisPsarologos_0-1753359525662.png, DimitrisPsarologos_1-1753359525666.png)

Issues Encountered

  1. Accuracy: In tests with smaller input areas, I noticed that the model often fails to detect several visible pools.
  2. Performance: Despite utilizing a GPU, processing the full mosaic raster takes a significant amount of time, or in some cases, the model unexpectedly fails to run altogether.


Request for Suggestions

Do you have any recommendations to improve either:

  • The data preparation process (e.g., optimal input resolution, format, preprocessing), or
  • The model inference step (e.g., parameter tuning, tiling, hardware optimization),

in order to increase the efficiency and accuracy of the final outputs?
2 Replies
PavanYadav
Esri Regular Contributor

Hi @DimitrisPsarologos, I have reported this to my team and hope to have a response soon. Thanks!


Pavan Yadav
Product Engineer at Esri
AI for Imagery
Connect with me on LinkedIn!
Contact Esri Support Services
PriyankaTuteja
Esri Contributor

Hello @DimitrisPsarologos  

Thank you for reaching out! I have a few follow-up questions based on the description you provided:

  1. What is the resolution of the input raster you’re using for inferencing with the pool segmentation model?

  2. Why did you check the Use pixel space option? The Wayback imagery you used should already be geo-referenced, so it can be processed in Map Space without selecting pixel space. Could you confirm whether you enabled this intentionally?

  3. You mentioned that the tool errors out in some cases when run on the full image extent. Could you share the error trace for those cases?
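To illustrate the Map Space point above: with geo-referenced imagery, detections found in chip pixels are mapped back to real-world coordinates through the raster's affine geotransform. A minimal sketch with illustrative values (the origin and 0.5 m cell size are made up, not taken from the actual imagery):

```python
def pixel_to_map(col: float, row: float, gt: tuple) -> tuple:
    """Apply a GDAL-style affine geotransform
    (origin_x, px_width, row_rot, origin_y, col_rot, px_height)
    to convert pixel indices to map coordinates."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y

# Illustrative north-up geotransform with 0.5 m cells
gt = (500_000.0, 0.5, 0.0, 4_000_000.0, 0.0, -0.5)
print(pixel_to_map(100, 200, gt))  # (500050.0, 3999900.0)
```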

In addition, I’d like to suggest a few steps to improve results:

  • Use the recommended cell size for the pool segmentation model instead of the default value.
  • Lower the threshold to around 0.2 to segment pools with lower confidence, and then apply a definition query over the threshold field to filter the results.
  • To reduce processing time, providing the cell size should help.
  • If the error you encountered is a CUDA out-of-memory issue, try lowering the batch size from 64 to 4 or 8 — this should help resolve it.
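The threshold-then-filter step can be mimicked in plain Python; the Confidence field name below is an assumption on my part, so check the actual field name on your output layer:

```python
def filter_detections(features: list, min_conf: float = 0.2) -> list:
    """Keep detections at or above the confidence cutoff,
    mirroring a definition query such as Confidence >= 0.2."""
    return [f for f in features if f["Confidence"] >= min_conf]

detections = [
    {"id": 1, "Confidence": 0.91},
    {"id": 2, "Confidence": 0.15},   # likely a false positive
    {"id": 3, "Confidence": 0.27},
]
print([f["id"] for f in filter_detections(detections)])  # [1, 3]
```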