Hi,
So I've been teaching myself the basic workflow for using the deep learning tools in ArcGIS Pro. I've had some great help in my Q&A on pixel classification here. Feeling brave, I am now having a go at object detection using the SingleShotDetector. I gave myself the task of identifying boats in an estuary and followed the basic workflow discussed on GitHub here. By the way, there are a few other really useful notebooks, and once you get over the jargon barrier and the frightening complexity of deep learning, these notebooks are really helpful in teaching you a workflow.
So my first run through and it works! The red circles were my training samples and yellow boxes are the detections.
As you can clearly see, a lot of boats are being missed. I understand that I now need to tweak the workflow. More training samples (red circles) and running fit() for longer (more epochs) seem to be the obvious options, but then there are all those other parameters...
When I defined the detector in the notebook I used this:
ssd = SingleShotDetector(data, grids=[5], zooms=[1.0], ratios=[[1.0, 1.0]], backbone='resnet34', backend='pytorch')
One of the parameters is grids, and it is actually discussed here. It's not at all clear what this grid value implies or how it is used. Is a grid value of, say, 100 better than 5? What's the impact of setting a higher or lower grid value? In the example they use a 4x4, but 4x4 of what? Let me explain...
Imagine my satellite image is 10m resolution and is 100x100 pixels, so 10,000 pixels in total. Does setting this grid parameter to 4 mean that it is dividing up the image into 4 regions of 50x50 pixels?
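To make the question concrete, here is a small sketch of the arithmetic for the reading above (4 regions of 50x50) and the other one I can think of (4 cells per side, so 16 regions of 25x25). The function names are my own, not part of the API, and I am not claiming either reading is what SingleShotDetector actually does.

```python
import math

def reading_a(image_px, grid):
    """Hypothetical reading A: the grid value is the TOTAL number of cells."""
    side = math.isqrt(grid)
    if side * side != grid:
        raise ValueError("grid must be a perfect square under this reading")
    return {"cells": grid, "cells_per_side": side, "cell_px": image_px // side}

def reading_b(image_px, grid):
    """Hypothetical reading B: the grid value is the number of cells PER SIDE."""
    return {"cells": grid * grid, "cells_per_side": grid, "cell_px": image_px // grid}

# My 100x100 pixel example with a grid value of 4
a = reading_a(100, 4)  # 4 cells in total, each 50x50 pixels
b = reading_b(100, 4)  # 16 cells in total, each 25x25 pixels
print(a)
print(b)
```

If the "4x4" in the example means reading B, then a grid value of 100 on a 100x100 image would give one cell per pixel, which is partly why I'd like to know which it is.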
Should this grid parameter be set to a value that matches the tile size used when exporting the training data? For example, if I exported tiles at 25x25, would I pick a grid size of 16? Or is that irrelevant?
This parameter is a list, so it can take many values. Why would one use multiple grid values?
I have yet to find any advice on the ESRI site about why you would choose one grid size over another or how to choose an appropriate grid value, and the API reference offers next to no explanation of this parameter.
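For what it's worth, here is how I currently picture multiple grid values, as a rough sketch only. It assumes that each grid value g contributes g x g cells and that every cell gets one candidate box per zoom/ratio combination. That is my guess from general SSD descriptions, not something I have confirmed for this API, and count_anchors is a made-up helper.

```python
def count_anchors(grids, zooms, ratios):
    """Count candidate boxes per grid value.

    ASSUMPTION (unverified for arcgis.learn): a grid value g means g x g cells,
    and each cell gets len(zooms) * len(ratios) candidate boxes.
    """
    per_cell = len(zooms) * len(ratios)
    return {g: g * g * per_cell for g in grids}

# The settings I used: one grid, one zoom, one ratio
single = count_anchors([5], [1.0], [[1.0, 1.0]])

# A made-up multi-grid setting, just for comparison
multi = count_anchors(
    [4, 2, 1],
    [0.7, 1.0, 1.3],
    [[1.0, 1.0], [1.0, 0.5], [0.5, 1.0]],
)

print(single, sum(single.values()))
print(multi, sum(multi.values()))
```

If that picture is right, several grid values would give the model coarse and fine cells at the same time, which might matter when objects vary in size, but I would like someone to confirm or correct this.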
I'm being very cheeky here and tagging you in @Tim_McGinnes as you were a super star in my other thread!
But any help from anyone is much appreciated.