Train Deep Learning Model

epiewesner · ‎09-16-2022

Greetings. I am trying to train a deep learning model on my data, but I keep getting this error message. Can someone help me out?

Traceback (most recent call last):
File "c:\program files\arcgis\pro\Resources\ArcToolbox\toolboxes\Image Analyst Tools.tbx\TrainDeepLearningModel.tool\tool.script.execute.py", line 308, in execute
data_bunch = prepare_data(in_folders, working_dir=out_folder, **prepare_data_kwargs)
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\arcgis\learn\_data.py", line 1440, in prepare_data
raise Exception(
Exception: Could not infer dataset type. Please specify a supported dataset type or ensure that the path contains valid esri files

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "c:\program files\arcgis\pro\Resources\ArcToolbox\toolboxes\Image Analyst Tools.tbx\TrainDeepLearningModel.tool\tool.script.execute.py", line 390, in <module>
execute()
File "c:\program files\arcgis\pro\Resources\ArcToolbox\toolboxes\Image Analyst Tools.tbx\TrainDeepLearningModel.tool\tool.script.execute.py", line 384, in execute
del data_bunch
UnboundLocalError: local variable 'data_bunch' referenced before assignment

Failed script (null)...
Failed to execute (TrainDeepLearningModel).

PavanYadav · ‎09-16-2022

Hello @epiewesner

In the above error, I see this: "Could not infer dataset type. Please specify a supported dataset type or ensure that the path contains valid esri files"

It looks like the Train Deep Learning Model tool is finding some issues with the data that you are trying to use. A few questions:

1. Are you using ArcGIS Pro, or another application, such as one of the notebook examples run outside of ArcGIS Pro? If you are using a notebook and it's one of Esri's examples, can you please share the link? In that case you might also want to see this thread: https://community.esri.com/t5/arcgis-pro-questions/model-training-failure-in-arcgis-pro/td-p/1031634

2. If you are using ArcGIS Pro, what version are you currently on, and at what version was the data exported (assuming you used the Export Training Data for Deep Learning tool)?

3. Were any changes made to the .json or any other files in the data?

4. Are the language and regional settings the same now as when the data was exported?

Thanks

Pavan

Pavan Yadav
Product Engineer at Esri
AI for Imagery
Connect with me on LinkedIn!
Contact Esri Support Services

epiewesner · ‎09-16-2022

Hi @PavanYadav, thank you for the help. I got that sorted, but unfortunately I have run into another issue. Hoping you can help me. Below is the new error message I am getting.

Traceback (most recent call last):
File "c:\program files\arcgis\pro\Resources\ArcToolbox\toolboxes\Image Analyst Tools.tbx\TrainDeepLearningModel.tool\tool.script.execute.py", line 390, in <module>
execute()
File "c:\program files\arcgis\pro\Resources\ArcToolbox\toolboxes\Image Analyst Tools.tbx\TrainDeepLearningModel.tool\tool.script.execute.py", line 334, in execute
training_model_object.fit(
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\arcgis\learn\models\_arcgis_model.py", line 902, in fit
lr = self.lr_find(allow_plot=False)
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\arcgis\learn\models\_arcgis_model.py", line 721, in lr_find
raise e
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\arcgis\learn\models\_arcgis_model.py", line 718, in lr_find
self.learn.lr_find()
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\fastai\train.py", line 40, in lr_find
epochs = int(np.ceil(num_it/len(learn.data.train_dl))) * (num_distrib() or 1)
ZeroDivisionError: division by zero
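
The last frame divides `num_it` by the number of batches in the training data loader, so the error means that count was zero. A minimal sketch of that arithmetic follows. It assumes a `num_it` of 100 and a loader that drops incomplete batches; both are assumptions here, not something the traceback shows. `train_batches` and `lr_find_epochs` are hypothetical helper names, not part of arcgis.learn or fastai.

```python
import math


def train_batches(n_train_samples, batch_size, drop_last=True):
    """Number of batches a loader would yield (hypothetical helper)."""
    if drop_last:
        # Incomplete final batch is discarded (assumed behaviour).
        return n_train_samples // batch_size
    return math.ceil(n_train_samples / batch_size)


def lr_find_epochs(num_it, n_batches):
    """Same shape as the expression in the traceback, minus the distributed factor."""
    return int(math.ceil(num_it / n_batches))


print(train_batches(n_train_samples=8, batch_size=2))  # 4 batches
print(lr_find_epochs(num_it=100, n_batches=4))         # 25 epochs

# Fewer training samples than one batch leaves the loader empty.
empty = train_batches(n_train_samples=1, batch_size=2)
try:
    lr_find_epochs(num_it=100, n_batches=empty)
except ZeroDivisionError as err:
    print("ZeroDivisionError:", err)
```

Under these assumptions, anything that leaves the training set smaller than one batch would reproduce the error, which is why the sample count and split settings are worth checking first.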

PavanYadav · ‎09-16-2022

Glad to hear that you fixed the first issue. Can I ask how you fixed it?
For the second issue, you can look at the stats.txt file. Please see if you have any classes that don't have samples.

epiewesner · ‎09-16-2022

@PavanYadav the issue was with my training samples. I was adding only the image_chips, but when I added all the data it started working.

KevinRathgeber1 · ‎11-12-2022

@epiewesner What do you mean by "all the data"? I am having the same division by zero error. I have added the folder that contains the images and labels and the stats.txt, maps.txt, esri_accumulated_stats.json and esri_model_definition.emd files. I can't seem to figure out how to get around that error.

PavanYadav · ‎11-14-2022

@KevinRathgeber1 Please see @epiewesner's response right above your post.
The Export Training Data for Deep Learning tool typically outputs a folder containing something like the following:

In this case, the input to the Train Deep Learning Model tool is: "FormatTest_PNG_KITTI_rectangles" folder.
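
A quick way to check that the folder you pass to the tool is the top-level export folder, and not just the image chips, is to look for the files mentioned in this thread. This is only a sketch: the file list is taken from the posts above, it may not be complete for every metadata format, and `missing_export_files` is a hypothetical helper, not an arcgis function.

```python
from pathlib import Path

# Files named in this thread as part of the exported training data folder.
# Assumption: your export format writes all three; adjust the list if not.
EXPECTED_FILES = (
    "esri_model_definition.emd",
    "esri_accumulated_stats.json",
    "stats.txt",
)


def missing_export_files(folder):
    """Return the expected files that are not directly inside `folder`."""
    root = Path(folder)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Hypothetical path; replace with your own export folder.
    missing = missing_export_files(r"C:\data\FormatTest_PNG_KITTI_rectangles")
    if missing:
        print("Not a complete export folder, missing:", ", ".join(missing))
    else:
        print("All expected files found.")
```

If the list of missing files is not empty, you are probably pointing at a subfolder (for example the image chips) rather than the folder the export tool created.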

KevinRathgeber1 · ‎11-14-2022

My bad, you are correct. I reread the above and realised they were getting the divide-by-zero error after they pointed to the correct folder. That is the problem I am having: I have pointed to the folder as you described above, but when I run the tool I get the exact same errors.

Traceback (most recent call last):
File "c:\program files\arcgis\pro\Resources\ArcToolbox\Toolboxes\Image Analyst Tools.tbx\TrainDeepLearningModel.tool\tool.script.execute.py", line 390, in <module>
execute()
File "c:\program files\arcgis\pro\Resources\ArcToolbox\Toolboxes\Image Analyst Tools.tbx\TrainDeepLearningModel.tool\tool.script.execute.py", line 334, in execute
training_model_object.fit(
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\arcgis\learn\models\_arcgis_model.py", line 902, in fit
lr = self.lr_find(allow_plot=False)
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\arcgis\learn\models\_arcgis_model.py", line 721, in lr_find
raise e
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\arcgis\learn\models\_arcgis_model.py", line 718, in lr_find
self.learn.lr_find()
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\fastai\train.py", line 40, in lr_find
epochs = int(np.ceil(num_it/len(learn.data.train_dl))) * (num_distrib() or 1)
ZeroDivisionError: division by zero

Failed script (null)...
Failed to execute (TrainDeepLearningModel).

PavanYadav · ‎11-15-2022

@KevinRathgeber1 Can you share how many training samples you have and what value you are using for the Validation % parameter?

Here is an example: let's say I have 4 samples and I use 10 in the Validation % parameter. In this case the number of samples for validation would be 0.4, which is not possible, because it has to be a whole number.
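
The same arithmetic as a small sketch, so you can try your own numbers. It only reproduces the calculation in the example; how the tool rounds the count internally is not covered here, and `validation_count` is a hypothetical helper name.

```python
def validation_count(n_samples, validation_pct):
    """Return (raw_count, is_whole) for integer sample counts and percentages."""
    raw = n_samples * validation_pct / 100
    # Integer check avoids floating-point comparison issues.
    is_whole = (n_samples * validation_pct) % 100 == 0
    return raw, is_whole


for n, pct in [(4, 10), (3, 10), (4, 50), (4, 75)]:
    raw, whole = validation_count(n, pct)
    print(f"{n} samples at {pct}% -> {raw} validation samples, whole number: {whole}")
```

With very few samples, only a handful of percentages give a whole number of validation samples, so adding samples or raising the percentage are both reasonable things to try.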

KevinRathgeber1 · ‎11-15-2022

Yeah, that was the problem: I had 3 samples and 10%. I added one more sample for a total of 4, then did two runs set to 50% and 75%, and each of those worked.