Train Deep Learning Model

09-16-2022 03:51 AM
epiewesner
New Contributor II

Greetings. I am trying to train a model on my data using the Train Deep Learning Model tool, but I keep getting this error message. Can someone help me out?

Traceback (most recent call last):
File "c:\program files\arcgis\pro\Resources\ArcToolbox\toolboxes\Image Analyst Tools.tbx\TrainDeepLearningModel.tool\tool.script.execute.py", line 308, in execute
data_bunch = prepare_data(in_folders, working_dir=out_folder, **prepare_data_kwargs)
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\arcgis\learn\_data.py", line 1440, in prepare_data
raise Exception(
Exception: Could not infer dataset type. Please specify a supported dataset type or ensure that the path contains valid esri files

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "c:\program files\arcgis\pro\Resources\ArcToolbox\toolboxes\Image Analyst Tools.tbx\TrainDeepLearningModel.tool\tool.script.execute.py", line 390, in <module>
execute()
File "c:\program files\arcgis\pro\Resources\ArcToolbox\toolboxes\Image Analyst Tools.tbx\TrainDeepLearningModel.tool\tool.script.execute.py", line 384, in execute
del data_bunch
UnboundLocalError: local variable 'data_bunch' referenced before assignment

Failed script (null)...
Failed to execute (TrainDeepLearningModel).

11 Replies
PavanYadav
Esri Contributor

Hello @epiewesner 

In the above error, I see this: "Could not infer dataset type. Please specify a supported dataset type or ensure that the path contains valid esri files"

It looks like the Train Deep Learning Model tool is finding some issues with the data that you are trying to use. A few questions:

1. Are you using ArcGIS Pro, or another application, such as one of the notebook examples run outside of ArcGIS Pro? If you are using a notebook and it is one of Esri's examples, can you please share the link? In that case you might also want to see this thread: https://community.esri.com/t5/arcgis-pro-questions/model-training-failure-in-arcgis-pro/td-p/1031634

2. If you are using ArcGIS Pro, what version are you currently on? And in what version was the data exported (assuming you used the Export Training Data For Deep Learning tool)?

3. Were any changes made to the .json or any other files in the exported data?

4. Are the language and regional settings the same now as when the data was exported?

 

Thanks

Pavan

epiewesner
New Contributor II

Hi @PavanYadav, thank you for the help. I got that sorted, but unfortunately I have run into another issue. Hoping you can help me. Below is the new error message I am getting.

 

Traceback (most recent call last):
File "c:\program files\arcgis\pro\Resources\ArcToolbox\toolboxes\Image Analyst Tools.tbx\TrainDeepLearningModel.tool\tool.script.execute.py", line 390, in <module>
execute()
File "c:\program files\arcgis\pro\Resources\ArcToolbox\toolboxes\Image Analyst Tools.tbx\TrainDeepLearningModel.tool\tool.script.execute.py", line 334, in execute
training_model_object.fit(
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\arcgis\learn\models\_arcgis_model.py", line 902, in fit
lr = self.lr_find(allow_plot=False)
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\arcgis\learn\models\_arcgis_model.py", line 721, in lr_find
raise e
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\arcgis\learn\models\_arcgis_model.py", line 718, in lr_find
self.learn.lr_find()
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\fastai\train.py", line 40, in lr_find
epochs = int(np.ceil(num_it/len(learn.data.train_dl))) * (num_distrib() or 1)
ZeroDivisionError: division by zero

PavanYadav
Esri Contributor

Glad to hear that you fixed the first issue. Can I ask how you fixed it?
For the second issue, you can look at the stats.txt file. Please check whether you have any classes that don't have samples.
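
For context, the last line of the traceback divides by `len(learn.data.train_dl)`, so the error appears whenever the training loader ends up with zero batches. Below is a minimal sketch of that expression. The function name is hypothetical, and plain integers stand in for the fastai objects; it is not the library's code.

```python
import math


def epochs_for_lr_find(num_it: int, num_train_batches: int) -> int:
    """Mirror the shape of the failing expression: ceil(num_it / len(train_dl)).

    num_train_batches stands in for len(learn.data.train_dl) (an assumption
    made for illustration only).
    """
    return int(math.ceil(num_it / num_train_batches))


# A loader with at least one batch works as expected.
print(epochs_for_lr_find(100, 4))

# A loader with zero batches reproduces the error from the traceback.
try:
    epochs_for_lr_find(100, 0)
except ZeroDivisionError as err:
    print("ZeroDivisionError:", err)
```

So anything that leaves the training set empty, or too small to form a batch, is worth ruling out first.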

epiewesner
New Contributor II

@PavanYadav the issue was with my training samples. I was adding only the image_chips. But when I added all the data it started working.

 

KevinRathgeber1
New Contributor III

@epiewesner What do you mean by "all the data"? I am having the same division by zero error. I have added the folder that contains the images and labels and the stats.txt, maps.txt, esri_accumulated_stats.json and esri_model_definition.emd files. I can't seem to figure out how to get around that error.

 

PavanYadav
Esri Contributor

@KevinRathgeber1, please see @epiewesner's response right above your post.
The Export Training Data For Deep Learning tool typically outputs a folder containing something like the following:

[Screenshot: contents of the exported training data folder]

In this case, the input to the Train Deep Learning Model tool is the "FormatTest_PNG_KITTI_rectangles" folder.
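
If it helps, a quick check along these lines can confirm that the folder you pass to the tool is the export root rather than one of its subfolders. This is only a sketch: the item names are taken from this thread and may differ with the metadata format chosen at export, and `missing_export_items` is a hypothetical helper, not part of ArcGIS.

```python
from pathlib import Path

# Names mentioned in this thread. Treat the list as an assumption: the exact
# contents depend on the metadata format chosen when exporting.
EXPECTED_ITEMS = [
    "images",
    "labels",
    "stats.txt",
    "esri_accumulated_stats.json",
    "esri_model_definition.emd",
]


def missing_export_items(export_folder):
    """Return the expected names that are not present in export_folder."""
    folder = Path(export_folder)
    return [name for name in EXPECTED_ITEMS if not (folder / name).exists()]
```

Calling `missing_export_items("FormatTest_PNG_KITTI_rectangles")` returns an empty list when every listed item is present. If it reports most of the names as missing, the tool is probably being pointed at a subfolder instead of the export root.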

KevinRathgeber1
New Contributor III

My bad, you are correct. I reread the above and realised they were getting the divide by zero error after they pointed to the correct folder. That is the problem I am having. I have pointed to the folder as you described above, but when I run it I get the exact same error.

Traceback (most recent call last):
File "c:\program files\arcgis\pro\Resources\ArcToolbox\Toolboxes\Image Analyst Tools.tbx\TrainDeepLearningModel.tool\tool.script.execute.py", line 390, in <module>
execute()
File "c:\program files\arcgis\pro\Resources\ArcToolbox\Toolboxes\Image Analyst Tools.tbx\TrainDeepLearningModel.tool\tool.script.execute.py", line 334, in execute
training_model_object.fit(
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\arcgis\learn\models\_arcgis_model.py", line 902, in fit
lr = self.lr_find(allow_plot=False)
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\arcgis\learn\models\_arcgis_model.py", line 721, in lr_find
raise e
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\arcgis\learn\models\_arcgis_model.py", line 718, in lr_find
self.learn.lr_find()
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\fastai\train.py", line 40, in lr_find
epochs = int(np.ceil(num_it/len(learn.data.train_dl))) * (num_distrib() or 1)
ZeroDivisionError: division by zero

Failed script (null)...
Failed to execute (TrainDeepLearningModel).



PavanYadav
Esri Contributor

@KevinRathgeber1 Can you share how many training samples you have and what value you are using for the Validation % parameter?

Here is an example: let's say I have 4 samples and I use 10 in the Validation % parameter. In this case the number of samples for validation would be 0.4, which is not possible; it should be a whole number.
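
To make the arithmetic concrete, here is a small sketch. It only multiplies the sample count by the percentage. How the tool actually rounds the result is not shown in this thread, and `validation_share` is a made-up name used for illustration.

```python
def validation_share(num_samples: int, validation_pct: float) -> float:
    """Raw number of samples that the validation percentage asks for."""
    return num_samples * validation_pct / 100


# Try a few sample counts and percentages and flag fractional results.
for n, pct in [(4, 10), (3, 10), (4, 50), (4, 75)]:
    share = validation_share(n, pct)
    print(f"{n} samples at {pct}%: {share} validation samples "
          f"(whole number: {share.is_integer()})")
```

With very few samples, small percentages give a fraction of a sample, so either add more samples or raise the percentage until the split yields whole samples on both sides.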

KevinRathgeber1
New Contributor III

Yeah, that was the problem. I had 3 samples and 10%. I added one more sample for a total of 4, then did two runs set to 50% and 75%, and each of those worked.
