Error when training deep learning model

11-25-2023 01:55 AM
IvanVanchugov
New Contributor II

I'm trying to train my model and I'm getting the following error:

Traceback (most recent call last):
  File "c:\program files\arcgis\pro\Resources\ArcToolbox\toolboxes\Image Analyst Tools.tbx\TrainDeepLearningModel.tool\tool.script.execute.py", line 390, in <module>
    execute()
  File "c:\program files\arcgis\pro\Resources\ArcToolbox\toolboxes\Image Analyst Tools.tbx\TrainDeepLearningModel.tool\tool.script.execute.py", line 334, in execute
    training_model_object.fit(
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\arcgis\learn\models\_arcgis_model.py", line 902, in fit
    lr = self.lr_find(allow_plot=False)
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\arcgis\learn\models\_arcgis_model.py", line 721, in lr_find
    raise e
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\arcgis\learn\models\_arcgis_model.py", line 718, in lr_find
    self.learn.lr_find()
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\fastai\train.py", line 41, in lr_find
    learn.fit(epochs, start_lr, callbacks=[cb], wd=wd)
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\fastai\basic_train.py", line 200, in fit
    fit(epochs, self, metrics=self.metrics, callbacks=self.callbacks+callbacks)
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\fastai\basic_train.py", line 99, in fit
    for xb,yb in progress_bar(learn.data.train_dl, parent=pbar):
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\fastprogress\fastprogress.py", line 47, in __iter__
    raise e
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\fastprogress\fastprogress.py", line 41, in __iter__
    for i,o in enumerate(self.gen):
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\fastai\basic_data.py", line 75, in __iter__
    for b in self.dl: yield self.proc_batch(b)
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\torch\utils\data\dataloader.py", line 517, in __next__
    data = self._next_data()
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\torch\utils\data\dataloader.py", line 557, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\fastai\data_block.py", line 657, in __getitem__
    if self.item is None: x,y = self.x[idxs],self.y[idxs]
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\fastai\data_block.py", line 120, in __getitem__
    if isinstance(idxs, Integral): return self.get(idxs)
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\fastai\vision\data.py", line 271, in get
    res = self.open(fn)
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\arcgis\learn\models\_maskrcnn_utils.py", line 144, in open
    self.index_dir[self.inverse_class_mapping[fn[k].parent.name]]
KeyError: 'Class_0'
Failed script (null)...
Failed to execute (TrainDeepLearningModel).
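
Reading the last frame, the loader looks up each label file's parent folder name in `inverse_class_mapping`, so `KeyError: 'Class_0'` suggests that a folder named `Class_0` exists in the exported training data without a matching entry in the class mapping. A minimal sketch for spotting such folders, assuming the export keeps one subfolder per class under a `labels` directory (the function name, the path, and the class names below are hypothetical and would come from your own export):

```python
from pathlib import Path


def find_unmapped_class_folders(labels_dir, known_classes):
    """Return names of subfolders in labels_dir that are not in known_classes.

    Assumption: the export stores masks as labels/<class name>/<chip file>.
    """
    known = set(known_classes)
    return sorted(
        p.name for p in Path(labels_dir).iterdir()
        if p.is_dir() and p.name not in known
    )


if __name__ == "__main__":
    # Hypothetical path and class names; replace with your own.
    export_labels = Path(r"C:\data\training_export\labels")
    if export_labels.is_dir():
        print(find_unmapped_class_folders(export_labels, {"building", "road"}))
```

Any folder name this returns is a candidate for the key that the training tool could not find.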

 

 

Here are the parameters I'm using to train the model on a sample area:

1_.png

2_.png

  

I have spent quite a lot of time on this; maybe you have ideas on how to fix it.

Perhaps someone has already encountered something similar.

Accepted Solutions
PriyankaTuteja
Esri Contributor

@IvanVanchugov Please provide the following details for our team to better understand the issue:

1. ArcGIS Pro Version

2. A small sample of your training data for reproducing the issue

3. What is class_0 in your data? Is it the name of a class?


7 Replies
PavanYadav
Esri Contributor

@IvanVanchugov 
I understand you're using the MaskRCNN model type. What is the metadata format of your training data?

This may help https://pro.arcgis.com/en/pro-app/latest/tool-reference/image-analyst/overview-of-the-deep-learning-... 

Thanks

Pavan

Pavan Yadav
Product Engineer at Esri
AI for Imagery
Connect with me on LinkedIn!
Contact Esri Support Services
IvanVanchugov
New Contributor II

Yes, I'm using the MaskRCNN model type, and I exported all my data in the RCNN Masks format.

IvanVanchugov_0-1701160641540.png

I tried to follow the instructions from this video:

Deep Learning Object Detection Workflow in ArcGIS Pro - YouTube

However, it did not cover possible errors or how to solve them.

PriyankaTuteja
Esri Contributor

@IvanVanchugov Please provide the following details for our team to better understand the issue:

1. ArcGIS Pro Version

2. A small sample of your training data for reproducing the issue

3. What is class_0 in your data? Is it the name of a class?

IvanVanchugov
New Contributor II

1) ArcGIS Pro version 3.0.2;

2) We have more than 70 classes of data (the video I mentioned above does not make clear whether this is enough);

3) Yes, it was the name of a class, but I deleted it and now I have this:

IvanVanchugov_1-1701160943889.png

 

I also resolved the problems with installing CUDA and related components, to be sure that I would be able to run this training tool:

IvanVanchugov_2-1701161101894.png
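
For readers who want to confirm the same thing from Python rather than from a screenshot, here is a small sketch to run inside the `arcgispro-py3` environment. It only uses `torch.cuda.is_available()` and `torch.cuda.get_device_name()`; the helper name `cuda_status` is made up for illustration:

```python
import importlib.util


def cuda_status():
    """Describe whether PyTorch can see a CUDA device in this environment."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment"
    import torch
    if not torch.cuda.is_available():
        return "PyTorch is installed, but no CUDA device is visible"
    return f"CUDA device available: {torch.cuda.get_device_name(0)}"


print(cuda_status())
```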

 

PriyankaTuteja
Esri Contributor

Where did you delete class_0 from so that the tool ran successfully?

IvanVanchugov
New Contributor II

Yes, thank you for your help!

It ran successfully!

But after 3 days, when it finally completed, I had only 6 detected objects...

IvanVanchugov_0-1701437253724.png

Maybe I don't have enough data to train the model (there were about 80 of them). My guess is that about 1000 classes are needed for training.
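
One way to see how much training data each class actually has is to count the label files per class in the export. This is only a rough sketch, assuming the export keeps one subfolder per class under a `labels` directory and that the file count is a fair proxy for the number of labeled chips; the function name and the path are hypothetical:

```python
from pathlib import Path


def count_labels_per_class(labels_dir):
    """Return {class folder name: number of files directly inside it}."""
    counts = {}
    for class_dir in Path(labels_dir).iterdir():
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir() if f.is_file()
            )
    return counts


if __name__ == "__main__":
    # Hypothetical path; replace with your own export folder.
    export_labels = Path(r"C:\data\training_export\labels")
    if export_labels.is_dir():
        for name, n in sorted(count_labels_per_class(export_labels).items()):
            print(f"{name}: {n}")
```

Classes with only a handful of files stand out immediately in the printed list.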

This is my model_metrics file:

IvanVanchugov_1-1701437508842.png

IvanVanchugov_2-1701437538902.png

IvanVanchugov_3-1701437550068.png

 

 

 

PavanYadav
Esri Contributor

@IvanVanchugov I wrote the following two blogs on training data creation; you might find them helpful: 

Tips for labeling images for object detection models

Tips for training data preparation for object detection models

I understand you're using MaskRCNN, but some of the tips might still be useful.

Thanks

Pavan 
