Hi,
I am getting the following error when running the Train Deep Learning Model tool:
ExecuteError: Traceback (most recent call last):
File "c:\program files\arcgis\pro\Resources\ArcToolbox\toolboxes\Image Analyst Tools.tbx\TrainDeepLearningModel.tool\tool.script.execute.py", line 232, in <module>
execute()
File "c:\program files\arcgis\pro\Resources\ArcToolbox\toolboxes\Image Analyst Tools.tbx\TrainDeepLearningModel.tool\tool.script.execute.py", line 196, in execute
training_model_object = training_model.from_model(pretrained_model_path, data_bunch)
File "C:\Users\S0003051\AppData\Local\ESRI\conda\envs\dl-python\lib\site-packages\arcgis\learn\models\_classifier.py", line 360, in from_model
return cls(data, **model_params, pretrained_path=str(model_file))
File "C:\Users\S0003051\AppData\Local\ESRI\conda\envs\dl-python\lib\site-packages\arcgis\learn\models\_classifier.py", line 164, in __init__
self.load(pretrained_path)
File "C:\Users\S0003051\AppData\Local\ESRI\conda\envs\dl-python\lib\site-packages\arcgis\learn\models\_arcgis_model.py", line 1300, in load
raise e
File "C:\Users\S0003051\AppData\Local\ESRI\conda\envs\dl-python\lib\site-packages\arcgis\learn\models\_arcgis_model.py", line 1298, in load
self.learn.load(name, purge=False)
File "C:\Users\S0003051\AppData\Local\ESRI\conda\envs\dl-python\lib\site-packages\fastai\basic_train.py", line 281, in load
get_model(self.model).load_state_dict(state, strict=strict)
File "C:\Users\S0003051\AppData\Local\ESRI\conda\envs\dl-python\lib\site-packages\torch\nn\modules\module.py", line 830, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Sequential:
size mismatch for 1.8.weight: copying a param with shape torch.Size([3, 512]) from checkpoint, the shape in current model is torch.Size([2, 512]).
size mismatch for 1.8.bias: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([2]).
Failed to execute (TrainDeepLearningModel).
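Reading the two "size mismatch" lines: the last layer stored in the checkpoint has 3 outputs, while the model built for the current run has 2. That would be consistent with the previous model having been trained on 3 classes and the current chips containing only 2, although this is an inference from the shapes and not something the message states. A minimal sketch that pulls the shapes out of the message text, assuming the PyTorch message format shown above (`parse_size_mismatches` is a hypothetical helper name):

```python
import re

# Matches lines such as:
#   size mismatch for 1.8.weight: copying a param with shape torch.Size([3, 512])
#   from checkpoint, the shape in current model is torch.Size([2, 512]).
_MISMATCH = re.compile(
    r"size mismatch for (?P<name>\S+): copying a param with shape "
    r"torch\.Size\(\[(?P<ckpt>[\d, ]*)\]\) from checkpoint, "
    r"the shape in current model is torch\.Size\(\[(?P<cur>[\d, ]*)\]\)"
)


def _shape(text):
    """Turn '3, 512' into (3, 512)."""
    return tuple(int(part) for part in text.split(",") if part.strip())


def parse_size_mismatches(message):
    """Return {parameter_name: (checkpoint_shape, current_shape)}."""
    return {
        m.group("name"): (_shape(m.group("ckpt")), _shape(m.group("cur")))
        for m in _MISMATCH.finditer(message)
    }


# The two mismatch lines copied from the traceback above.
error_text = """\
size mismatch for 1.8.weight: copying a param with shape torch.Size([3, 512]) from checkpoint, the shape in current model is torch.Size([2, 512]).
size mismatch for 1.8.bias: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([2]).
"""

for name, (ckpt, cur) in parse_size_mismatches(error_text).items():
    print(f"{name}: checkpoint {ckpt} vs current model {cur}")
```

The first element of each shape is the number of outputs of the final layer, which for a classifier normally equals the number of classes.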
Process:
I have 13 TIF images and have collected image labels for them into feature layers. Each layer is exported as image chips (TIF) to its own directory, named after the source imagery, and the images are trained one at a time: each run uses the *.dlpk file produced by the previous run as its pre-trained model. The following code shows the step that causes the error:
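import arcpy

# Date stamps of the previous (already trained) and the current imagery
pretrained = "20200530"
current = "20200627"
chips = r"D:\Data\DeepLearning\Training\Proj_{}_TIF".format(current)
# Output folder for the model trained in this run
model = r"D:\Data\DeepLearning\Model\Proj_{}".format(current)
# Model package from the previous run, passed in as the pre-trained model
pre_model = r"D:\Data\DeepLearning\Models\Proj_{0}\Proj_{0}.dlpk".format(pretrained)
arcpy.ia.TrainDeepLearningModel(chips, model, 20, "FEATURE_CLASSIFIER", 2, "chip_size 256", None, "RESNET34", pre_model, 10, "STOP_TRAINING", "UNFREEZE_MODEL")
# The next run starts from the model trained here
pretrained = current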
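The chaining described above (each run starts from the previous run's model) can be sketched as a small generator. This is only an illustration of the loop structure: `chain_runs` is a hypothetical helper, it does not call arcpy, and it uses only the two date stamps that appear in the snippet above.

```python
def chain_runs(dates):
    """Yield (pretrained, current) pairs; the first run has no pretrained model."""
    previous = None
    for current in dates:
        yield previous, current
        previous = current  # same role as `pretrained = current` above


for pretrained, current in chain_runs(["20200530", "20200627"]):
    chips = r"D:\Data\DeepLearning\Training\Proj_{}_TIF".format(current)
    pre_model = (
        None
        if pretrained is None
        else r"D:\Data\DeepLearning\Models\Proj_{0}\Proj_{0}.dlpk".format(pretrained)
    )
    # In the real workflow, arcpy.ia.TrainDeepLearningModel would be called here.
    print(current, chips, pre_model)
```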
This trains a feature classifier on the image chips so that it can classify the labeled classes. The initial training (without a pre-trained model) always works, but the second training run, which uses the pre-trained model, fails. All training image chips are derived from feature layers that share the same label schema, taken from the same *.ecs file.
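One way to check whether the previous model and the current chips really contain the same classes is to compare the class lists in their model definition files. The sketch below rests on several assumptions that are not confirmed here: that a *.dlpk can be opened as a zip archive, that it contains an .emd file, and that the .emd is JSON with a "Classes" list whose entries have "Name" and/or "Value" keys. The helper names and the class names in the demo are made up for illustration.

```python
import io
import json
import zipfile


def class_names(emd_text):
    """Return class names from the JSON text of an .emd file.

    Assumption: the .emd holds a "Classes" list of dicts with "Name"/"Value".
    """
    emd = json.loads(emd_text)
    return [str(c.get("Name", c.get("Value"))) for c in emd.get("Classes", [])]


def classes_in_dlpk(dlpk):
    """Return class names from the first .emd found inside a .dlpk.

    Assumption: the .dlpk is a zip archive.
    `dlpk` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(dlpk) as archive:
        emd_members = [n for n in archive.namelist() if n.lower().endswith(".emd")]
        if not emd_members:
            raise ValueError("no .emd file found in the package")
        return class_names(archive.read(emd_members[0]).decode("utf-8"))


# Self-contained demo with made-up class names (not taken from the real data):
# a fake "previous" package with 3 classes and a "current" definition with 2.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "demo.emd",
        json.dumps({"Classes": [{"Name": "a"}, {"Name": "b"}, {"Name": "c"}]}),
    )
buffer.seek(0)

previous = classes_in_dlpk(buffer)
current = class_names(json.dumps({"Classes": [{"Name": "a"}, {"Name": "b"}]}))
missing = sorted(set(previous) - set(current))
print(len(previous), len(current), missing)
```

With real data, `classes_in_dlpk` would be given the `pre_model` path, and `class_names` the text of the model definition file written to the chips folder (assumed here to be named esri_model_definition.emd). If the counts differ, the current chips likely lack samples for at least one class from the schema.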
I would appreciate any workaround or fix for this issue.
Thanks.
Shingo Ikeda
Geospatial Data Scientist/Developer - Geographical Information Platform
Global Power Generation - Digital Satellite USA and Canada