Epochs in Train Deep Learning Model are not saved when computer errors occur

10-13-2021 07:00 AM

I have ArcGIS Pro version 2.8.2. My problem is that I'm running my model training for over 200 epochs, which is the minimum I need to achieve convergence. However, when there is a glitch in the computer environment (e.g., power surge, VPN disconnected, etc.), ArcGIS Pro cancels the whole run and discards ALL completed epochs, with no way to save them so I can restart where I left off.

Why isn't there such a feature? I've already had to restart from scratch twice. It takes about 4 days to run 100 epochs. Is it too much to ask Esri for this feature: restarting from the epoch where I left off?
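
To make the request concrete, here is a minimal sketch in plain Python of the checkpoint-and-resume behaviour I mean. It is not ArcGIS Pro or arcgis.learn code: `train_one_epoch`, the checkpoint path, and the JSON state layout are all hypothetical placeholders I made up for illustration, and a real model would save its weights in whatever format the framework uses.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write the state atomically so a crash mid-write cannot
    corrupt the last good checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # os.replace swaps the new file in as a single step.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"epoch": 0, "weights": None}


def train(total_epochs, path, train_one_epoch):
    """Run training, saving after every epoch and resuming from the
    last completed epoch if a checkpoint is already on disk.

    train_one_epoch is a hypothetical callable: weights in, weights out.
    """
    state = load_checkpoint(path)
    for epoch in range(state["epoch"], total_epochs):
        state["weights"] = train_one_epoch(state["weights"])
        state["epoch"] = epoch + 1
        save_checkpoint(path, state)
    return state
```

With something like this, a power surge during epoch 150 would cost me at most one epoch: calling `train` again with the same path would pick up at epoch 150 instead of epoch 0.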

Appreciate any help.
