Train deep learning model won't quit

MarkSchweder · ‎10-12-2025

My Train Deep Learning Model run still hasn't finished after two and a half days, which is a day longer than it estimated the task would take. Is it possible to stop the program and recover what it has done so far? Should I wait for it to finish? Should I cancel and start again? Something else?

Thanks, Mark

DanPatterson · ‎10-12-2025

Did you try it with a smaller dataset to confirm the process?

Details on the input data type, location, extents and size would be useful as would anything about the destination parameters.

... sort of retired...

MarkSchweder · ‎10-13-2025

Yes. It's big data, and that's the point. I'm trying to find out how big the data can be. It's been cut in half once, and it looks like it will need to be cut again.

In any event, is it possible to stop the program and recover what it has done so far?

Thanks

RTPL_AU · ‎10-13-2025

@MarkSchweder
Is it using the correct GPU?
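
One way to check this while the tool is running is to ask the driver which GPU is busy. The sketch below assumes an NVIDIA card with the `nvidia-smi` utility on the PATH; the function names are made up for illustration and are not part of ArcGIS Pro. If the GPU you expect shows near-zero utilization during training, the tool is probably running somewhere else.

```python
import shutil
import subprocess

# Fields requested from nvidia-smi, in this order.
QUERY = "index,name,utilization.gpu,memory.used,memory.total"


def to_int(text):
    """Convert a numeric field to int; return None for values like '[N/A]'."""
    try:
        return int(text)
    except ValueError:
        return None


def parse_gpu_rows(csv_text):
    """Turn nvidia-smi CSV output (noheader, nounits) into a list of dicts."""
    rows = []
    for line in csv_text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 5:
            continue  # skip anything that is not a complete row
        index, name, util, used, total = parts
        rows.append({
            "index": to_int(index),
            "name": name,
            "util_percent": to_int(util),
            "mem_used_mib": to_int(used),
            "mem_total_mib": to_int(total),
        })
    return rows


def query_gpus():
    """Return one dict per GPU, or an empty list if nvidia-smi is unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []
    result = subprocess.run(
        [exe, f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    return parse_gpu_rows(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    for gpu in query_gpus():
        print(gpu)
```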

MarkSchweder · ‎10-13-2025

yes

Robert_LeClair · ‎10-14-2025

On the GP tool's Environments tab, choose GPU in the Processor Type dropdown and set the GPU ID to 0.

I'm running ArcGIS Pro 3.5.4. On the Parameters tab, under the Data Preparation dropdown for the Data Augmentation parameter, change the batch size to 16. Are you using an earlier version of ArcGIS Pro, or is there a parameter for batch size?

Do these changes improve the performance?
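
If you run the tool from Python rather than the GP pane, the same two environment settings can be applied with `arcpy.env`. This is only a minimal sketch: `use_gpu` is a hypothetical helper name, and it assumes the script runs in the ArcGIS Pro Python environment (it returns False anywhere `arcpy` can't be imported). The batch size is a tool parameter rather than an environment, so it isn't set here.

```python
def use_gpu(gpu_id=0):
    """Set the Processor Type and GPU ID environments.

    Returns True if the settings were applied, False if arcpy is not
    available in the current Python environment.
    """
    try:
        import arcpy  # only present inside an ArcGIS Pro Python environment
    except ImportError:
        return False

    # Same settings as the Environments tab: Processor Type = GPU, GPU ID = 0
    arcpy.env.processorType = "GPU"
    arcpy.env.gpuId = gpu_id
    return True


if __name__ == "__main__":
    print("GPU environment applied:", use_gpu(0))
```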

MarkSchweder · ‎10-15-2025

I've had to reduce not only the complexity but also the quantity of data used to train the model to keep things moving. I'm going to try combining smaller models to regain the complexity and quantity.

RichardDaniels · ‎10-15-2025

If you get to the point where it starts writing data, it can take a long time (days) if you are reading and writing on the same disk. Make sure you read the inputs from one disk and write the results to a different one so the reads and writes don't compete for the same drive.
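
A quick way to catch this before starting a long run is to compare the drive part of the input and output paths. The sketch below is an assumption-laden check, not a guarantee: the paths are hypothetical, `same_drive` is a made-up helper, and a matching drive letter only suggests the same disk (two letters can be partitions of one physical disk, so confirm in Disk Management if it matters).

```python
import ntpath  # Windows path rules, regardless of where the script runs


def same_drive(path_a, path_b):
    """Return True when both paths share a drive letter or UNC share.

    Relative paths have no drive component, so they compare as False.
    """
    drive_a = ntpath.splitdrive(path_a)[0].lower()
    drive_b = ntpath.splitdrive(path_b)[0].lower()
    return drive_a != "" and drive_a == drive_b


if __name__ == "__main__":
    # Hypothetical folders for the training chips and the model output
    training_data = r"D:\training\chips"
    model_output = r"D:\training\model"
    if same_drive(training_data, model_output):
        print("Input and output are on the same drive; consider moving one.")
    else:
        print("Input and output are on different drives.")
```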