I have a notebook script which runs fine manually, and runs okay on a scheduled Task most of the time. But a few times now the Task has gone off at its scheduled time and then got stuck 'Executing' perpetually. This causes each subsequent scheduled execution to get 'Skipped' (see attached screenshot). It never seems to resolve on its own; I've let it go on for a few days to see what would happen, and it stayed stuck.
The only way I've found of fixing it is to delete the Task from the Notebook entirely. (I also clear and restart the kernel, but I'm not sure whether that's necessary to fix the issue.)
I'm not really sure how to troubleshoot this further. I don't get an actual error code, since it is still 'Executing', so I have no idea where it's getting hung up. Again, it runs fine manually, and I've seen it run up to a few dozen times before hitting this 'Executing' snag, so the problem seems intermittent. I've started tracking the days and times it happens, but don't see a clear pattern yet, other than it might have something to do with the weekends.
Any ideas or tips on how to go about troubleshooting this, since I can't replicate the error when running it myself?
It may also be a beneficial enhancement to have something that either lets the user see where the script is at while a Task is running (so you could see where it is hung up, at least) or have an option to interrupt and restart the Task without having to delete and recreate it from scratch.
I've been having the same issue over the last few months. I have two Notebooks scheduled, and every now and then they just get stuck.
At the moment one of them has been stuck for nearly 10 days. It would be nice if we could manually stop it or set a timeout.
@Anonymous User did you manage to find anything from your investigations?
Thanks
Stu
Hi,
I've found a few things that can help.
1) If you notice it get stuck, you can delete the schedule and recreate it; the Task will then start running again on the new schedule.
2) Be very careful with the code you write, and make sure everything is in a try/except, particularly when reading from or writing to ArcGIS Online feature layers. I've even put those reads/writes inside a try/except wrapped in a while loop so they can retry a few times if they fail, as I have the feeling that ArcGIS Online feature layers sometimes return a 50x error and that this somehow crashes the kernel if it isn't caught by your code.
3) Write log output for everything you do to a file. (This is still on my to-do list. I'm currently writing log info to an ArcGIS Online hosted feature table, but that likely compounds the issue if it is ArcGIS Online feature layer related.) Then we can perhaps work out what's going on.
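For anyone wanting a starting point for 2) and 3), here is a minimal sketch of a retry wrapper that also logs each attempt to a file. It uses only the Python standard library. The names `make_logger` and `with_retries` are made up for illustration, and the function you pass in stands for whatever feature layer read/write your own notebook does. This is not an Esri-provided pattern, just one way it could look.

```python
import logging
import time


def make_logger(log_path):
    """Return a logger that appends timestamped lines to log_path."""
    logger = logging.getLogger("notebook_task")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid duplicate handlers when cells are re-run
        handler = logging.FileHandler(log_path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


def with_retries(func, logger, attempts=3, delay_seconds=5):
    """Call func(); on any exception, log it and retry up to `attempts` times.

    `func` is a placeholder for your own read/write step (hypothetical).
    The last failure is re-raised so the Task fails instead of continuing.
    """
    name = getattr(func, "__name__", repr(func))
    for attempt in range(1, attempts + 1):
        try:
            logger.info("attempt %d of %d: %s", attempt, attempts, name)
            result = func()
            logger.info("succeeded: %s", name)
            return result
        except Exception:
            # logger.exception also writes the traceback to the log file
            logger.exception("attempt %d failed: %s", attempt, name)
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)
```

You would call it as something like `with_retries(my_query_step, make_logger(path_to_log_file))`, where `my_query_step` and the log path are yours to choose (a file in the notebook workspace is the obvious place, but check what your environment keeps between runs). Note that this only helps when the call raises an exception. If a request hangs without ever returning, the loop will sit there too, although the last "attempt" line in the log would at least show which step it was on.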
Thanks @RobertAkroyd1. I recreated the task and it worked twice and failed once. I also noticed that my other Notebook, which is set up to run every 15 minutes, failed more times than it ran yesterday; in fact, it hasn't even tried to run since yesterday afternoon.
I guess there are some serious issues going on at the moment. Maybe it's all linked to the other issues I have been having with not being able to export surveys from Survey123. It all seems to have got worse in the last two weeks since the AWS issue.
Thanks again
Stu