How to gracefully exit or end a scheduled notebook run early?

JustinMillsFWS · 11-27-2023

Is there a graceful way to exit a scheduled notebook run early (i.e. without raising an exception that looks like a failure)? Both raise SystemExit() and sys.exit() raise an exception, which is detected as a failed notebook run and causes the scheduled task to be disabled. We've also tried a few other methods (like https://stackoverflow.com/a/56953105) with the same result.

What we want to do is check a feature layer for new records, then skip the rest of the notebook (the next dozen cells or so) if there are no new records. There's no point in re-processing this data if we've already ingested/transformed/exported it as that will waste credits and API calls.

We could wrap every single cell in an if statement that skips it if there's no new data, but that seems really clunky.

JustinMillsFWS · 11-28-2023

Here's how I solved it:

Instead of trying to break in the middle of the notebook, I put the check for updated data in a separate notebook and scheduled that notebook to run every hour. If there is updated data, this "controller" notebook uses arcgis.notebook.execute_notebook() to run the notebook that processes the data.
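A minimal sketch of what such a controller notebook could look like. The item IDs and the where clause are hypothetical placeholders, and the "processed" field is only an assumption about how new records might be flagged; the helper names are illustrative and not part of the ArcGIS API. The point is that skipping the target is an ordinary return, so no exception is raised and the scheduled run ends as a success.

```python
from typing import Callable


def run_if_new(count_new: Callable[[], int], run_target: Callable[[], object]) -> bool:
    """Run the target only when the check reports new records.

    Returns True if the target was started, False if it was skipped.
    Skipping is a normal return, so nothing is raised when there is no new data.
    """
    if count_new() > 0:
        run_target()
        return True
    return False


# Hypothetical placeholders: substitute your own item IDs and filter.
LAYER_ITEM_ID = "<feature-layer-item-id>"
TARGET_NOTEBOOK_ID = "<processing-notebook-item-id>"
NEW_RECORDS_WHERE = "processed = 0"  # assumed field that marks unprocessed rows


def count_new_records() -> int:
    # Imported lazily; the arcgis package is only needed when this runs.
    from arcgis.gis import GIS

    gis = GIS("home")
    layer = gis.content.get(LAYER_ITEM_ID).layers[0]
    # Ask the service for a count only, rather than fetching the records.
    return layer.query(where=NEW_RECORDS_WHERE, return_count_only=True)


def run_processing_notebook() -> None:
    from arcgis.gis import GIS
    from arcgis.notebook import execute_notebook

    gis = GIS("home")
    # Runs the target notebook item with the function's default options.
    execute_notebook(gis.content.get(TARGET_NOTEBOOK_ID))
```

The last cell of the controller notebook would then just call run_if_new(count_new_records, run_processing_notebook). Keeping the decision in its own small function also makes it easy to reuse the same check for several target notebooks.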

One thing I really like about this approach is that I can use a standard Python kernel to do the checking, then spin up a more expensive advanced kernel only if it's necessary. It also means I have fewer notebooks to schedule because I can use one notebook to call multiple other notebooks.

In version 2.0 I'm going to use some parameter passing to simplify maintenance, break up my target notebooks to be a little more granular, and add some keyword/tag searching so I can have the controller script find target feature layers automatically instead of hard-coding item IDs. There are also some cool things that seem possible with job management, but the Python API docs for working with them on AGOL are inscrutable and there are no examples to start from.
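As a rough illustration of the tag-searching idea, here is one way the lookup might be sketched. The function name and the example tag are hypothetical; it assumes gis is an arcgis.gis.GIS connection and that the target layers have been tagged by hand beforehand.

```python
from typing import List


def find_target_layer_ids(gis, tag: str, max_items: int = 100) -> List[str]:
    """Return the item IDs of feature layers that carry the given tag.

    `gis` is expected to be an arcgis.gis.GIS connection, e.g. GIS("home").
    """
    items = gis.content.search(
        query=f'tags:"{tag}"',
        item_type="Feature Layer",
        max_items=max_items,
    )
    # Search matching can be loose, so keep only items whose tag list
    # actually contains the exact tag.
    return [item.id for item in items if tag in (item.tags or [])]
```

In the controller this could replace the hard-coded IDs with something like find_target_layer_ids(GIS("home"), "auto-process"), where "auto-process" is a made-up tag name.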
