ETL Pattern: Scheduling Web Tools

BruceHarold, Esri Regular Contributor
11-06-2024

Often, ETL is not one-off; recurrence is needed to incorporate data change over time.  Until the ArcGIS Pro 3.4 and ArcGIS Enterprise 11.4 releases, the supported automation patterns included ArcGIS Data Pipelines and scheduling Notebooks or tools with the REST API task scheduler, but a new no-code option, web tool scheduling, is delivered in ArcGIS Pro 3.4 and ArcGIS Enterprise 11.4!

You can use any type of geoprocessing tool to create your scheduled web tool: a core system tool, a ModelBuilder tool, a Python script tool, or a Spatial ETL tool.  For my blog subject matter I'm using an ArcGIS Data Interoperability Spatial ETL tool, because it can consume my source data: an RSS feed, specifically a Common Alerting Protocol (CAP) feed, as published by many agencies worldwide, including FEMA in the USA.  My CAP data is weather alerts in New Zealand, refreshed twice daily.  The feed will have no entries if the weather is good 😉.  CAP is XML-based and easily handled by ArcGIS Data Interoperability.  I want to mirror the CAP feed status to a hosted feature layer in ArcGIS Online.
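If you're curious what consuming a CAP feed involves, here is a minimal Python sketch; the feed URL is a hypothetical placeholder, and the element names come from the CAP 1.2 schema.  The actual tool does this parsing with ArcGIS Data Interoperability transformers, not code.

```python
# Minimal sketch: read an RSS feed of CAP alerts and pull a few fields
# from each linked CAP XML document. FEED_URL is a hypothetical placeholder.
import requests
import xml.etree.ElementTree as ET

FEED_URL = "https://example.org/cap/alerts/rss.xml"  # placeholder, not the real feed
CAP_NS = {"cap": "urn:oasis:names:tc:emergency:cap:1.2"}

feed = ET.fromstring(requests.get(FEED_URL, timeout=30).content)
for link in feed.iterfind(".//item/link"):  # each RSS item links to a CAP document
    alert = ET.fromstring(requests.get(link.text, timeout=30).content)
    identifier = alert.findtext("cap:identifier", namespaces=CAP_NS)
    for info in alert.iterfind("cap:info", namespaces=CAP_NS):
        print(identifier,
              info.findtext("cap:event", namespaces=CAP_NS),
              info.findtext("cap:severity", namespaces=CAP_NS),
              info.findtext("cap:expires", namespaces=CAP_NS))
```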

Below is a sample alert status map in ArcGIS Pro for October 9th, 2024.  The blog download has a couple of CAP alert XML documents if you want to see some raw data.

CAP Weather Alerts

As the labeling suggests, the yellow feature is a high wind watch, and the blue lines (orange polygons when zoomed in) are snow warnings for roads through some mountain passes.  If we zoom in to the northernmost feature we can inspect it: it is Lewis Pass, which has two geothermal spring resorts along the route, so if you do get delayed by snow you can wait it out in comfort!

Snow alert through Lewis Pass

A few days later, there is a heavy rain watch:

Fiordland rain watch

For the area, rain comes in only one type - heavy - so it's no surprise the predicted upgrade from watch to warning (orange) came true at the next update 12 hours later, plus some new alerts arrived:

West Coast rain

And the next day - yet more weather!

Yet more weather!

Regular updates like this are a classic case for a scheduled web tool; in fact, that's how the data was refreshed for me overnight.  What does that ETL look like?

My data flow maintains a hosted feature layer in ArcGIS Online from the current CAP status.  My ETL tool is quite simple; here it is (also in the blog download; it requires ArcGIS Data Interoperability for ArcGIS Pro 3.4, plus ArcGIS Enterprise 11.4 if shared as a web tool).

CAP alert ETL tool

First a token is generated (using an EsriOnlineTokenGetter; for a local portal you would use an EsriPortalTokenGetter), then the upper stream reads the RSS feed and writes an upsert transaction to the target feature layer: any new alerts become new features and any data changes to existing features are applied.  Upsert support requires that the layer have a uniquely indexed, non-null field, as discussed in an earlier blog.  The lower stream tests for expired alerts and deletes them.  Note that the ETL tool has no parameters: because the input RSS feed and output feature layer do not change, they can be set as not published at authoring time.
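As a point of comparison, not the method used in the tool, the same upsert-then-delete pattern can be sketched with the ArcGIS API for Python; the layer URL, the GeoJSON payload, and the "identifier" matching field below are assumptions for illustration.

```python
# A hedged sketch of the upsert-and-delete pattern using the ArcGIS API
# for Python. The URL, field names, and payload are hypothetical.
from arcgis.gis import GIS
from arcgis.features import FeatureLayer

gis = GIS("https://www.arcgis.com", "username", "password")
layer = FeatureLayer(
    "https://services.arcgis.com/<org>/arcgis/rest/services/CAP_Alerts/FeatureServer/0",
    gis,
)

# GeoJSON built from the parsed CAP alerts (a one-feature stand-in here).
cap_geojson = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [172.4, -42.4]},
        "properties": {"identifier": "NZ-ALERT-001", "event": "Snow",
                       "severity": "Moderate", "expires": "2024-10-10T21:00:00Z"},
    }],
}

# Upper stream equivalent: rows whose "identifier" matches an existing
# feature are updated, the rest are inserted. This is what requires the
# uniquely indexed, non-null field.
layer.append(
    upload_format="geojson",
    edits=cap_geojson,
    upsert=True,
    upsert_matching_field="identifier",
)

# Lower stream equivalent: drop alerts past their expiry time.
layer.delete_features(where="expires < CURRENT_TIMESTAMP")
```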

I'm showing a new, recommended ETL pattern here for maintaining hosted feature layers: generating a portal token within the ETL tool rather than sharing web connections to the hosting server, an awkward step we can avoid.  The target feature service is read and written using the Esri ArcGIS Server Feature Service format and the supplied token, with the option to verify SSL certificates unchecked.  If your security policy requires certificate verification, you'll need to supply a trusted certificate.
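For the curious, the token step amounts to a call to the portal's generateToken endpoint; here is a rough Python equivalent with placeholder credentials (for ArcGIS Enterprise, swap in your portal's sharing/rest/generateToken URL).

```python
# Rough equivalent of the token-getter step: request a short-lived token
# from ArcGIS Online. The credentials below are placeholders.
import requests

resp = requests.post(
    "https://www.arcgis.com/sharing/rest/generateToken",
    data={
        "username": "your_user",
        "password": "your_password",
        "referer": "https://www.arcgis.com",
        "f": "json",
    },
    timeout=30,
)
token = resp.json()["token"]  # supplied with the feature service reads and writes
```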

Having run my ETL tool locally in ArcGIS Pro, I can share the history result as a web tool and schedule it.  There is some mental arithmetic to do when scheduling.  The CAP feed is updated by 9AM and 9PM "local time", which at the time of writing is NZDT, or UTC+13.  My Pro machine is currently on PDT, which is UTC-7, so to start my schedule at the next available 12-hourly slot, 9AM NZDT, I calculate this as 8PM UTC, or 1PM PDT.
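If you'd rather not do the arithmetic in your head, a couple of lines of Python confirm it:

```python
# Sanity check of the schedule arithmetic: 9AM NZDT expressed in UTC and PDT.
from datetime import datetime
from zoneinfo import ZoneInfo

nz_9am = datetime(2024, 10, 10, 9, 0, tzinfo=ZoneInfo("Pacific/Auckland"))
print(nz_9am.astimezone(ZoneInfo("UTC")))                  # 2024-10-09 20:00:00+00:00
print(nz_9am.astimezone(ZoneInfo("America/Los_Angeles")))  # 2024-10-09 13:00:00-07:00
```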

Web tool scheduling

In real life, meteorological offices allow for more frequent bulletins than my schedule above, but you get the idea.  At the link for the scheduled web tool you can pause, delete, or edit the schedule.

The blog download has the tool's source .fmw file, some sample CAP XML files (which I used when authoring the ETL tool to get the XML parsing right), and a .lyrx file that shows how I used data-driven rendering of hex color codes in the data.  Not shown: creating the initial file geodatabase feature class used to instantiate the target feature service, but you can easily do this from the supplied workspace.  Don't forget the alert identifier constraints needed for upsert support!
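If you script that feature class setup, a minimal arcpy sketch of the identifier constraints might look like this (the geodatabase path and field name are hypothetical):

```python
# Minimal sketch: give the alert identifier field the properties upsert
# needs - non-nullable and uniquely indexed. Paths and names are hypothetical.
import arcpy

fc = r"C:\data\alerts.gdb\CAP_Alerts"

arcpy.management.AddField(fc, "identifier", "TEXT", field_length=128,
                          field_is_nullable="NON_NULLABLE")
arcpy.management.AddIndex(fc, "identifier", "identifier_idx", "UNIQUE")
```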

So there you have it, configuring a scheduled web tool that will run indefinitely and maintain a hosted feature layer!

Now there is nothing stopping you from automating your ETL!