At the time of writing it's the week after the 2025 Esri International User Conference, and having represented the ArcGIS Data Interoperability and ETL Patterns topics in the Esri showcase all week, I can report that most interest in inbound data flow was around maintaining hosted feature services from external data. This post shows a technique I have blogged about before, but with an important update: how to do bulk change detection most efficiently. Read on.
Here's my subject matter data: street addresses for the City of Los Angeles, which kindly makes the data available on its open data site. The address data is maintained daily.
Beverly Hills isn't in Los Angeles
Data integration strategies include a variety of approaches, and there are generally multiple "right answers"; a good overview can be had by watching a presentation given by my Esri colleagues earlier this year. For my subject matter data, all the options discussed in the presentation (ArcGIS ModelBuilder, Python scripts, Python notebooks, ArcGIS Data Pipelines and ArcGIS Data Interoperability) are valid approaches.
But this is the Data Interoperability community space, so with the focus on that option I have a processing tip for you. You'll hear in the presentation linked above that maintaining a hosted feature service is very common, and minimizing downtime while doing so is important. One way is to maintain two feature services, edit one at a time, and do a source swap between them. If you use ArcGIS Data Interoperability, however, you have another option: the ChangeDetector transformer, which can very quickly derive a transaction containing only the added, updated or deleted records; this is almost always a very efficient write.
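To make the idea concrete, here is a minimal sketch in plain Python (not the transformer itself) of the kind of delta a ChangeDetector-style comparison produces. The key field name and record shapes are assumptions for illustration, not the City's actual schema.

```python
def detect_changes(existing, incoming, key="address_id"):
    """Return (adds, updates, deletes) by comparing two lists of dicts on a key field."""
    old = {rec[key]: rec for rec in existing}
    new = {rec[key]: rec for rec in incoming}

    adds = [new[k] for k in new.keys() - old.keys()]          # only in incoming
    deletes = [old[k] for k in old.keys() - new.keys()]       # only in existing
    updates = [new[k] for k in new.keys() & old.keys()        # in both, but changed
               if new[k] != old[k]]
    return adds, updates, deletes


# Hypothetical usage with two tiny record sets
existing = [{"address_id": 1, "street": "Main St"}, {"address_id": 2, "street": "Oak Ave"}]
incoming = [{"address_id": 2, "street": "Oak Avenue"}, {"address_id": 3, "street": "Pine Rd"}]
adds, updates, deletes = detect_changes(existing, incoming)
# adds -> id 3, updates -> id 2, deletes -> id 1
```

The point is that only these three small sets need to be written to the target service, not the full dataset.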
The trick, though, is getting the existing and new bulk data to the transformer so it can calculate the delta transaction! This is how I recommend doing that:
Using a web connection to drive change
The ETL workspace FMW source is in the post download.
There is no escaping downloading the new incoming data, but you can avoid streaming in the current state of the target feature service (which is comparatively slow) by asking the portal to generate a file geodatabase export and downloading that instead. The export job call requires a token, which is difficult to generate if your organization is like mine and enforces multi-factor authentication (MFA). The trick is to use web service authentication in the HTTPCaller transformers that initiate the export job and check job status in the custom looping transformer that waits for export job completion. Web service authentication supplies a token behind the scenes, so you don't have to pass one as an HTTP URL parameter. That is the central trick behind this approach.
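For readers who want to see the equivalent flow outside the workspace, here is a rough sketch using the ArcGIS API for Python; the item ID, profile name and output path are placeholders. Like web service authentication in HTTPCaller, the GIS object manages the token for you rather than requiring it as a URL parameter. This is only an analogue of what the HTTPCaller transformers do, not the workspace's implementation.

```python
from arcgis.gis import GIS

# Sign in with stored credentials (a saved profile is one MFA-friendly option)
gis = GIS("https://www.arcgis.com", profile="my_org_profile")

# Ask the portal to export the hosted feature layer item to a file geodatabase,
# wait for the export job to finish, then download and clean up
source_item = gis.content.get("<hosted-feature-layer-item-id>")
export_item = source_item.export("addresses_snapshot", "File Geodatabase", wait=True)
export_item.download(save_path=r"C:\temp")  # zipped file geodatabase of the current service state
export_item.delete()                        # remove the temporary export item from the portal
```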
You'll see from the workspace image above that the daily change transaction for Los Angeles' addresses is tiny compared with the few minutes spent reading the data. The log file (not shown) tells me 67 features took part in the transaction, which took 0.5 seconds to write.
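If you were applying a delta of that size outside the workspace, the same add/update/delete pattern could be written with the ArcGIS API for Python, as in the hypothetical sketch below; the layer URL, profile and field values are placeholders, and the actual workspace writes the transaction with Data Interoperability rather than this call.

```python
from arcgis.gis import GIS
from arcgis.features import FeatureLayer

gis = GIS("https://www.arcgis.com", profile="my_org_profile")   # placeholder profile
layer = FeatureLayer("<target-feature-layer-url>", gis=gis)     # placeholder URL

# adds/updates are lists of feature dicts; deletes is a comma-separated string
# of object IDs. In practice these would come from the change detection step.
result = layer.edit_features(
    adds=[{"attributes": {"address_id": 101, "full_address": "1 New St"},
           "geometry": {"x": -118.25, "y": 34.05}}],
    updates=[{"attributes": {"OBJECTID": 42, "full_address": "2 Updated Ave"}}],
    deletes="77,78",
)
print(result)  # per-feature success/failure results returned by the service
```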
BulkUpsert2WebConnection.zip