Today's featured transaction is upsert, ably assisted by delete. If you are maintaining a geodatabase or hosted feature layer (ArcGIS Online or ArcGIS Enterprise), unless you're doing something drastic like a truncate, you can package up all your ETL edits into these two methods using ArcGIS Data Interoperability (but also see core options below). Inserts and updates travel together as upserts, while deletes speak for themselves.
For example use case data, I'm using the City of Los Angeles Open Data Address Points. If you're keen to take a look at the raw material, this link will trigger a CSV file download (161MB+, 1M+ rows).
The map shows the result of ETL into a hosted feature layer, which is my target information product. It could also be any type of geodatabase. The raw data is typical of many ETL sources, with these properties:
#1 & #2 above are basic ETL challenges which are easily solved. #3 suggests that upserts and deletes are candidate methods, while #4 is what makes upserts possible. Using a match key is the secret sauce for this post, but like any good cooking show the recipe comes after a look at the result!
This single ETL tool supports two stages in the lifecycle of my target information product:
To create the feature service, the greyed-out writer labeled Creation would be enabled and the rest of the workspace would not exist. After service creation I disabled this writer and added the rest of the workspace. The workspace still reads the source CSV file at a public URL, but also reads the target feature service (which of course must already have been created), calculates the upsert and delete transactions, and applies them. It's a simple and powerful ETL tool, but critical enabling work is needed outside it.
The REST API Append method that supports upsert operations requires the match key field (House_Number_ID in my case):
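To make the Append call concrete, here is a minimal sketch of assembling the form parameters for a hosted feature layer's REST append operation with upserts enabled. The layer URL, token, and upload id are placeholders, and the helper function name is my own; check the Append operation in the ArcGIS REST API documentation for the full parameter list.

```python
# Sketch: form parameters for POST <layer-url>/append with upserts enabled.
# Parameter names follow the hosted feature service Append REST operation;
# build_append_params is a hypothetical helper for illustration.

def build_append_params(match_field: str, upload_id: str) -> dict:
    """Assemble form parameters for the layer's append operation."""
    return {
        "f": "json",
        "appendUploadId": upload_id,        # id of a previously uploaded file
        "appendUploadFormat": "csv",
        "upserts": "true",                  # insert new rows, update matches
        "upsertMatchingField": match_field, # the match key field
        "rollbackOnFailure": "true",
    }

params = build_append_params("House_Number_ID", "<upload-id>")
# The request itself would be something like:
# requests.post(f"{layer_url}/append", data={**params, "token": token})
```

The same capability is surfaced in the ArcGIS API for Python via FeatureLayer.append, which is another option if you prefer scripting over a raw REST call.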
Meeting these conditions requires two simple processing steps outside the ETL tool before upsert processing will work; there is no way to set these properties from within the ETL tool itself. Two factors are relevant: the Alter Field geoprocessing tool does not support changing the allow-nulls property of a feature layer field, but the Add Attribute Index geoprocessing tool does support creating a unique index on one.
To work around the allow-nulls problem, I exported the initial feature layer to my project's default geodatabase using the Export Features tool, unsetting the allow-nulls property for the House_Number_ID field in the field map control.
With the output feature class in Pro, I created a map layer, taking the opportunity to apply symbology other than the pink lemonade default, and then overwrote my target feature layer.
With the overwritten target layer, I then added a unique index for my match key field:
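Under the hood, adding a unique index to a hosted feature layer amounts to posting an index definition to the layer's admin addToDefinition endpoint. The sketch below builds that JSON body; the index name is my own invention, and the admin URL and token in the comment are placeholders.

```python
# Sketch: the JSON body for adding a unique index on the match key field
# via the feature layer's admin addToDefinition endpoint. The index name
# ("House_Number_ID_unique_idx") is a hypothetical choice.

def unique_index_definition(field: str) -> dict:
    """Build an addToDefinition body declaring a unique index on `field`."""
    return {
        "indexes": [
            {
                "name": f"{field}_unique_idx",
                "fields": field,
                "isUnique": True,      # required for an upsert match key
                "isAscending": True,
                "description": "Upsert match key",
            }
        ]
    }

body = unique_index_definition("House_Number_ID")
# Would be sent roughly as:
# requests.post(f"{admin_layer_url}/addToDefinition",
#               data={"f": "json", "token": token,
#                     "addToDefinition": json.dumps(body)})
```

The Add Attribute Index geoprocessing tool does this for you, so the sketch is only to show what the tool is asking the service to do.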
Now the target layer is in shape for configuring upsert (and delete) processing!
The ChangeDetector transformer is what generates the upsert and delete change sets. Here are the settings:
Using match attributes:
Change detection comes at the expense of reading the target features in addition to the source features but does deliver an optimally small edit payload. For my subject matter data the whole job takes 5 minutes.
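The logic the ChangeDetector applies can be sketched in a few lines of plain Python: compare source (revised) and target (original) records on the match key and keep only the differences. Records are plain dicts here purely for illustration; the transformer works on features.

```python
# Sketch of match-key change detection: rows only in the source become
# inserts, rows in both with differing attributes become updates, and
# rows only in the target become deletes. Unchanged rows are dropped,
# which is what keeps the edit payload small.

def detect_changes(source: list, target: list, key: str):
    src = {r[key]: r for r in source}
    tgt = {r[key]: r for r in target}
    inserts = [r for k, r in src.items() if k not in tgt]
    updates = [r for k, r in src.items() if k in tgt and r != tgt[k]]
    deletes = [r for k, r in tgt.items() if k not in src]
    return inserts + updates, deletes   # inserts and updates travel as upserts

upserts, deletes = detect_changes(
    source=[{"House_Number_ID": 1, "Street": "Main St"},
            {"House_Number_ID": 3, "Street": "Oak Ave"}],
    target=[{"House_Number_ID": 1, "Street": "Main Street"},
            {"House_Number_ID": 2, "Street": "Elm St"}],
    key="House_Number_ID",
)
# upserts contains id 1 (changed) and id 3 (new); deletes contains id 2
```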
To reiterate, while I'm showing a hosted feature layer as my target information product, upsert write capability is also available for geodatabase features via ArcGIS Data Interoperability.
For completeness, I'll give a shout-out to two other places upsert capability is offered in ArcGIS: the Append geoprocessing tool and the Data Pipelines Add and Update output option. However, if you need to delete features as well as upsert them, then in core geoprocessing you'll need the Delete Features or Delete Rows tools, and in Data Pipelines you'll need to replace the output feature layer. This pipeline does that job for the feature layer my ETL tool operates on.
I'll take the opportunity here to call out that using Data Pipelines to maintain hosted feature layers in ArcGIS Online that were created by separate ETL processes is perfectly valid. In my pipeline you'll see the Map Fields tool that greatly assists connecting the schema coming from a CSV file with how I defined the schema using Data Interoperability.
So there it is, upsert is there for you, ready to go!
The blog download has my Spatial ETL tool plus associated toolbox with the models.