
Automate "Update Data" -> "Overwrite entire feature layer" via notebooks or arcpy?

09-25-2023 01:41 PM
by jzcgis
New Contributor III

Is it possible to schedule a data update for a feature layer that is sourced from a Dropbox location? This file gets updated daily (same file; only the number of rows may change).

3 Replies
DornMooreICF
New Contributor II

I've just set one up that reads a GeoJSON file from a URL, but you could do the same thing with a CSV or Excel file if you needed to. It was pretty easy for me to do, though I'm not sure how easy it would be via Dropbox, especially if it has to authenticate before it gets to the file.

I'll assume the file you are sharing is in a public folder and is something Esri knows how to read, say a CSV. 

If that's the case, and opening the link directly opens the CSV, you should be able to use it.

I'll attach a sample code below, which assumes you've already created a feature layer from a previous version of the file and you know the service ID. It's pretty simple. Once you've got it running, you can set it as a task to run on your schedule.

[Attached screenshot of sample code: 2023-09-25_16-36-33.png]
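
Since the attachment is a screenshot, here's a minimal sketch of what such a script might look like, assuming the ArcGIS API for Python. The Dropbox link and item ID are hypothetical placeholders, and overwrite() requires the new file to match the name and schema of the file the layer was originally published from.

```python
# A minimal sketch, not the code from the screenshot above.
# The Dropbox link and item ID are hypothetical placeholders.
import requests
from arcgis.gis import GIS
from arcgis.features import FeatureLayerCollection

gis = GIS("https://www.arcgis.com", "username", "password")

# Download the current copy of the shared file. Appending ?dl=1 to a
# Dropbox share link forces a direct download instead of the preview page.
resp = requests.get("https://www.dropbox.com/s/<file-id>/data.csv?dl=1")
with open("data.csv", "wb") as f:
    f.write(resp.content)

# Overwrite the existing hosted feature layer with the fresh file.
# The file name and schema must match the originally published file.
item = gis.content.get("<item-id>")
flc = FeatureLayerCollection.fromitem(item)
flc.manager.overwrite("data.csv")
```

Once it runs cleanly, Windows Task Scheduler, cron, or a scheduled ArcGIS Online notebook can kick it off on whatever schedule the file updates.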


 

jcarlson
MVP Esteemed Contributor

I'll chime in to say that you should consider developing a different method for updating your layer. Unless the schema is changing, an overwrite is overkill, and increases the chances that a layer becomes broken in some way. It's like bulldozing and rebuilding a house just to repaint the walls.

The simplest thing is to truncate and append. Just use the FeatureLayerManager to truncate (https://developers.arcgis.com/python/api-reference/arcgis.features.managers.html#featurelayermanager), like FeatureLayer.manager.truncate(), and then load the data back in however you prefer. There is the Append method, which works well with files added to your Portal, but you can also just use the Edit Features method to add all your features.
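
For illustration, a minimal sketch of that approach, assuming the ArcGIS API for Python; the item ID, CSV URL, and lon/lat column names are hypothetical placeholders:

```python
# A minimal sketch of truncate-and-append; item ID, URL, and column
# names are hypothetical placeholders.
import pandas as pd
from arcgis.gis import GIS
from arcgis.features import FeatureLayer, GeoAccessor  # registers df.spatial

gis = GIS("https://www.arcgis.com", "username", "password")
layer = FeatureLayer.fromitem(gis.content.get("<item-id>"))

# Empty the layer without touching its schema, symbology, or item ID
layer.manager.truncate()

# Reload the daily file and push the features back in
df = pd.read_csv("https://example.com/daily.csv")
sdf = pd.DataFrame.spatial.from_xy(df, x_column="lon", y_column="lat")
layer.edit_features(adds=sdf.spatial.to_featureset())  # chunk this for large files
```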

A bit more advanced, but here is the method we use (a rough sketch follows the list):

  1. Query both the source and destination into separate dataframes.
  2. Identify and apply adds: features in the source but not the destination.
  3. Identify and apply deletes: features in the destination but not the source.
  4. Use pandas.DataFrame.compare on the remaining features present in both source and destination to identify rows/attributes that have changed.
  5. Use the resulting comparison to build a JSON object and submit it to the feature service endpoint.
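
A rough sketch of those steps, assuming the ArcGIS API for Python and pandas; the key column (parcel_id), item ID, and URL are hypothetical, and geometry handling is omitted for brevity:

```python
# A rough sketch of the diff-based sync; key/column names are hypothetical,
# dtypes are assumed to match, and geometry comparison is omitted.
import pandas as pd
from arcgis.gis import GIS
from arcgis.features import FeatureLayer

gis = GIS("https://www.arcgis.com", "username", "password")
dest = FeatureLayer.fromitem(gis.content.get("<item-id>"))

# 1. Query source and destination into separate dataframes
src_df = pd.read_csv("https://example.com/parcels.csv").set_index("parcel_id")
dest_df = dest.query(out_fields="*", return_geometry=False).sdf.set_index("parcel_id")

# 2. Adds: keys in the source but not the destination
add_keys = src_df.index.difference(dest_df.index)

# 3. Deletes: keys in the destination but not the source
del_keys = dest_df.index.difference(src_df.index)

# 4. Compare rows present in both to find changed attributes
common = src_df.index.intersection(dest_df.index)
cols = [c for c in src_df.columns if c in dest_df.columns]
changed = src_df.loc[common, cols].compare(dest_df.loc[common, cols]).index.unique()

# 5. Build the edit payloads and submit them in one call
adds = [{"attributes": {"parcel_id": k, **src_df.loc[k, cols].to_dict()}}
        for k in add_keys]  # real adds would also need a "geometry" key
updates = [{"attributes": {"OBJECTID": int(dest_df.at[k, "OBJECTID"]),
                           **src_df.loc[k, cols].to_dict()}}
           for k in changed]
deletes = ",".join(str(int(o)) for o in dest_df.loc[del_keys, "OBJECTID"])

dest.edit_features(adds=adds, updates=updates, deletes=deletes)
```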

It may not be worth the trouble for your particular case, but consider: we have a layer of about 60,000 parcels that gets updated every night. On any given day, though, only about 100 parcels are actually edited, maybe a handful added or deleted. And within the updates, only one or two of the fields themselves are actually edited.

By identifying edits and selectively applying them to the layer in place, the process cuts the amount of data being transferred by several orders of magnitude. We're talking single kilobytes of JSON compared to dozens of megabytes of a shapefile or what have you. Additionally, there is never a period in which the layer is unavailable or empty, meaning edits can be applied at any time of day or night without interrupting your users' access to the data.

- Josh Carlson
Kendall County GIS
DornMooreICF
New Contributor II

Thanks for this. I agree. Although I'm dealing with a small number of records, this feels like a better solution.
