
Building a Data Driven Organization, Part #2: Go Distributed, Unlimited & Automated

07-21-2021 12:40 PM | BruceHarold, Esri Regular Contributor

Distributed collaboration is a feature of ArcGIS Enterprise and ArcGIS Online that enables the sharing of edits to feature services and file items.  You may, like me, have taken in a recent session at Esri's User Conference, which got me thinking about widening the net of environments that can share data updates the same way - automatically, on a schedule, between Portals and Online plus many more environments.  Data like these 311 service requests in Dallas, Texas, which update hourly:

Dallas 311.png

The business drivers discussed in the UC session that point to ArcGIS distributed collaboration can also apply to data that doesn't live in a Portal or Online, particularly data that isn't shared to or from the ArcGIS system at all.  You can still implement automated collaboration using ArcGIS Data Interoperability, and you get a no-code experience just like distributed collaboration.

I'm going to show a pattern that uses the Data Interoperability extension, with processes authored in ArcGIS Pro and Enterprise acting as the compute and scheduling resource.  If you have a Pro machine with high uptime, you could use just the Pro machine.

What is in scope for automated collaboration?  Hundreds of formats and systems.  What data velocity is reasonable to automate?  Like core distributed collaboration, this approach isn't suited to strictly real-time feeds or very large edit-feature counts.  Event-driven real-time integrations are better handled with webhooks (see my earlier post), and big-data editing should be centralized outright in one organizational group.  My example reaches into a Socrata instance once an hour and synchronizes a few thousand features; I would be comfortable synchronizing a million features once a day.
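For a sense of what the source side involves, Socrata exposes each dataset as a SODA JSON endpoint, so the hourly read boils down to a paged HTTP GET.  Here is a minimal Python sketch of the idea; the domain and dataset ID are placeholders, not the real Dallas 311 resource:

# Sketch only: fetch a page of records from a Socrata SODA endpoint.
# The URL below is a hypothetical placeholder; substitute the real resource.
import requests

url = "https://www.dallasopendata.com/resource/<dataset-id>.json"
records = requests.get(url, params={"$limit": 5000}).json()
print(f"Fetched {len(records)} service requests")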

The goal here is sharing data without downtime and giving your audience a persistent connection to use in maps and apps.  This translates into Portal or Online items that retain their item identifier while the underlying data is efficiently replaced.  Here is a file geodatabase item plus a hosted feature service item that are both refreshed simultaneously each hour.  How?  Here is the processing workspace:

2021-07-21_8-00-31.jpg

The blog download has this workspace and another that I used to create data for the target items.  I'll walk you through this one (RefreshDallas311).  The part on the left, culminating in the green bookmark, reads the 311 feed from the web and writes a zipped file geodatabase, which is then used to overwrite the file geodatabase item.  One star of the show here is the ArcGISOnlineConnector (which also works with Portals).

BruceHarold_0-1626879996315.png
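If you're curious what the file geodatabase overwrite looks like outside FME, here is a minimal sketch using the ArcGIS API for Python; the item ID and zip path are hypothetical placeholders, and this illustrates the idea rather than what the connector literally runs:

# Sketch only: replace a file geodatabase item's data in place.
# The item ID and zip filename are hypothetical placeholders.
from arcgis.gis import GIS

gis = GIS("https://www.arcgis.com", "my_username")  # prompts for a password
fgdb_item = gis.content.get("<file-geodatabase-item-id>")

# Updating the item's data keeps its item ID, so every map and app
# pointing at the item keeps working; only the payload changes.
fgdb_item.update(data=r"C:\Users\arcgis\Desktop\DistributedUnlimited\Dallas311.gdb.zip")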

The other star of the show, on the right, is the ChangeDetector, which calculates which features are new, changed, unchanged, or deleted and sends them to the feature service writer with an appropriate fme_db_operation format attribute that sets the transaction type per record.  (ObjectIDs are joined for updates and deletes; if the feature service is edit tracked or already in a distributed collaboration, you would join on GlobalID.)

BruceHarold_1-1626880065876.png
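To make the ChangeDetector's job concrete, here is a simplified pure-Python analogue of the classification it performs; the key handling is illustrative, not the transformer's actual internals:

# Sketch only: classify incoming records against existing ones.
def classify_changes(existing, incoming):
    # existing/incoming map a stable key (e.g. the 311 request number)
    # to a tuple of attribute values
    inserts = [k for k in incoming if k not in existing]
    deletes = [k for k in existing if k not in incoming]
    updates = [k for k in incoming
               if k in existing and incoming[k] != existing[k]]
    return inserts, updates, deletes

# Each class maps to an fme_db_operation value (INSERT, UPDATE, DELETE)
# on the features sent to the feature service writer.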

Processing is simple!  Now let's cover how I got the process onto my server and scheduled.

I'm not publishing a web tool; instead, I scheduled an executable on the server, namely C:\Program Files\ESRI\Data Interoperability\Data Interoperability AO11\fme.exe, with RefreshDallas311.fmw as the argument.  The process requires any credentials used to be copied to the server and any local paths to be valid.  In this case the workspace also requires a reader (Socrata) and a transformer (ArcGISOnlineConnector) to be installed from FME Hub.

First, logged in to the server as the arcgis service account owner (i.e., a named user that actually exists), I created a folder for everything to live in: C:\Users\arcgis\Desktop\DistributedUnlimited.  I also created a desktop shortcut to "C:\Program Files\ESRI\Data Interoperability\Data Interoperability AO11\fmeworkbench.exe" so I could start Workbench conveniently.

The credentials I needed to copy are web connections for my accounts at ArcGIS Online, Socrata, and Gmail.  To copy credentials, open the Workbench app from the Analysis ribbon in Pro, choose Tools > FME Options, go to the Web Connections tab, and you'll see the grid of available connections.  Mine are:

BruceHarold_0-1626892519427.png

Right-click each connection that needs to be migrated and export it to an XML file.  I copied the exported files to the folder created on the server, then copied RefreshDallas311.fmw into the same folder.  On the server I opened Workbench (with no workspace), imported the web connections from the XML files, and manually checked that they work.  I then opened RefreshDallas311.fmw; the Socrata and ArcGISOnlineConnector packages auto-installed.  I checked that the FeatureWriter's local path to the zipped file geodatabase is valid, namely in the server folder.  I ran the workspace (I had to change my Python compatibility to Python 3.7+).  At the top of the log file was:

Command-line to run this workspace:

"C:\Program Files\ESRI\Data Interoperability\Data Interoperability AO11\fme.exe" C:\Users\arcgis\Desktop\DistributedUnlimited\RefreshDallas311.fmw

Now, for the arcgis user, I created a scheduled task using that command and argument, running hourly.

BruceHarold_0-1626895957319.png

 

I'm done - I have automated a distributed, unlimited collaboration!

The blog download has my ETL tool (Pro 2.8).  My server is Enterprise 10.9.  Don't forget you need ArcGIS Data Interoperability installed and licensed at both ends!

Thanks to Dallas, TX.