Here is where it rained today in Australia: a bit around Perth, a bunch in Queensland, and, for the sharp-eyed, a little at Cape York to keep the prawns and crocs happy. The legend shows precipitation in mm.
The Australian Bureau of Meteorology publishes downloadable weather data; my scenario is republishing rain gauge observations to a hosted feature service, which is what the map above shows. After a little digging around I found the data is available via FTP, with a schema described in this user guide. The data is refreshed daily. While I'm not redistributing the data in this blog, I'll mention it is licensed under Creative Commons terms, so you can implement my sample if you wish.
While the refresh rate is daily, each file can contain observations spanning more than 24 hours and from multiple sensors at a site. Anyway, what I wanted was the daily observations pushed into a feature service in my portal; I could just as easily send the data to ArcGIS Online.
These periodic synchronizations from the web are everywhere in GIS. Data custodians make it easy to manually obtain data; I'm going to show you how simple it is to automate synchronizations with Data Interoperability. I usually describe Data Interoperability as Esri's 'no-code' app integration technology. Full disclosure: in this sample I did resort to some Python in the FME Workbenches I created, so I have to back off the no-code claim, but I can say it's low-code. You can see for yourself in the Workbenches.
I started out thinking I would make the scheduled process a web tool and schedule it with Notebook Server. That might be the most fun to build, but I realized it just isn't called for in my use case. I fell back to a pattern I previously blogged about, namely using Windows Task Scheduler, but this time on a server. Why not the desktop software approach? Because a server is a machine with likely very high uptime that I know can be scheduled outside my normal working hours.
Here is the Workbench that does the job of downloading the BoM product files and sending features to my portal's hosted feature service:
And I can't resist it, here is the Python, not too scary. It would be unnecessary if the filenames at the FTP site were stable (I could have used an FTPCaller transformer), but they have datestamps as part of their names, so it was easier to handle the download with some Python. While downloading the data I also cleaned it up a little (removing enclosing double quotes and newline characters), then sent all observations out into the stream.
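The workbench's own script isn't reproduced here, but a minimal stand-alone sketch of the same idea (pick the latest datestamped file from an FTP listing, then strip enclosing quotes and newlines) might look like the following. The host, directory, and filename pattern are assumptions, not the real BoM values:

```python
# Hypothetical sketch: download a rain-gauge file whose name carries a
# datestamp, then clean each line. Pattern, host and path are placeholders.
import re
from ftplib import FTP

# Assumed pattern: a 12-digit datestamp before an .axf extension.
DATESTAMP_PATTERN = re.compile(r"\.(\d{12})\.axf$")

def pick_latest(filenames):
    """Return the filename with the most recent embedded datestamp, or None."""
    stamped = [(m.group(1), f) for f in filenames
               if (m := DATESTAMP_PATTERN.search(f))]
    return max(stamped)[1] if stamped else None

def clean_line(line):
    """Strip enclosing double quotes from fields and trailing newlines."""
    return ",".join(field.strip('"') for field in line.rstrip("\r\n").split(","))

def fetch(host, path):
    """Download and clean the latest product file (network step)."""
    lines = []
    with FTP(host) as ftp:
        ftp.login()
        ftp.cwd(path)
        name = pick_latest(ftp.nlst())
        ftp.retrlines(f"RETR {name}", lines.append)
    return [clean_line(l) for l in lines]
```

In the workbench this logic sits in a scripted step; the cleaned lines are then emitted as features into the stream.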
As the data sources don't change, I made all parameters private, which simplifies the command to schedule. All I need to do is get the Workbench onto the server and make sure it runs. In the post download you'll find three FMW files:
MakeRainGauges.fmw
RefreshRainGauges.fmw
RefreshRainGauges - Server Copy.fmw
MakeRainGauges creates a geodatabase feature class I used in Pro to instantiate my hosted feature service. RefreshRainGauges is built from MakeRainGauges and differs only in that it writes to the feature service, with initial truncation; that's the Workbench I want to schedule. RefreshRainGauges - Server Copy differs from RefreshRainGauges only in its Python settings, which specify Python 3.6+. I didn't use that name on the server; it exists only to distinguish the file in the post download.
On my portal server there was a little setup (I have a single machine with everything on it; don't forget to install and license Data Interoperability!). RefreshRainGauges uses a web connection to my portal. In this blog I describe how to create a portal web connection. This has to be copied to the server for the arcgis user, which will run the scheduled process. The simplest way is method #2 in this article. Logged onto the server as arcgis, I first created a desktop shortcut to "C:\Program Files\ESRI\Data Interoperability\Data Interoperability AO11\fmeworkbench.exe", started Workbench, then imported the web connection XML file and tested reading the feature service. I also copied RefreshRainGauges to a folder and edited it to adjust the Python environment to suit the server (the sample was built with Pro 2.7 Beta but the server is running Enterprise 10.8.1). Running the workspace interactively, the top of the log reveals the command to be scheduled:
The rest is simple: just create a basic task. The hardest part was figuring out what time to execute the command; I see data changing as late as 1 AM UTC, so I went with 6 PM local time on my server, which is in US West. Make sure the task will run if arcgis is not logged on, and note that the arcgis user will need batch job rights (or be an administrator, which I can do on my VM but you likely will not be allowed to).
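If you prefer the command line to the Task Scheduler UI, the basic task can be sketched with schtasks. The task name, executable path, workspace path and account here are all assumptions; the /TR value should be the exact command echoed at the top of an interactive run of the workspace:

```bat
REM Hypothetical scheduling command; substitute the command from your
REM workspace log as the /TR value. /ST is the server's local time.
schtasks /Create /TN "RefreshRainGauges" ^
  /TR "\"C:\Program Files\ESRI\Data Interoperability\Data Interoperability AO11\fme.exe\" C:\Tasks\RefreshRainGauges.fmw" ^
  /SC DAILY /ST 18:00 /RU arcgis /RP *
```

Running as /RU arcgis with a stored password is what lets the task run while arcgis is not logged on.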
That's it: automated maintenance of data I can share with anyone! I'll finish with a screenshot of the processing log from last night's synchronization:
Automating integrations is a powerful force multiplier and we no-code 'citizen integrators' are not to be left out! If you have read Esri's recent blog about hosted feature service webhooks and are wondering how to get started, read on. My goal is to show you an implementation pattern that avoids the complexities of handling webhook JSON payloads and lets you concentrate on your business processes without stepping outside of the ArcGIS platform. Your webhook will trigger an ArcGIS Data Interoperability web tool (i.e. a geoprocessing service) on your Enterprise portal or standalone server which executes your integration on any schedule.
In an earlier post I covered setting up ArcGIS Enterprise 10.8.1 or later (required for this workflow) for publishing ETL web tools (embed the FMW source!), and in this post I covered how to share web connections, which you'll need for this pattern. For web connection sharing, the export/import option is simpler.
Webhooks are incoming HTTP POST traffic so your server/portal must allow incoming traffic. For my purposes I used an Amazon EC2 instance with standalone server installed. If your server is federated to a portal you'll need to set up cross origin resource sharing (CORS), specifying the storage hive used by your hosted feature service. That is one of the https://services(n).arcgis.com entries you see here.
If CORS isn't already set up on your server, download and install the CORS module for IIS from here, then open IIS Manager and, with the server Configuration Editor, navigate to system.webServer/cors and set enabled to True.
If you want to get fancy with host origin permissions then how to edit web.config is described here.
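If you do edit web.config directly, a minimal sketch of the cors section might look like the following. The origin shown is a placeholder for your own services(n).arcgis.com hive, and the exact elements are worth checking against the IIS CORS module documentation:

```xml
<!-- Hypothetical web.config fragment; replace the origin with the
     services(n).arcgis.com hive hosting your feature service -->
<configuration>
  <system.webServer>
    <cors enabled="true">
      <add origin="https://services1.arcgis.com" allowCredentials="true">
        <allowHeaders allowAllRequestedHeaders="true" />
        <allowMethods>
          <add method="POST" />
        </allowMethods>
      </add>
    </cors>
  </system.webServer>
</configuration>
```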
The web tool processing approach requires edit tracking timestamp data, so in addition to the webhook creation requirements (editing and create/update tracking enabled), editor tracking is also required; your feature service properties must have these three settings enabled. Enabling sync is theoretically OK, but I haven't tested webhooks triggered by syncs.
Now you are set up.
My scenario is a hot feature service with a large number of edits per minute (I tested 2000). My 'integration' is a simple write to a separate hosted feature service table and completes in a few seconds.
My 'integration' workspace is pictured below; it's also in the blog download as the ETL tool WebhookProcessing. This pattern does not derive an actual change set by parsing the webhook payload JSON (which a geoprocessing service can't see anyway); instead, changed features are found using a WHERE clause on the feature service reader. Given my webhook frequency, and being generous about process latency by going back further in time than I expect the webhook fired, I used:
EditDate >= CURRENT_TIMESTAMP - INTERVAL '60' SECOND
I read my integration target with a similar WHERE clause; why I'm doing this is explained next.
EditedDatetime >= CURRENT_TIMESTAMP - INTERVAL '60' SECOND
I don't want to double-handle features. The FeatureMerger transformer's UnmergedRequestor output connection will only send on features that are recent but not previously integrated. You'll have your own business logic to avoid double-handling by consecutive webhooks in a hot system, say a flag field that indicates whether a feature has been integrated. In most cases the webhook period (minutes? hours?) will be much longer than the processing time, so you should be able to avoid any overlaps.
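The reader-side selection above can also be sketched outside FME. A minimal Python helper that builds a similar recency WHERE clause (computing the cutoff client-side, rather than with the server-side CURRENT_TIMESTAMP form used above) and queries a layer's REST endpoint might look like this; the layer URL and field names are assumptions:

```python
# Hypothetical sketch: build a "recently edited" WHERE clause and query a
# feature layer's REST endpoint with it.
import datetime
import json
import urllib.parse
import urllib.request

def recent_edits_where(window_seconds=60, edit_field="EditDate"):
    """WHERE clause selecting features edited in the last window_seconds.

    The cutoff is computed client-side in UTC, an alternative to the
    server-evaluated CURRENT_TIMESTAMP - INTERVAL form.
    """
    cutoff = datetime.datetime.utcnow() - datetime.timedelta(seconds=window_seconds)
    return f"{edit_field} >= TIMESTAMP '{cutoff:%Y-%m-%d %H:%M:%S}'"

def query_recent(layer_url, token, window_seconds=60):
    """Fetch recently edited features from <layer_url>/query (network step)."""
    params = urllib.parse.urlencode({
        "where": recent_edits_where(window_seconds),
        "outFields": "*",
        "f": "json",
        "token": token,
    })
    with urllib.request.urlopen(f"{layer_url}/query?{params}") as resp:
        return json.load(resp).get("features", [])
```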
The 'integration' workspace:
In the blog download you'll see the ETL tool TriggerWebhook, which does what it says and causes trigger events, in my case edits and updates.
I manually ran TriggerWebhook and WebhookProcessing consecutively to simulate live action. Note that an integration workspace must have no parameters, because the webhook cannot deliver any. I published the WebhookProcessing history item to my server; the resulting synchronous service has an execute endpoint that is my webhook URL:
The geoprocessing service is visible from my server connection:
Don't forget your server connection must be Administrator level:
If your server is federated to a portal the web tool will be visible in the Portal contents. In either case the webhook URL will be the execute endpoint, for example (this example link will be broken):
After creating the webhook using the admin API URL I have my webhook active:
Now when I run TriggerWebhook the geoprocessing service fires and my 'integration' happens. I took a peek at the server jobs directory to watch the action, jobs arrive when the webhook fires:
In my case the 'integration' is just calculating the time between the trigger event for each feature and when it completes processing. Successive runs for my 1000 edits produced varying LatencySeconds values depending on where my edits fell with respect to the webhook clock.
So there you have it, use a webhook event to fetch features that triggered the webhook and do something with them! Have fun with your integrations.
Webhooks. If you don't code you might think webhook integrations are scary, or at least need extra middleware apart from ArcGIS. Wrong on both counts. All citizen integrators who can put together an FME workbench with the Data Interoperability extension, and who have the extension for both Pro 2.6 and Enterprise 10.8.1, can automate webhook-driven integrations between Survey123 surveys and anything else Data Interoperability can reach.
I'll outline the scenario in case you haven't looked into this automation pattern before. Survey123 is a form-driven data capture app that manages a feature service where each survey is a point, line or area feature with attribution you design. The forms have rich behavior and support web and mobile devices. There are unlimited possibilities for downstream processing of survey features. What this blog will cover is this data flow:
A Survey123 survey submission embodies a transaction for its feature service (add, update)
A webhook defined for the transaction type is triggered at each submission
The webhook URL is an execute call for a Spatial ETL web tool (geoprocessing service)
The web tool performs an integration using the survey submission feature(s)
Putting this together is all about configuration, so let's get started. I'm going to describe the most common platform combination (Survey123 hosted in ArcGIS Online, your integration server an ArcGIS Enterprise portal), but if you have Survey123 on your enterprise infrastructure that will work too.
Because you will be creating a web tool that reads and writes a feature service, you will need to share a web connection to your server as described in this blog. Your first step is to make sure the service owner account (by default named 'arcgis') on your server can start Workbench as described and can use a web connection to where Survey123 feature services live. Don't proceed past this point until you have successfully shared a web tool that exercises the web connection, such as the test tool described in the blog.
I'm going to jump ahead now to explain how to avoid what might present as a silent failure of your webhook. Survey123 (i.e. https://survey123.arcgis.com) will be calling your server with a POST request, so it needs to be trusted. Below is a screen grab of my browser while I submit a survey, with the debugger pane open (Firefox). I have highlighted the POST request sent to my server - it worked. If your server rejects a request from Survey123 you'll see error messages talking about CORS - cross origin resource sharing.
For me this setting wasn't enough; I also had to change HTTP response header settings in IIS. On your server, open IIS at the HTTP Response Headers control, double-click to open it, and Add what you see below.
Now the server is set up, let's get to configuring the data flow. If you have sharp eyes you'll see my test survey is about a caviar sandwich; if you research the coordinates, it's entirely possible you could actually buy one there! My real point, though, is that Survey123 is sending a bunch of JSON, and it can be very complex for a big survey. If you look at other integration platforms you'll find that, while they try to look like no-code approaches, you might be forced to parse JSON mid-stream to get the data you want. However, Survey123 always writes to a feature service, which Data Interoperability natively understands, so we have the luxury of forgetting about the JSON flying around and going directly to the feature service. Fantastic!
First though, create your survey. Here is mine:
When a survey is submitted, a feature is written to its feature service and then any configured webhook is triggered; you can see the applyEdits POST in my browser debug view above, ahead of the webhook POST. To help the integration geoprocessing service find new records, my survey contains a hidden, required question 'Integrated' with a default value of 'N', which my integration edits to 'Y'. I created my survey in the web designer, which doesn't support hidden questions, then edited it in Survey123 Connect to add the question. Here is how it looks when edited:
And when saved to the survey:
You have to update and republish the survey to apply the schema change.
So now my survey has a field I can use as a processing selection flag. I created a couple of records so I could design the integration workspace, and here it is:
It's also in the blog download. It doesn't do any actual integration (that's your job), but you can see the pattern: read records where Integrated = 'N', do your stuff, then update Integrated to 'Y' and write back to the service with GlobalID as the key field. Make sure your tool source is embedded.
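The same select-process-flag pattern can be sketched against the feature service REST API directly, which may help make the workspace's logic concrete. The layer URL, field names and token handling below are assumptions:

```python
# Hypothetical sketch: flip the processing flag on handled surveys,
# keyed on GlobalID, via the layer's applyEdits endpoint.
import json
import urllib.parse
import urllib.request

def build_flag_updates(features, flag_field="Integrated"):
    """One update per processed feature: set the flag to 'Y'."""
    return [{"attributes": {"globalid": f["attributes"]["globalid"],
                            flag_field: "Y"}} for f in features]

def apply_updates(layer_url, updates, token):
    """POST the updates to <layer_url>/applyEdits (network step)."""
    body = urllib.parse.urlencode({
        "updates": json.dumps(updates),
        "useGlobalIds": "true",
        "f": "json",
        "token": token,
    }).encode()
    with urllib.request.urlopen(f"{layer_url}/applyEdits", data=body) as resp:
        return json.load(resp)
```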
If you have a large number of collaborators in your survey and by chance two surveys are submitted simultaneously, it doesn't matter if one execution processes two surveys and another none. If you're worried about the latency of surveys writing to the feature service, begin your process with a Creator-Decelerator-FeatureReader combination.
I manually tested the processing before publishing a web tool by submitting surveys and running the tool in edit mode. At that point I'm ready to share the web tool: run the tool from the toolbox to create a History item, then share it. You'll notice the service is to be public; if that can't be done at your shop, take Survey123 onto your enterprise infrastructure.
Make the tool synchronous to start. If you find your available instances can't keep up, move to asynchronous.
My tool has no parameters.
When it completes sharing, go to the service URL and copy the value, here is mine:
Talk about a stealth enhancement: I thought this would take a while, but it arrived this morning. The recently published ArcGISOnlineConnector now handles on-premises portals, so you can automate portal item management with Spatial ETL tools or FME. Fantastic!
Here is how to set it up. First create a portal connection app with the workflow described in this blog. Record the App ID and App Secret.
Then open Workbench and the FME Tools dialog at the Web Connections view. At bottom right you'll see the Manage Services button:
Click on Manage Services and you'll see this dialog.
Click on the + pulldown at bottom left and choose Create From...
Then Esri ArcGIS Portal (Template)...
Then fill in the details for your portal and you'll have a new connection type.
Authenticate a connection of the new type and you'll have a web connection you can use with ArcGISOnlineConnector to connect with your portal.
ArcGISPortalConnector is an alias for ArcGISOnlineConnector in Workbench. Enjoy!
This post is about leveraging ArcGIS Enterprise 10.8.1 with ArcGIS Data Interoperability extension to share Spatial ETL web tools within your organization. The 'advanced' aspect is we'll be moving data using a web connection to a web platform enabled by an FME package, plus we'll throw in a webhook notification step for fun.
In my on-demand technical workshop for UC 2020 I demonstrated how to create a web tool that performs a format conversion by web service, a classic use case for Data Interoperability combined with the sharing power of ArcGIS Enterprise. Data Interoperability however has moved on from being a format foundry to being a generic solution for no-code integration across the web, so in this post I will show how to build a web tool that takes non-native data through to a web platform. I happen to be reading Microsoft Accesstable XY event data and writing to ArcGIS Online hosted features, but read between the lines and understand the data could go to any cloud platform or app equally easily. In some ways moving data to the web is a better fit for Spatial ETL tools as FME readers and writers work with containers or workspaces, for example a geodatabase, which is not a valid geoprocessing service output parameter that can be returned to a client. It can be done, for example by zipping file geodatabase outputs, but if you have a web platform available, then use it, data is more useful on the web than as local files.
Web tools are the new name for geoprocessing services. The original workflow for publishing geoprocessing services using a history item and an administrative connection to a standalone server is still available in ArcGIS Pro 2.6, but I'll show the new paradigm of sharing a web tool to a portal. You share a Spatial ETL tool as a web tool from a history item just like any other tool, but with two special considerations:
The Spatial ETL source FME workspace must be embedded, and not external to the tool
Input parameters should be defined as simple types like Filename (Existing), Text or Choice
The first condition helps with packaging the tool at publication time; the second keeps the final published tool's parameter behavior simple. You may have to refactor your ETL tool at authoring time to achieve these simplifications. For example, here I'm showing before and after versions of a Microsoft Access input parameter: the default configuration allows picking multiple files with a large choice of file type extensions, while my desired behavior is a single file with only .mdb and .accdb extensions.
Default Access Database Reader Parameter
Modified Access Database Reader Parameter
Conceptually and graphically my ETL tool is simple. My scenario is I'm working for Fako Mining who do borehole planning with an Access app and record borehole locations as latitude & longitude in a table. Click to expand the screen capture below, or edit the tool extracted from the blog attachment. The data flow is:
Read an Access database table that has XY value fields
Write (or replace) a hosted point feature layer in ArcGIS Online
Trigger a webhook that notifies a Microsoft Teams channel the new data is available
There are three bookmarks in the workspace: in the blue one any existing feature service is deleted (I'm creating services named after the processing date); in the green one the Access data is read and converted to a point feature class in a zipped file geodatabase; and in the tan one the file geodatabase is published and shared to a group, and lastly the webhook is sent. The concept is that Online members are members of a Teams channel and will be notified the data is available. While you're inspecting the workspace, have a think about how much code this would take if you scripted it...
The key function in this is the ArcGISOnlineConnector transformer that does the work of publishing and sharing the zipped file geodatabase. It is an FME Hub download and needs to be provisioned on the server, plus it uses a web connection that also needs to be provisioned on the server.
First, set up your server. Install and license ArcGIS Data Interoperability for Server, make sure your server is federated to your portal, generate and import certificates for portal and server, and so on. Do not install ArcGIS Data Interoperability for Pro on your server; you will get license and tool failures.
Once you have Data Interoperability installed, log onto the server as the ArcGIS Service local account user (by default named 'arcgis') and confirm you can open the Workbench app at this path:
While logged onto the server as arcgis, create a share with the below path and give full control to the domain user (presumably you) who will be publishing the web tool and owns the web connection used in the ArcGISOnlineConnector.
From the authoring machine and as the publishing author, open the share. Here is how it looks on my laptop:
Never mind that the files in your share are different from mine; they will soon align.
Now we're going to provision the FME Package for ArcGISOnlineConnector, and the web connection used by it, to the server. Download the current release of the package to the share. Still on the authoring machine, open Workbench and then the FME Tools dialog Default Paths view. Note (write down!) the profile directory for Data Path; it is the folder where web connections are stored. You will temporarily change it next.
One way web and database connections are shared is for the owner to copy the files fme_connections.data and fme_publicKey.jceks from the Data Path to a shared folder (in our case the share above, which is also the default location where the arcgis user will store web connections), redirect the Data Path to the new location, open the Web Connections view of FME Tools, and select the web connections to be made public. Having done that, revert the Data Path to the original default for the authoring user. The arcgis server user must have write access to the copied data. If your server is not on your network, it may be easier to use the export/import methodology for credential sharing.
Now, as the arcgis user logged onto the server, open Workbench from the path given above and drag the file safe.esri-agol-1.1.4.fpkg (the version may change) from the share at C:\Users\arcgis\AppData\Roaming\Safe Software\FME into the canvas to install it. Check that FME Tools > Web Connections sees the web connection to ArcGIS Online. Create a test workspace that makes a web connection and uses an ArcGISOnlineConnector to list the contents of a content folder; the blog download has an example, TestWebConnection.fmw. If everything works, the server is ready for action!
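If you want an FME-free sanity check of the credentials themselves, a minimal stand-in for that connection test is listing a user's content via the ArcGIS Sharing API. The username, token, and portal URL below are placeholders:

```python
# Hypothetical sketch: list the items in an ArcGIS Online user's root
# folder via the Sharing API, as a quick credentials check.
import json
import urllib.parse
import urllib.request

def user_content_url(username, token, portal="https://www.arcgis.com"):
    """Build the Sharing API URL that lists a user's content."""
    query = urllib.parse.urlencode({"f": "json", "token": token})
    return f"{portal}/sharing/rest/content/users/{username}?{query}"

def list_items(username, token):
    """Return the titles of items in the user's root folder (network step)."""
    with urllib.request.urlopen(user_content_url(username, token)) as resp:
        return [item["title"] for item in json.load(resp).get("items", [])]
```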
Run your Spatial ETL tool as a tool so that a history item is created. Then right click on the history item and share it as a web tool. Here is my experience:
Allow uploads, make it asynchronous, and set the Info message level; this will display workspace processing messages.
Don't forget to edit the tool parameter(s) - click the pen icon. You may have to do this twice to make it 'stick'.
It will analyze with a warning; this is normal for Spatial ETL tools.
After the web tool publishes, open the Catalog pane at the portal contents and run your tool. Mine throws a warning to do with coordinate system handling, but I'll take the win. By design there are no features displayed as output, as there is no writer in the workspace; my data went to ArcGIS Online courtesy of the ArcGISOnlineConnector. The log shows intermediate processing worked as expected.
Teams shows me the webhook triggered (the last part of the web tool processing), and it gives me a service URL, so something is there... Take a look at the HTTPCaller transformer to see how I got an image into the webhook payload (it's a public image item in Online).
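For readers who haven't built a Teams incoming webhook before, the kind of payload the HTTPCaller posts can be sketched as a simple MessageCard. The webhook URL, service URL and image URL below are placeholders, and the card layout is an assumption rather than the exact one used in the workspace:

```python
# Hypothetical sketch: post a MessageCard to a Teams incoming webhook,
# linking the new service URL and embedding an image.
import json
import urllib.request

def teams_card(service_url, image_url):
    """Build a simple MessageCard payload for a Teams incoming webhook."""
    return {
        "@type": "MessageCard",
        "@context": "https://schema.org/extensions",
        "summary": "New hosted feature layer published",
        "sections": [{
            "activityTitle": "New hosted feature layer is available",
            "text": f"Service URL: {service_url}",
            "images": [{"image": image_url}],
        }],
    }

def notify(webhook_url, payload):
    """POST the card to the webhook URL (network step)."""
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return resp.status
```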
Bingo, my data loads into my map!
So that is the round trip, I have a web tool anyone in my organization can use to publish hosted feature layers from an Access database table, with the added touch of automated notifications.
Layers : 0 (or whichever layer you want to download)
returnAttachments : false
async : true
syncModel : none
DataFormat : GPKG
Then 'Create Replica' will initiate the creation of the GeoPackage download file and provide a Status URL at the bottom of the page.
Click the Status URL; once the processing is complete, you will receive the Results URL, as below.
Finally, click the Results URL and the file will download locally. Then open it up wherever GeoPackages are supported.
From ArcGIS Pro, just Add Data and navigate to the downloaded file.
Of course, given these are API calls, these steps can be set up programmatically as part of a regular ETL workflow, e.g. scheduling an ArcGIS Notebook to run and use the ArcGIS API for Python or scheduling a job to run using a workbench created in the no-code ArcGIS Data Interoperability Extension and use the HTTPCaller transformer.
Here's the example GeoPackage download REST API call:
If the item is public, just substitute the relevant parts of your item's Service URL. If the item is private, you'll also need to generate and append a token at the end of this string.
If you are not a developer, this URL won't work just by putting it in a browser, because it requires an HTTP POST method and browsers issue GET requests by default, so you'll need a tool like Postman.
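The same POST can of course be scripted. A minimal stdlib-only Python sketch, with the service URL as a placeholder and the parameter set mirroring the list above, might be:

```python
# Hypothetical sketch: request a GeoPackage replica from a feature
# service's createReplica endpoint. Service URL is a placeholder.
import json
import urllib.parse
import urllib.request

def replica_params(layers="0", data_format="GPKG", token=None):
    """Assemble the POST body for a GeoPackage createReplica request."""
    params = {
        "layers": layers,
        "returnAttachments": "false",
        "async": "true",
        "syncModel": "none",
        "dataFormat": data_format,
        "f": "json",
    }
    if token:  # required for private items
        params["token"] = token
    return urllib.parse.urlencode(params)

def create_replica(service_url, **kwargs):
    """POST to <service_url>/createReplica; returns the status JSON (network step)."""
    body = replica_params(**kwargs).encode()
    with urllib.request.urlopen(f"{service_url}/createReplica", data=body) as resp:
        return json.load(resp)
```

The returned JSON carries the status URL, which you would then poll until the results URL appears, as described above.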
The 2020 Esri User Conference has many sessions on the ArcGIS Open Platform. We want to make it easy for you to work successfully in a heterogeneous environment. To create an open and interoperable system, Esri has adopted a multifaceted approach, including support of: standards; direct integrations with non-GIS technology; direct read & write of hundreds of data formats; open developers’ tools; ETL tools; metadata; open source; open data sharing; and SDI.
Please visit us at the Open Platform: Standards and Interoperability Esri Expo Area.
Data Interoperability (or FME) users whose organizations secure access to ArcGIS Online using Okta need to do a little work to make FME Workbench web connections to their Online organization.
Okta is one of many identity providers available for securing ArcGIS Online sign-on; you can check Okta and other options at this site. If you attempt the usual web connection creation experience, however, you'll be rejected at the very last step with an error from OAuth2. This post is about making it work.
Authenticate using the Okta option, then go to your Content pane and add an App item.
Choose 'An application'.
Make sure you select Application app type and give it a name and some tags:
Then go to the item Settings tab, scroll down to the Registered Info details and choose to Update the details, and add a Redirect URI with the value http://localhost. Your details will look like this (App ID will differ):
Copy the App ID value and App Secret values into a text editor. You need to Show Secret to expose the value.
Now start Workbench from the Analysis ribbon in ArcGIS Pro and under the Tools menu select FME Options and activate Web Connections:
Click on Manage Services at bottom right and when the dialog appears click the small pull-down in the Add Web Service control and choose Create From > Esri ArcGIS Online.
Give the service a meaningful name and description, enter the App ID retrieved earlier into the Client ID value and enter the App Secret retrieved earlier into the Client Secret. Replace the Redirect URI value with http://localhost.
Use the Test control to bring up the authentication dialog and choose the Okta option:
My org uses multi-factor authentication so I am invited to send a push (you may not see this):
Then save your changes and Close the Manage Web Services dialog:
Now back in the FME Options > Web Connections dialog click the Add Connection control (the + sign bottom left)...
...and your shiny new Okta authenticated web service option for ArcGIS Online is available to make a connection:
Scroll down to the service name, authenticate and you're done!