
Building a Data Driven Organization, Part #12: Simplify Sharing Data Interoperability ETL

02-04-2022 11:00 AM
BruceHarold
Esri Regular Contributor

If your team's job requires repeated custom data ingest, you'll be interested in making it a one-step process - the proverbial Easy Button.

The best way to share an ETL integration is to make the destination dataset a feature service - feature services are ideal for sharing, and of course you can make web tools in Enterprise.  However, sometimes those options aren't available: you need a simple way to ingest a dataset just for you, your colleagues, or like-minded people who are not in your organization or even known to you.  You just want anyone doing the same job to make data the same way.  Your data may be sensitive or time bound, or you may be off any network.  You just want data loaded into a geodatabase in one step.

A couple of teams here at Esri came to me recently with this problem, and in both cases the solution was the same - ArcGIS Data Interoperability custom formats.  Data Interoperability comes with an 'easy button' called the Quick Import geoprocessing tool.  This tool lets you pick any non-raster format from its gallery of 500 or so and convert the data to a file geodatabase.  To add new formats, all you need to do is create them! Relax, it's a no-code experience.  Then you can share them with anyone.  In my colleagues' cases they were dealing with special 'dialects' of XML and CSV that needed specific logic to create features correctly.  Custom formats are built on top of existing formats (e.g. XML, CSV and hundreds more) and encapsulate all the logic to create feature classes or tables however you need.
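Custom formats themselves are built in Workbench, but the idea they package - dialect-specific parsing hidden behind one call so the consumer never sees the quirks - can be sketched in plain Python. Everything below (the semicolon-delimited dialect, the STN/TEMP field names) is invented for illustration and is not the actual logic inside any Esri format:

```python
import csv
import io

def read_weather_dialect(text):
    """Parse a hypothetical semicolon-delimited CSV 'dialect' into
    clean records, hiding the quirks (padded codes, numbers stored
    as text) from the caller - the same job a custom format does."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    records = []
    for row in reader:
        records.append({
            "station": row["STN"].strip(),   # source pads station codes
            "temp_c": float(row["TEMP"]),    # source stores numbers as text
        })
    return records

sample = "STN;TEMP\nBOS ;12.5\nLAX ;21.0\n"
print(read_weather_dialect(sample))
```

The consumer just calls one function and gets uniform records, which is exactly the experience Quick Import gives a custom format's user.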

Data like the current US weather can be queried at a location:

National Weather Service Report

 

Anyone can do this with Quick Import if the weather site response is made a custom format.  I did this, so the Quick Import input dataset format gallery lets me pick the US Point Location Weather Forecast format and set its parameters (any address, location or POI in the US will work; under the covers it uses the World Geocode Service):

US Weather Format

 

Weather Format Properties

 

Then a Forecast feature is available to map and query - the green hexagon is a weather feature:

Boston Weather Feature

Spoiler
This is not relevant to the story of custom formats, but the feature is a hexagon because I used an Uber H3 encoding of the XY value geocoded from the input.  This complies with terms for non-storage use of the geocode service - all I know is the geocode XY is somewhere inside the hexagon.  The hexagon size (i.e. H3 index scale) also agrees with the coverage NOAA's NWS service returns for a point forecast.
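For readers curious what that generalization buys: snapping a precise coordinate to a coarse cell means many nearby points share one index, so the original geocoded XY is never stored. The snippet below uses a crude square grid rather than real H3 hexagons (for the real thing, use the h3 library); it is only meant to illustrate the idea:

```python
def to_cell(lat, lon, cell_deg=0.05):
    """Snap a coordinate to the lower-left corner of a coarse grid
    cell. A stand-in for H3 indexing: every point in a cell maps to
    the same index, so the exact input XY is not recoverable."""
    return (round(lat // cell_deg * cell_deg, 6),
            round(lon // cell_deg * cell_deg, 6))

# Two nearby Boston-area points fall in the same cell
a = to_cell(42.3601, -71.0589)
b = to_cell(42.3617, -71.0700)
```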

So how do you make a custom format?  All that is required is a normal standalone Workbench document (.fmw) that satisfies two conditions:

  • The workspace must have an input dataset
  • The workspace must write to a file geodatabase

Once you have configured your workspace and verified it creates data correctly, the File menu in Workbench has an Export as Custom Format choice.  This saves the workspace as a format definition in a default location in your profile directory, with a short name and long description you can see and search on in the formats gallery.  The format definition has the file extension '.fds' but remains editable with Workbench.  These fds files can be shared with anyone, either by copying them into their profile directory or by putting them on a file share they can see - the Tools > FME Options > Default Paths dialog lets you specify Shared FME Folders where your team can share fds files and other resources like credentials.

Here is my profile directory with a few custom formats in it:

C:\Users\<username>\Documents\FME\Formats

Two of these files are in the blog download, NWSFORECAST.fds and CASOLAR.fds - the file names come from the short name you give the format at export time.  If you copy them into your C:\Users\<username>\Documents\FME\Formats folder (make it if it isn't there already) you'll have two new formats!
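If you'd rather script that copy step, a minimal sketch follows; the source paths in the commented example are placeholders for wherever you saved the blog download:

```python
import os
import shutil

def install_custom_formats(fds_files, formats_dir):
    """Copy .fds custom format files into an FME Formats folder,
    creating the folder first if it does not exist yet."""
    os.makedirs(formats_dir, exist_ok=True)
    installed = []
    for path in fds_files:
        if not path.lower().endswith(".fds"):
            continue  # only format definitions belong in this folder
        dest = os.path.join(formats_dir, os.path.basename(path))
        shutil.copy2(path, dest)
        installed.append(dest)
    return installed

# Example (Windows profile path as in the post):
# install_custom_formats(
#     [r"C:\Downloads\NWSFORECAST.fds", r"C:\Downloads\CASOLAR.fds"],
#     os.path.expanduser(r"~\Documents\FME\Formats"))
```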

NWSFORECAST is the weather forecast format and CASOLAR is California Distributed Generation Interconnected Project Sites data - a tabular monthly digest of grid-connected electrical generation installations in California, with many fields describing generation projects (mostly solar).

Here is the data at time of writing, summarized as total system kilowatts per ZIP code and extruded at 1 kW = 1 m:

California Generation kW November 2021

If you want your own copy look for this format:

California Generation Format

I chose to make these example custom formats from underlying XML for the weather format and CSV for the power generation format, partly because XML and CSV (along with JSON) are frequently encountered, but also because in this case they are delivered via HTTP, not a dataset reader, and I wanted to show you how to do that.  To meet the condition that a custom format workspace must have an input dataset, I used a special format called NULL, which as the name suggests does nothing.
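To give a flavor of the 'specific logic' such a format encapsulates, here is a plain-Python parse of a made-up, forecast-style XML fragment into rows ready for a geodatabase table. The element names are invented for illustration and are not the actual NWS schema:

```python
import xml.etree.ElementTree as ET

# Hypothetical simplified fragment, not the real NWS response
FRAGMENT = """
<forecast>
  <point lat="42.36" lon="-71.06"/>
  <period name="Tonight"><temp unit="F">38</temp></period>
  <period name="Friday"><temp unit="F">51</temp></period>
</forecast>
"""

def parse_forecast(xml_text):
    """Flatten the XML dialect into one row per forecast period,
    carrying the point location onto every row."""
    root = ET.fromstring(xml_text)
    point = root.find("point")
    lat, lon = float(point.get("lat")), float(point.get("lon"))
    rows = []
    for period in root.findall("period"):
        rows.append({
            "name": period.get("name"),
            "temp_f": int(period.find("temp").text),
            "lat": lat,
            "lon": lon,
        })
    return rows
```

In the real format this kind of restructuring is done with no-code Workbench transformers rather than hand-written parsing.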

If you edit the fds files you'll see nothing special.

Weather format workbench

 

Solar projects workbench

If you want to test drive the custom formats you will need the ArcGIS Data Interoperability extension installed for ArcGIS Pro.  I used Pro 2.9 to create the formats and have not tested earlier releases.  Be aware the power generation data takes about 40 minutes to download and process (nearly 1.3M rows), and you will need to find California ZIP code boundaries to map the data.  The weather report format takes a few seconds.

My real message here is that one person with Data Interoperability (or FME) skills can create custom formats for big-button ingest and share the functionality with anyone using the extension.  The person using the format only needs to know about the Quick Import tool and does not need Data Interoperability training.
