Q: What is ArcGIS Data Interoperability?
A: It’s a product of course, an extension for ArcGIS Desktop and ArcGIS Enterprise, but as a technology it’s Esri’s no-code integration solution. ‘Integration’ meaning connecting ArcGIS to data – for read or write – across a huge variety of formats, feeds and repositories, then manipulating the data as required for your business needs. If you can access it, then Data Interoperability can probably work with it the way you want.
Q: That’s a pretty bold claim – if you can ‘connect’ you can probably ‘use’ – what kind of data sources are we really talking about?
A: Historically it was all about file-based formats like CAD and some proprietary formats used by early GIS software, plus anything text-based – files on disk. Database connectivity was added – network transport. Then an era of XML arrived, along with raster and point cloud support. FTP and HTTP connectivity were added to reach the internet, then JSON, so ‘data in motion’ became tractable, and lately connectors to cloud stores. As products have evolved from desktop through network to web, the technology has adapted to ‘just work’. This all snowballs of course; nothing goes away, and new sources of all types are added as they emerge.
Q: Is Data Interoperability the same as FME?
A: It is except when it isn’t. You’re obviously aware it is FME technology – Safe Software builds Data Interoperability for us – but the two desktop products differ slightly in what functionality is included in a licensed item, and the server products are different. We collaborate with Safe very closely to make sure functionality Esri users are going to need gets into FME technology and hence Data Interoperability. Many users have both products, you can share workspace files between them, and your skills in each are applicable in the other.
Q: If I connect to data can I just use it like a geodatabase item or feature service or other ArcGIS native source in my mapping and geoprocessing?
A: Yes, after writing it to where an Esri app can see it, but accessing raw data is only part of the story. It is rare that data is in exactly the schema you want for your work, and filtering and geometry remediation are commonly needed. This is where the hundreds of data transformation tools come in. Once you connect to data you diagram a stream-based workflow that implements the processing you need, and lastly write your result where you need it.
Q: You said: ‘diagram a stream-based workflow’. Is this with ModelBuilder?
A: Data Interoperability provides an app – Workbench – which delivers a visual programming environment like ModelBuilder. It is very easy to drive: you work on a canvas; add, configure and connect things called readers, transformers and writers; do partial or complete runs like ModelBuilder; and sessions can be persisted as geoprocessing tools, again like ModelBuilder. I want to stress that working in Workbench is like ModelBuilder or creating Python script tools in that you work in the ArcGIS geoprocessing environment. Like ModelBuilder but unlike Python script tools, you aren’t coding, you are diagramming.
Q: I’m handy with Python, does that help?
A: It might, but Python isn’t necessary. Data Interoperability is no-code technology, but code friendly too. Sometimes it saves some diagramming work to use a Python snippet for a function. This is another similarity with ModelBuilder, which has a model tool ‘Calculate Value’ that lets you apply a Python snippet. In both cases it can be a timesaver.
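To give a feel for the kind of snippet meant here, below is a plain-Python sketch of a typical attribute-derivation function. The field names are hypothetical; inside Workbench you would wrap logic like this in a Python-capable transformer rather than run it standalone.

```python
# A small function of the sort that can replace several diagrammed steps:
# deriving one attribute from several others. Field names are illustrative.

def derive_full_address(feature: dict) -> dict:
    """Build a single ADDRESS attribute from its component fields."""
    parts = [feature.get(k, "").strip() for k in ("HOUSE_NUM", "STREET", "CITY")]
    feature["ADDRESS"] = " ".join(p for p in parts if p)
    return feature
```

The point is not the function itself but the trade: three or four configured transformers collapse into a few readable lines when that suits you.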
Q: Data Interoperability is ETL, but I see cloud vendors promoting ELT, do you have comments?
A: Extract, Transform & Load (ETL) is where the transformation of data is done before sending it to a system of record. Extract, Load & Transform (ELT) is where data is first sent to (or already exists in) a system of record and is manipulated within that system, for example by using SQL, or using a view and/or a SQL-like query language in a JSON store technology. While Data Interoperability has first-class manipulation capability, it also has query connectivity within cloud platforms, letting you work on the data where it is rather than hauling it up and down. Don’t fight data gravity; do what works for you. Data Interoperability lets you do this from within ArcGIS as either ETL or ELT.
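As a toy illustration of the two patterns, here is a sketch using an in-memory SQLite database as a stand-in system of record (any store with a query interface behaves the same way); the table and values are invented for the example.

```python
import sqlite3

# Stand-in system of record with some parcel areas by zoning code.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (zone TEXT, area REAL)")
conn.executemany("INSERT INTO parcels VALUES (?, ?)",
                 [("R1", 500.0), ("R1", 750.0), ("C2", 1200.0)])

# ETL: extract the raw rows, transform (aggregate) client-side, then load.
rows = conn.execute("SELECT zone, area FROM parcels").fetchall()
etl_totals = {}
for zone, area in rows:
    etl_totals[zone] = etl_totals.get(zone, 0.0) + area

# ELT: the data stays put; the transformation is pushed down as a query.
elt_totals = dict(conn.execute(
    "SELECT zone, SUM(area) FROM parcels GROUP BY zone").fetchall())

assert etl_totals == elt_totals  # same answer, different data gravity
```

The ETL branch hauls every row across the wire; the ELT branch ships a small query to the data and hauls back only the result, which is the whole argument for not fighting data gravity.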
Q: What is the best learning pathway for Data Interoperability? Is this how you got started?
A: There are online courses at esri.com/training, and a wealth of information at safe.com for FME Desktop, which is almost completely applicable to Data Interoperability, but the most valuable learning pathway is to learn by doing – i.e. making ETL tools – and engaging with the ETL community on GeoNet and at knowledge.safe.com. I picked up Data Interoperability by being involved in migrations of CAD systems to ArcGIS.
Q: I have heard of Data Interop for Server. Can you explain ETL functionality in the server context?
A: Data Interoperability for Server fits within geoprocessing service publication; it is just a specific case of that. There are some details you need to think about – ETL tools that output a workspace, such as a file geodatabase, can’t return it through core geoprocessing (you can zip the geodatabase and return a file parameter), and services should be asynchronous – but otherwise just treat an ETL tool like, say, a Python script tool.
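To make the zip-and-return workaround concrete, here is a minimal standard-library sketch; the folder and file names are stand-ins, and a tool script would pass its real geodatabase path and set the result as the output file parameter.

```python
import os
import tempfile
import zipfile

def zip_workspace(gdb_path: str, zip_path: str) -> str:
    """Zip a file geodatabase folder so a geoprocessing service can
    return it through a single output file parameter."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(gdb_path):
            for name in files:
                full = os.path.join(root, name)
                # Keep paths relative to the .gdb's parent so it unzips cleanly.
                zf.write(full, os.path.relpath(full, os.path.dirname(gdb_path)))
    return zip_path

# Minimal demonstration with a stand-in "geodatabase" folder:
tmp = tempfile.mkdtemp()
gdb = os.path.join(tmp, "result.gdb")
os.makedirs(gdb)
with open(os.path.join(gdb, "a00000001.gdbtable"), "w") as f:
    f.write("stub")
out = zip_workspace(gdb, os.path.join(tmp, "result.zip"))
```

Unzipping the returned file on the client side recreates the `.gdb` folder intact, which is why the archive paths are kept relative to the geodatabase’s parent directory.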
Q: I have heard people use Data Interoperability for supporting standards-based formats and protocols, like GML or WFS or GeoPackage. Can you elaborate a bit on that?
A: Popular formats and protocols often eventually find their way into core ArcGIS, even if originally support for them was through Data Interoperability. However, it is usually the case that a much richer translation experience remains available in Data Interoperability. This is true even for de-facto standards like CSV, Excel and KML.
Q: Can you share any interesting scenarios where Data Interoperability has played an important role?
A: The most impactful outcome – being subjective here, but the pattern is recurring and the number of users very large – is when a system of record can be harvested for authoritative data, the data then ameliorated and used to maintain a hosted feature service in ArcGIS Enterprise or ArcGIS Online, which of course delivers a performant data source to Esri apps. The source system might be FTP, HTTP, WFS, Protocol Buffers, a REST API endpoint – anything. The advantage of this pattern is that you can automate data provision to Esri users with no downtime and without disrupting the system of record.
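In practice the ameliorate step is diagrammed in Workbench, but a rough plain-Python picture may help: below, one GeoJSON feature (as a harvested REST endpoint might return it) is flattened into the kind of clean record you would load into a hosted feature service. All field names and the cleanup rules are hypothetical.

```python
# Flatten one harvested GeoJSON point feature into a clean, flat record.
# Field names and cleanup rules here are invented for illustration.

def ameliorate(geojson_feature: dict) -> dict:
    props = geojson_feature.get("properties", {})
    lon, lat = geojson_feature["geometry"]["coordinates"]
    return {
        "name": (props.get("NAME") or "UNKNOWN").strip().title(),
        "x": round(lon, 6),  # trim noise in coordinate precision
        "y": round(lat, 6),
    }
```

Run on a schedule against the source system, a reshaping step like this keeps the hosted feature service current without anyone touching the system of record.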
Q: I have heard of FME Server. Do you support that?
A: I mentioned that the ArcGIS and FME server products are different – meaning separate – but they have functional overlap. Data Interoperability for ArcGIS Enterprise delivers web ETL in the geoprocessing service framework, a good pattern for things like scaling format translation work, or processing file-based data sources heavily used in your organization – for example, you may process a lot of Excel files. FME Server can be used in the same way but wouldn’t be a good investment compared to licensing Data Interoperability for an existing ArcGIS Enterprise installation. However, FME Server has a rich trigger-action automation capability that can be used for complex integrations amongst multiple systems. If your work is about integrations that do not emit results to a client but perform synchronizations, then FME Server may be indicated. If installed alongside ArcGIS Enterprise, your integrations can include ArcGIS software like ArcPy or the ArcGIS API for Python.
Q: What is in the roadmap for Data Interoperability?
A: There is always a long list of format and transformation functionality in the pipeline, but a couple of things stand out which are close to release, so hopefully will be out by the time this Q&A goes out the door. The first is the maturing of tools to interact with big data. An example I intend to blog about is retrieving large compressed CSV data (that changes daily) from the web, then using GeoAnalytics Desktop to do some parallelized Spark processing in Pro, then sending the results as a Parquet file to a cloud platform where it can be queried by anyone – all as an ETL toolset. Automating this so it happens daily on a schedule will drive the point home. A second development we expect to be a crowd-pleaser is the ability to manage ArcGIS Online items within an ETL workflow – that means the upload, overwrite, download or deletion of the many Online item types you can create with Data Interoperability. This will close the loop on sharing ETL processing on Esri’s public cloud.
Q: Can you talk about more futures – not just Data Interoperability, but generally in the areas of ETL, app integration and such?
A: Nothing is going away, so file-based and network-based data sources will continue to exist, but for some time now and increasingly, data is moved around in web formats like JSON, and we are seeing trends in services with protocol buffer payloads, and formats optimized for moving big data around, like Parquet. Interacting with REST APIs is almost old hat now and made simple with Data Interoperability. You can take it for granted that formats or protocols that have traction in industry will be supported by Data Interoperability. Focus on your work and tell us what you need, the future will be built for you.
Q: Where can I get more information about Data Interoperability in particular, and ETL related topics in general?
A: Your Esri representative will be pleased to advise you on Data Interoperability licensing. ETL is a big topic and a well-documented industry discipline, but to really bootstrap your project consider a professional services engagement with your local Esri representative or partner. For background material, search on GeoNet within the Open Platform, Standards and Interoperability space.