Making ArcGIS Hub downloads resilient and reliable

11-06-2019 11:08 AM
AndrewTurner
Esri Contributor

Recent ArcGIS Hub issues with reliable downloads

Since we launched ArcGIS Hub as Open Data in 2014, organizations and communities around the world have come to rely on Hub as their platform for sharing authoritative data. We continually add and expand features that make these data more accessible and useful to everyone. Beyond these features, we know that reliable downloads are a requirement we must maintain.

Over the past few months, there have been several instances of limited download availability. We apologize for those unforeseen issues and are working diligently to improve our platform, processes, and communications so that you can be confident in your choice of ArcGIS Hub as your authoritative data sharing system.

To support your work ensuring that your servers are effectively configured and available, I want to share important details with you about how the Hub data sharing and download system works, what recently occurred, and what we are doing to improve the system.

How Hub Downloads work

ArcGIS Hub provides an integrated, easy-to-use, and automatic system for sharing data in a variety of common, open formats. The ArcGIS platform is composed of individual servers hosted both on-premises and in cloud services, each with slightly different server versions and infrastructure configurations, and each serving data that varies greatly in attribute and geometry complexity.

Our goal is to provide automatic, up-to-date downloads, regardless of these variations in backend services. To accomplish this, the Hub technology uses ArcGIS Online Items and Groups to create a metadata registry of content. When you specify that data should be made available for download, ArcGIS Hub creates a cache of the data by crawling the server and publishing the various file formats. When you first register a service, we pre-cache the data so that it’s available immediately for data users. When someone requests the download, we verify that the currently cached version is up-to-date with the server, and if so, we send them the file. If the data is out of date, we generate a new download file for them on demand. We have also added daily checks of data caches against your servers, and we preemptively update stale caches for the next users.
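The request-time check and the daily refresh described above can be sketched roughly as follows. This is only an illustration of the general pattern, not Hub's actual implementation: the class names, the `get_last_edit` and `export` callables, and the use of a last-edit timestamp as the freshness signal are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CachedDownload:
    """One cached export file plus the source edit time it was built from."""
    payload: bytes
    source_last_edit: int  # assumed freshness signal: server's last-edit timestamp


@dataclass
class DownloadCache:
    # Hypothetical callables standing in for requests to the source server.
    get_last_edit: Callable[[str], int]
    export: Callable[[str], bytes]
    entries: Dict[str, CachedDownload] = field(default_factory=dict)

    def is_fresh(self, dataset_id: str) -> bool:
        """True when a cache exists and is at least as new as the server data."""
        entry = self.entries.get(dataset_id)
        if entry is None:
            return False
        return entry.source_last_edit >= self.get_last_edit(dataset_id)

    def refresh(self, dataset_id: str) -> CachedDownload:
        """Rebuild the cached file from the server (also used for pre-caching)."""
        # Read the edit time before exporting, so an edit that lands during
        # the export makes the cache look stale rather than falsely fresh.
        last_edit = self.get_last_edit(dataset_id)
        entry = CachedDownload(self.export(dataset_id), last_edit)
        self.entries[dataset_id] = entry
        return entry

    def download(self, dataset_id: str) -> bytes:
        """Serve the cached file if it is current, otherwise regenerate it."""
        if self.is_fresh(dataset_id):
            return self.entries[dataset_id].payload
        return self.refresh(dataset_id).payload

    def periodic_check(self) -> List[str]:
        """Refresh every stale cache ahead of the next request; return their ids."""
        stale = [d for d in self.entries if not self.is_fresh(d)]
        for dataset_id in stale:
            self.refresh(dataset_id)
        return stale
```

In this sketch, `refresh` would run once when a service is first registered, `download` would run on each user request, and `periodic_check` would run from a daily scheduled job.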

Sometimes there are issues when creating downloads from these servers. The causes range from system outages of on-premises servers to misconfigured layers that cause queries to fail. We provide administrative error messages and warnings, and you will soon see new and more actionable versions of these messages in Hub. We are also developing more options for you to schedule these cache updates based on your publishing processes.

Recent download issues

Along with making sure the data you share is readily available, we also need to ensure that data is no longer accessible once it is no longer shared. Recently we changed how we remove cached download files. Unfortunately, an error in the application logic for removing invalid download caches sometimes caused valid download caches to be deleted as well.

We receive system alerts immediately, 24 hours a day, 7 days a week. However, by the time the issue was identified and resolved, we needed to recreate the download files for many datasets.

Plan for more reliable and trusted downloads

Our goal is to provide you and your users confidence in accessing and downloading data. To reinforce this, we are making several changes immediately to how we build, deliver, and report on downloads in Hub.

We have organized a development team focused on integrated systems operation and product capabilities that include downloads. This team of engineers is responsible for both the logic that creates downloads and the infrastructure that delivers them, so that the two are comprehensive and integrated.

Another useful tool we will provide is a public dashboard of the current and historical status of Hub downloads as well as other systems. This dashboard will give you a view of the current operating conditions as well as any potential issues and their expected resolutions. We are working with the ArcGIS Online team to also incorporate this information into the Online status page.

This new status dashboard will provide improved, streamlined, real-time measurement of download operations, helping us identify and resolve emerging issues more quickly. A summary of these alerts will be published to the dashboard so you can stay updated on the current status. Additionally, we plan to integrate necessary alerts into the Hub user interface so that data users and admins can also be alerted to potential download issues as they attempt to access data. At first, this will include system issues, but we also intend to allow you to share the current status of your own shared systems through these alerts. As stated before, sometimes downloads are unavailable due to your server infrastructure or external issues.

To support your work ensuring that your servers are effectively configured and available, we are investigating tools that you will be able to use for your own configuration review and monitoring. Publishing open data is straightforward but does require some consideration on how to effectively scale and maintain the underlying authoritative services.

As always, we appreciate any feedback or ideas.

Thank you,

Andrew

About the Author
CTO, Esri R&D DC. Systems Architect for ArcGIS Hub. I joined Esri in 2012 as the CTO from GeoIQ. We built GeoCommons and are integrating the web capabilities, standards, analytics, and open data into the ArcGIS Platform. I wrote "Introduction to Neogeography" for O'Reilly in 2006, highlighting the rise in ability for anyone to make personal maps telling stories of their lives, investigate location history and answer complex questions. In 2010 I co-founded CrisisCommons, a global network of volunteer technologists to rapidly assist in a crisis through data creation, tool development and technology assistance. I am a charter member of the OSGeo Foundation, a member of the Humanitarian OpenStreetMap Team and the OpenStreetMap Foundation. In my 'spare' time I brew beer, play trombone, and practice 14th-century German longsword.