Recent ArcGIS Hub issues with reliable downloads
Since we launched ArcGIS Hub as Open Data in 2014, organizations and communities around the world have come to rely on Hub as their platform for sharing authoritative data. We continually add and expand features that make these data more accessible and useful to everyone. Beyond these features, we know that reliable downloads are a basic requirement that we must maintain.
Over the past few months, there have been several instances of limited download availability. We apologize for those unforeseen issues and are diligently working to improve our platform, processes, and communications to give you confidence in your choice of ArcGIS Hub as your authoritative data sharing system.
To support your work ensuring that your servers are effectively configured and available, I want to share important details with you about how the Hub data sharing and download system works, what recently occurred, and what we are doing to improve the system.
How Hub Downloads work
ArcGIS Hub provides an integrated, easy-to-use, and automatic system for sharing data in a variety of common, open formats. The ArcGIS platform is made up of individual servers hosted both on-premises and in cloud services, each with slightly different server versions and infrastructure configurations, and each serving data with widely varying attribute and geometry complexity.
Our goal is to provide automatic, up-to-date downloads, regardless of these variations in backend services. To accomplish this, the Hub technology uses ArcGIS Online Items and Groups that create a metadata registry of content. When you specify that data should be made available for download, ArcGIS Hub creates a cache of the data by crawling the server and publishing the various file formats. When you first register a service, we pre-cache the data so that it’s available immediately for data users. When someone requests the download, we verify that the currently cached version is up-to-date with the server and if so we send them the file. If the data is out of date, we generate a new download file for them on demand. We also added periodic daily checking of data caches against your servers and preemptively update these caches for the next users.
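The request-time check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Hub's actual implementation: the `CachedDownload` record, the `resolve_download` and `build_file` functions, and the use of a last-modified timestamp to decide freshness are all hypothetical names and choices made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass
class CachedDownload:
    """Hypothetical record of one cached download file."""
    file_path: str
    source_modified: datetime  # server's last-edit time when the file was built


def resolve_download(
    cached: Optional[CachedDownload],
    server_modified: datetime,
    generate: Callable[[datetime], CachedDownload],
) -> CachedDownload:
    """Serve the cached file if it is current; otherwise rebuild it on demand."""
    if cached is not None and cached.source_modified >= server_modified:
        return cached
    return generate(server_modified)


def build_file(server_modified: datetime) -> CachedDownload:
    # Stand-in (assumption) for crawling the server and writing the export file.
    return CachedDownload(f"parcels-{server_modified:%Y%m%d}.csv", server_modified)


jan = datetime(2020, 1, 1, tzinfo=timezone.utc)
feb = datetime(2020, 2, 1, tzinfo=timezone.utc)

first = resolve_download(None, jan, build_file)   # nothing cached yet: build
same = resolve_download(first, jan, build_file)   # cache is current: reuse it
fresh = resolve_download(first, feb, build_file)  # server changed: rebuild
```

In this sketch the periodic daily check would simply call `resolve_download` ahead of any user request, so that the next person to ask for the file finds a current cache.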
Sometimes there are issues when creating downloads from these servers. The reasons vary, from system outages of on-premises servers to misconfigured layers that cause queries to fail. We provide administrative error messages and warnings, and you will soon see new and more actionable versions of these messages in Hub. We are also developing more options for you to schedule these cache updates based on your publishing processes.
Recent download issues
Along with making sure the data you share is readily available, we also need to ensure that data is no longer accessible once it is no longer shared. Recently we changed how we remove cached download files. Unfortunately, an error in the application logic for removing invalid download caches sometimes caused valid download caches to be deleted as well.
We receive system alerts immediately, 24 hours a day, 7 days a week. However, by the time the issue was identified and resolved, we needed to recreate the download files for many datasets.
Plan for more reliable and trusted downloads
Our goal is to provide you and your users confidence in accessing and downloading data. To reinforce this, we are making several changes immediately to how we build, deliver, and report on downloads in Hub.
We have organized a development team focused on integrated systems operation and product capabilities that include downloads. This team of engineers is responsible for both the logic that creates downloads and the infrastructure that delivers them, ensuring that the two are comprehensive and integrated.
Another useful tool we will provide is a public dashboard showing the current and historic status of Hub downloads as well as other systems. This dashboard will give you a view of the current operating conditions as well as any potential issues and their expected resolutions. We are also working with the ArcGIS Online team to incorporate this information into the Online status page.
This new status dashboard will provide improved, streamlined, real-time measurement of download operations so that we can identify and resolve emerging issues more quickly. A summary of these alerts will be published to the dashboard so you can stay up to date on the current status. Additionally, we plan to integrate necessary alerts into the Hub user interface so that data users and admins can also be alerted to potential download issues as they attempt to access data. At first, this will include system issues, but we also intend to allow you to share the current status of your own shared systems through these alerts. As stated before, downloads are sometimes unavailable due to your server infrastructure or external issues.
To support your work ensuring that your servers are effectively configured and available, we are investigating tools that you will be able to use for your own configuration review and monitoring. Publishing open data is straightforward but does require some consideration on how to effectively scale and maintain the underlying authoritative services.
As always, we appreciate any feedback or ideas.