
The day the internet died... And how to monitor it

Blog Post created by bszukalski-esristaff Employee on Mar 6, 2017

Some called it "the day the internet died" when a massive failure at a key Amazon East Coast facility caused major disruption to a number of sites, including ArcGIS Online, for several hours.

 

According to Amazon:

The Amazon Simple Storage Service (S3) team was debugging an issue causing the S3 billing system to progress more slowly than expected. At 9:37AM PST, an authorized S3 team member using an established playbook executed a command which was intended to remove a small number of servers for one of the S3 subsystems that is used by the S3 billing process.

 

Unfortunately, one of the inputs to the command was entered incorrectly and a larger set of servers was removed than intended.

For some it was an inconvenience; for others, a reason to rethink their cloud backup plan. ArcGIS Online, including license activation for ArcGIS Pro and other premium apps, was impacted for several hours. Once Amazon rectified the problem, everything quickly came back online.

 

How to monitor ArcGIS Online system health

 

The ArcGIS Online Health Dashboard publishes the latest information on service availability. Here's how it looked during the Amazon outage:

 

Hovering over any icon provides the latest status and information.

 

 

While the event was unusual, you may want to be notified of any future issues. Subscribe to the RSS feed for an individual service, or subscribe to All, to be notified of any service interruptions.
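If you would rather check a feed from a script than from a feed reader, here is a minimal sketch in Python using only the standard library. It assumes the feed is ordinary RSS 2.0; the function names, the sample feed, and the example.com links are made up for illustration and are not taken from the Health Dashboard. Copy the real feed URL from the dashboard and pass it to `fetch_feed`.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Made-up sample in RSS 2.0 shape, used only to illustrate parsing.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example status feed</title>
<item><title>Service disruption</title><link>https://example.com/status/1</link><pubDate>Mon, 01 Jan 2024 12:00:00 GMT</pubDate></item>
<item><title>Service restored</title><link>https://example.com/status/2</link><pubDate>Mon, 01 Jan 2024 16:00:00 GMT</pubDate></item>
</channel></rss>"""


def fetch_feed(url, timeout=10):
    """Download a feed and return its raw bytes (url is supplied by the caller)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def parse_rss_items(xml_data):
    """Return one dict per <item> with its title, link, and pubDate text."""
    root = ET.fromstring(xml_data)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "pubDate": (item.findtext("pubDate") or "").strip(),
        })
    return items


def new_items(items, seen_links):
    """Keep only the items whose link has not been seen before."""
    return [item for item in items if item["link"] not in seen_links]
```

To use it, call `parse_rss_items(fetch_feed(url))` on a schedule, keep the set of links you have already seen, and send yourself an alert for whatever `new_items` returns.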

 

 

More information on service status as well as other service and security considerations can be found at Trust ArcGIS.
