High availability (HA) environments for ArcGIS are becoming ingrained in the critical business operations and workflows of your organization. Defining a service level agreement (SLA) identifies your organization's required percentage of service up-time and helps guide you toward an HA solution that satisfies your organization's expectations.
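To make an up-time percentage concrete, it helps to convert it into the amount of downtime it permits. The following is a minimal sketch of that arithmetic; the function name and the default one-year period are assumptions for illustration and are not taken from the presentation.

```python
def allowed_downtime_hours(sla_percent: float, period_hours: float = 365 * 24) -> float:
    """Return the downtime (in hours) that an up-time target permits.

    sla_percent  -- required up-time, e.g. 99.9 for "three nines"
    period_hours -- length of the measurement period (assumed: one 365-day year)
    """
    if not 0 <= sla_percent <= 100:
        raise ValueError("sla_percent must be between 0 and 100")
    # Downtime is the share of the period not covered by the up-time target.
    return period_hours * (100 - sla_percent) / 100


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        hours = allowed_downtime_hours(target)
        print(f"{target}% up-time allows about {hours:.2f} hours of downtime per year")
```

For example, a 99.9% target over one year leaves roughly 8.76 hours of downtime, which is a useful number to hold against planned maintenance windows when agreeing on an SLA.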
Our spotlight presentation, "Considerations for a Highly Available Enterprise", at Esri's 2018 User's Conference identified the following approaches to consider when designing a highly available system.
Redundancy can be accomplished through duplication and load balancing. Duplicating instances reduces the number of single points of failure, while load balancing distributes client requests across multiple system components.
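The combination of duplication and load balancing can be sketched in a few lines. This is only an illustration of the round-robin technique with unhealthy nodes skipped, not how any particular load balancer or ArcGIS component is implemented; the class name and the node names are hypothetical.

```python
class RoundRobinBalancer:
    """Hand out duplicated nodes in turn, skipping any marked as down."""

    def __init__(self, nodes):
        if not nodes:
            raise ValueError("at least one node is required")
        self._nodes = list(nodes)
        self._healthy = set(self._nodes)
        self._index = 0

    def mark_down(self, node):
        # Called when a health check fails for this node.
        self._healthy.discard(node)

    def mark_up(self, node):
        # Called when a previously failed node recovers.
        if node in self._nodes:
            self._healthy.add(node)

    def next_node(self):
        # Try each node at most once, starting from the current position.
        for _ in range(len(self._nodes)):
            node = self._nodes[self._index]
            self._index = (self._index + 1) % len(self._nodes)
            if node in self._healthy:
                return node
        raise RuntimeError("no healthy nodes available")


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["gis-a", "gis-b"])  # hypothetical machine names
    print([balancer.next_node() for _ in range(3)])
    balancer.mark_down("gis-b")
    print(balancer.next_node())
```

With two duplicated nodes, requests alternate between them; once one is marked down, every request goes to the survivor, which is the behavior that removes the single point of failure.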
System Operational Plans
Test plans should cover the systems and all applications that feed into those systems. Testing should not be a one-time task: test the apps and systems before going live, and then again on a predefined schedule. Having these test plans in place and recording the test results will help you keep tabs on your systems over their life cycle. Operational plans can include, but are not limited to, stress testing, performance testing, and testing of fail-over functions and activities.
Prevention is certainly better than the cure, and that applies to systems too! Monitoring system health to identify and proactively address problems is key to maintaining a highly available system. System monitoring tools are available from various sources, including Esri. The more systems you have to manage, the greater the need for a monitoring tool. Use the monitoring tool to track metrics such as CPU usage, memory usage, response time, and service throughput. Make sure you can configure it to execute a job, such as notifying you when a system metric crosses a threshold.
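The threshold-and-notify idea can be sketched as follows. This does not represent the configuration or API of any specific monitoring product; the metric names, limits, and the `notify` callback are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Threshold:
    metric: str   # name of the metric in the sample, e.g. "cpu_percent" (hypothetical)
    limit: float  # alert when the sampled value is above this limit


def check_thresholds(
    sample: Dict[str, float],
    thresholds: List[Threshold],
    notify: Callable[[str], None],
) -> List[str]:
    """Call notify() for every metric above its limit and return those metric names."""
    breached = []
    for threshold in thresholds:
        value = sample.get(threshold.metric)
        # Metrics missing from the sample are skipped rather than treated as failures.
        if value is not None and value > threshold.limit:
            notify(f"{threshold.metric} is {value}, above the limit of {threshold.limit}")
            breached.append(threshold.metric)
    return breached


if __name__ == "__main__":
    rules = [Threshold("cpu_percent", 90.0), Threshold("response_time_s", 2.0)]
    reading = {"cpu_percent": 95.0, "response_time_s": 0.4}
    check_thresholds(reading, rules, notify=print)
```

In practice the `notify` callback would send an email or open a ticket; passing it in as a parameter keeps the check itself easy to test on a schedule.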
The approaches listed above are just some of the strategies meant to minimize service downtime. Implementing these recommended approaches along with your own organization's strategies will help maximize up-time and provide a reliable, high-performing ArcGIS environment.
With these best practices in mind, you can apply these approaches to your own highly available enterprise. The PDF of this presentation from the 2018 User's Conference is available for download: Considerations for High Availability