ArcGIS Enterprise Fail-over testing

04-13-2020 05:42 PM
VishApte
Esri Contributor

Hi,

We have designed an ArcGIS Enterprise 10.7.1 high-availability solution with Web Adaptors and a third-party load balancer. All servers are VMs. We need to document the ArcGIS Enterprise 10.7.1 fail-over scenarios and test all of them prior to go-live to confirm that fail-over happens correctly. I have struggled to find documentation covering all fail-over scenarios, but here is my list of fail-over test cases:

ArcGIS Portal Fail-over:

Stop the ArcGIS Portal service on the primary node.

or

Shut down the server hosting the primary ArcGIS Portal.

or 

Take the primary ArcGIS Portal machine off the network.

ArcGIS Server (2 machine site) Fail-over:

Stop the ArcGIS Server service on one of the servers.

or

Shut down one of the servers hosting ArcGIS Server.

or

Take one of the machines off the network.

or

Disjoin a machine from a site.

ArcGIS Data Store Fail-over:

Take the machine hosting the primary Data Store off the network.

or

Make the secondary Data Store the primary.

Are the above scenarios correct for fail-over testing? Am I missing any scenarios? There are some scenarios we cannot test, e.g. a disk crash.

Thanks,

Vish

2 Replies
JonathanQuinn
Esri Notable Contributor

Portal's failover revolves around the standby not being able to reach the primary, so any of the methods you described will trigger it.

There is no concept of "failover" for Server, since the machines in a site have no primary or standby roles.

In either case above, the machine or service being unavailable will indicate to your load balancer (as long as you have HTTP health checking configured) or the web adaptor that the component is unhealthy, and it won't send traffic to it.
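To make the health-checking idea concrete, here is a minimal Python sketch of the kind of HTTP probe a load balancer performs. It is only an illustration, not how any particular load balancer or the web adaptor is implemented. The function names `is_healthy` and `healthy_members` are hypothetical, and the URLs you pass in are whatever health check endpoints your own deployment exposes.

```python
import urllib.error
import urllib.request


def is_healthy(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if the URL answers with HTTP 200 within the timeout.

    `opener` is injectable so the probe can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status
        # all count as "unhealthy" for routing purposes.
        return False


def healthy_members(urls, check=is_healthy):
    """Return only the members that currently pass the health check."""
    return [u for u in urls if check(u)]
```

Running `healthy_members` against the health check URL of each machine before and after you stop a service should show the stopped machine dropping out of the list, which is the behaviour you want to confirm on the load balancer as well.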

For Data Store, yes, those scenarios will cause a failover.
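If you want to record how long each failover takes during the tests, one option is to probe the component at a fixed interval and compute the longest outage afterwards. The sketch below is an assumption about how you might do that, not an Esri tool; `longest_outage` is a hypothetical helper and the samples are whatever (timestamp, up/down) pairs your own probe collects.

```python
def longest_outage(samples):
    """Return the longest outage, in seconds, found in the samples.

    `samples` is a time-ordered list of (timestamp_seconds, is_up) pairs.
    An outage runs from the first failed probe to the next successful one.
    An outage that has not recovered by the last sample is not counted.
    """
    longest = 0.0
    down_since = None
    for timestamp, is_up in samples:
        if not is_up and down_since is None:
            down_since = timestamp  # outage begins
        elif is_up and down_since is not None:
            longest = max(longest, timestamp - down_since)
            down_since = None  # outage ends
    return longest


# Example: probes every 5 seconds, with two outages of 10 s and 5 s.
probes = [(0, True), (5, False), (10, False), (15, True), (20, False), (25, True)]
print(longest_outage(probes))
```

The result is only as precise as the probe interval, so a shorter interval gives a tighter estimate of the real failover time.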

VishApte
Esri Contributor

Thanks Jonathan Quinn