portal HA 10.5 issue

AhmadAwada1 · ‎01-22-2017

In a 10.5 portal high availability scenario, I noticed that when the standby portal machine detects a failure on the primary machine, it communicates with the failed primary, drops its own "c:\arcgisportal\db" directory, and creates a new one (maybe based on information from the failed primary's db folder)!

I also noticed that IP communication between the two machines must remain available during failover! In my case, if the primary machine is shut down, the standby machine will not start up until it can reach the failed machine again (even if the portal service on that machine is down).

Conclusion:

Is portal high availability only at the service level? That is, if both machines are up (network-wise) but one of the portal services is down, failover takes place; but if network access to the failed primary machine is lost, the standby machine will not start?

What is wrong with my configuration?

Here is the error I get when the standby machine has no network access to the failed primary machine:

"The portal has been initialized and configured but is not accessible. The internal portal database does not appear to be running or accepting connections. Restart the portal machine or machines and if the problem persists, contact Esri technical support (U.S.) or your distributor (customers outside the U.S.)."
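Since the symptom depends on whether the standby can reach the other machine, a quick reachability test can help narrow things down. The following is a minimal sketch, not an Esri tool: the host name `portal1.example.com` is hypothetical, and the port list contains only 7443 (Portal's default HTTPS port); any other inter-machine ports should be taken from the documentation for your version.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_peer(host: str, ports, timeout: float = 3.0) -> dict:
    """Map each port to whether it is reachable on the peer machine."""
    return {port: can_connect(host, port, timeout) for port in ports}


if __name__ == "__main__":
    PEER = "portal1.example.com"  # hypothetical name of the other portal machine
    PORTS = [7443]                # assumption: extend with the ports your version requires
    for port, ok in check_peer(PEER, PORTS).items():
        print(f"{PEER}:{port} -> {'reachable' if ok else 'NOT reachable'}")
```

Run from the standby machine, this only shows whether a TCP connection can be opened; it says nothing about whether the portal service or its database is healthy.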

JonathanQuinn · ‎06-26-2018

The only way to trigger a Portal failover is to stop the service. Through 10.6, this can take time, as you experienced. At 10.6.1, the failover time has decreased to under 30 seconds due to some re-architecting of the logic.

Is there a reason why one should be the primary? It shouldn't matter whether portal1 is the primary or portal2 is the primary.

Todd_Metzler · ‎06-27-2018

In our enterprise we have two physical servers, prd1 and prd2. Both host ArcGIS Enterprise server and, now, portal in an HA configuration. prd1 is also the portal and server web adaptor host. We plan to deploy ArcGIS Enterprise Data Store to prd2. I'd like to limit the load that portal places on prd2 by keeping prd1 as the primary portal as much as possible. If you can confirm that, in the HA portal configuration, the load on each server is the same whether its portal is acting as primary or secondary, then I agree it doesn't matter which portal is primary and which is secondary.

JonathanQuinn · ‎06-27-2018

Right, it doesn't matter which is primary or standby. Both receive requests through the web adaptor or load balancer in front of them, but both machines access the database on the primary. Data is automatically streamed to the standby, so if the primary fails, the standby can be promoted with no data loss. When the failed primary comes back, it rejoins as the standby, and data is streamed to it from the new primary.
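The role swap described above can be pictured with a toy model. This is only an illustrative sketch of the behaviour as described in this thread, not Portal's actual implementation; the class names, the list standing in for the database, and the machine names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PortalMachine:
    name: str
    role: str                      # "primary" or "standby"
    up: bool = True
    db: list = field(default_factory=list)  # stand-in for the internal database


class HAPair:
    def __init__(self, first: str, second: str):
        self.machines = [PortalMachine(first, "primary"),
                         PortalMachine(second, "standby")]

    @property
    def primary(self) -> PortalMachine:
        return next(m for m in self.machines if m.role == "primary")

    @property
    def standby(self) -> PortalMachine:
        return next(m for m in self.machines if m.role == "standby")

    def write(self, item: str) -> None:
        """All writes go to the primary's database and are streamed to the standby."""
        if not self.primary.up:
            raise RuntimeError("no primary available")
        self.primary.db.append(item)
        if self.standby.up:
            self.standby.db = list(self.primary.db)

    def fail(self, name: str) -> None:
        """Mark a machine as down; promote the standby if the primary failed."""
        failed = next(m for m in self.machines if m.name == name)
        failed.up = False
        other = next(m for m in self.machines if m.name != name)
        if failed.role == "primary" and other.up:
            other.role, failed.role = "primary", "standby"

    def recover(self, name: str) -> None:
        """A recovered machine rejoins as standby and resyncs from the primary."""
        machine = next(m for m in self.machines if m.name == name)
        machine.up = True
        if machine.role == "standby":
            machine.db = list(self.primary.db)


if __name__ == "__main__":
    pair = HAPair("portal1", "portal2")
    pair.write("item-1")
    pair.fail("portal1")       # portal2 is promoted
    pair.write("item-2")
    pair.recover("portal1")    # portal1 returns as standby
    print(pair.primary.name, pair.standby.name, pair.standby.db)
```

In this model neither machine is permanently "the primary"; the role simply follows whichever machine was promoted last, which is why it does not matter which one holds it.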

Todd_Metzler · ‎06-28-2018

Jonathan,

Thank you for the information.

Todd