Hi Community, I’m planning a multi-machine ArcGIS Enterprise deployment and want to ensure high availability. What are the recommended best practices for configuring Portal, Server, and Data Stores across multiple machines? Any lessons learned from real-world setups?
For high availability, best practice is to run 2+ Portal for ArcGIS machines (with sticky sessions), 2+ ArcGIS Server machines in a single site, the relational data store in primary/standby mode, the spatiotemporal big data store (STBDS) with 3+ nodes, and an object store for scene layers. Put everything behind a load balancer with CA-signed certificates, use fast, resilient storage for the shared directories, and schedule regular webgisdr backups. Test upgrades in lower environments first, and monitor performance and health with ArcGIS Monitor.
Regards,
Venkat
Could you please share a diagram of the deployment architecture?
You can find more information at the link below.
https://www.esri.com/arcgis-blog/products/arcgis-enterprise/administration/migrate-to-a-new-machine-...
Thank you for the quick response. I will check it.
Lessons learned? Since you didn't say where exactly you are deploying, e.g., on-prem or in a cloud, it is hard to provide specifics. What I would say is make sure you really need high-availability before deploying it. What I have seen numerous times are "high-availability" deployments that have multiple single points of failure so they are high availability in name but not in reality. Doing truly highly available deployments comes with a non-trivial cost.
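To put rough numbers on the single-point-of-failure problem, here is a back-of-the-envelope sketch. The availability figures are made up for illustration, and the formulas assume components fail independently, which real deployments rarely guarantee.

```python
from math import prod


def series(*availabilities):
    """Availability of a chain in which every component must be up."""
    return prod(availabilities)


def redundant(availability, copies):
    """Availability of N interchangeable copies, assuming independent failures."""
    return 1 - (1 - availability) ** copies


# Hypothetical figures, not measurements from any real deployment.
node = 0.99                          # one Portal or Server machine
pair = redundant(node, 2)            # two machines behind a load balancer
file_share = 0.99                    # a single, non-redundant shared file server

all_redundant = series(pair, pair)                  # Portal pair + Server pair
with_single_share = series(pair, pair, file_share)  # same, plus the lone share

print(f"redundant tiers only:   {all_redundant:.4f}")
print(f"with single file share: {with_single_share:.4f}")
```

Under these assumptions, the one non-redundant file share pulls the whole stack back below the availability of a single machine, which is the "HA in name only" situation described above.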
Echoing both @JoshuaBixby and @VenkataKondepati. There are many variables that begin at the infrastructure level. Know your needs and ensure your infrastructure can meet those needs first. Then, start here: ArcGIS Architecture Center. Here's another "good read": Architecting the ArcGIS System
I highly recommend reviewing this Esri resource: Highly available ArcGIS Enterprise deployment scenarios—Portal for ArcGIS | Documentation for ArcGIS...
It covers various highly available deployment scenarios and includes helpful diagrams of each.
Some lessons learned and common pitfalls seen in real-world deployments:
If I may add a note from our experience, ArcGIS Enterprise HA is node-level, not service-level.
In a two-machine setup, the load balancer may keep sending traffic to a node as long as it responds to ping/REST, even if one Map/Feature service on that node is broken (hung SOC, bad datasource, etc.). Users then see intermittent failures and the platform still looks “healthy”.
The fix isn’t more HA, it’s better health checks: add synthetic per-service probes that validate real output, not just liveness. For example:
• Map service: …/MapServer/export?bbox=…&size=600,400&f=image (check HTTP 200 and a non-empty image)
• Feature service: …/FeatureServer/0/query?where=1%3D1&returnCountOnly=true&f=json (check HTTP 200 and a sensible count/JSON)
If a probe fails N times in M minutes, drain/disable that backend in the LB and alert Ops (ArcGIS Monitor or LB custom probes).
HA keeps the site up, but only per-service/per-node checks protect the user experience when a single service goes bad...
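For anyone who wants to script this, below is a minimal Python sketch of the two probes and the "N failures in M minutes" rule. The function and class names are my own, not part of any Esri tool, and the drain/alert step is left out because it depends entirely on your load balancer. It uses only the standard library.

```python
import json
import time
import urllib.request
from collections import deque


def fetch(url, timeout=10):
    """Return (status, content_type, body); network and HTTP errors become status 0."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.headers.get("Content-Type", ""), resp.read()
    except OSError:  # covers URLError, HTTPError and socket timeouts
        return 0, "", b""


def map_export_ok(status, content_type, body):
    """Map service probe: HTTP 200 and a non-empty image."""
    return status == 200 and content_type.startswith("image/") and len(body) > 0


def feature_count_ok(status, body, min_count=1):
    """Feature service probe: HTTP 200 and a sensible count in the JSON."""
    if status != 200:
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    # A 200 response can still carry an "error" object in the JSON body.
    if not isinstance(payload, dict) or "error" in payload:
        return False
    count = payload.get("count")
    return isinstance(count, int) and count >= min_count


class FailureWindow:
    """Tracks probe failures and flags a backend after N failures in M seconds."""

    def __init__(self, max_failures, window_seconds):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = deque()

    def record(self, ok, now=None):
        """Record one probe result; return True when the backend should be drained."""
        now = time.time() if now is None else now
        if not ok:
            self._failures.append(now)
        # Forget failures that have aged out of the window.
        while self._failures and now - self._failures[0] > self.window_seconds:
            self._failures.popleft()
        return len(self._failures) >= self.max_failures
```

In use, you would call `fetch()` against each node directly (not through the load balancer) with the export and query URLs above, pass the result to the matching check, and feed the boolean into one `FailureWindow` per node and service. What counts as a "sensible" count (`min_count` here) is something you have to decide per layer.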