Picking NAS / SAN for a HA setup

05-10-2021 07:34 PM
raja-gs
New Contributor III

Hello,

I'm setting up a highly available ArcGIS Enterprise 10.8 deployment on-premises on Windows VMs. As recommended by Esri, we'll be using shared storage for the portal content directory (Portal for ArcGIS), the config store and server directories (ArcGIS Server), and the backup repository (ArcGIS Data Store).

https://enterprise.arcgis.com/en/portal/10.8/administer/windows/configure-highly-available-system.ht...

I have a couple of questions regarding this shared storage:

1. Per the article below, Esri recommends NAS or SAN. Apart from cost, what factors should be considered when choosing between NAS and SAN? Does NAS provide high random I/O, which seems to be critical for ArcGIS components?

https://enterprise.arcgis.com/en/server/latest/deploy/windows/choosing-a-nas-device.htm

2. Are there any minimum recommendations or benchmarks to be considered - random read/write I/O, sequential I/O etc.?

Please add any other suggestions on this topic based on your experience!


Accepted Solutions
Todd_Metzler
Occasional Contributor III

The "final answer" between NAS and SAN depends on the perspective of the person providing the information. My perspective is GIS performance, while my organization's SAN administrators are interested in the most benefit for the largest user base. Since most of our organization is in the knowledge worker category, our SAN is optimized for those workflows. Enterprise GIS requirements may differ from other workflows: GIS data access tends to be more read intensive than write intensive. As one example, our SAN is configured as a middle-of-the-road compromise between read and write performance for our knowledge workers. That detracts from SAN performance for GIS.

Esri recommended NAS to us. If you are confident your organization can provide a GIS-optimized SAN, then performance between the NAS and the SAN should be about equal, provided the network paths perform equally. If your SAN is a compromise for many users, as in our case, then I suggest going with the NAS.

Other tidbits: We store our Mosaic Dataset (MDS) source tiles on the SAN. Recently our SAN solution was changed from Dell/EMC to Cohesity. The virtual IP management, firmware, and connection logic in the Cohesity SAN environment required a number of patches before ArcGIS Enterprise could reliably access the data stored on the SAN. So that is another plug for NAS, unless you want to compete with all the other SAN users in your organization and compromise to make GIS work.

Last tidbit: you might wonder why I am not following Esri's recommendation and moving my largest data sets from SAN to NAS. We are resource limited. A NAS setup (really two NAS devices, for redundancy and failover backup) costs money, and optimizing it for ArcGIS Enterprise makes the NAS a "stand-alone" storage solution that doesn't benefit other users in the organization.

Good Luck!

Todd

5 Replies
Todd_Metzler
Occasional Contributor III

Our environment: two physical machines running Windows Server 2012 R2, each with 24 CPU cores and 192 GB RAM, hosting an ArcGIS Enterprise 10.8.1 base deployment.

We've had our ArcGIS Enterprise in HA before. It provided LESS availability than non-HA, so we have reverted to non-HA. With high confidence, what held back our HA performance was unreliable throughput and inconsistent latency on the WAN between our ArcGIS Enterprise machines and our enterprise SAN. If you decide to go HA, make sure your environment has a no-compromise, short-path network with high-quality components, in the same data center as the SAN or NAS.
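One rough way to check for the kind of inconsistent latency described above is to time small reads at random offsets on a file stored on the candidate share and compare the median with the tail percentiles. The following is only a minimal sketch under stated assumptions: the function names, the 4 KB block size, the sample count, and the 8 MB test file are illustrative choices, not Esri recommendations. A freshly written file is likely to be served from the operating system cache, so the numbers will look better than the storage really is; a purpose-built tool such as DiskSpd or fio is a better basis for a real benchmark.

```python
import os
import random
import sys
import tempfile
import time


def make_test_file(directory, size_bytes=8 * 1024 * 1024):
    """Create a file of random bytes in `directory` and return its path."""
    fd, path = tempfile.mkstemp(dir=directory, suffix=".bin")
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size_bytes))
    return path


def random_read_latencies(path, block_size=4096, samples=500, seed=0):
    """Time `samples` reads of `block_size` bytes at random offsets.

    Returns a list of per-read latencies in seconds.
    """
    rng = random.Random(seed)
    size = os.path.getsize(path)
    max_block = max(size // block_size - 1, 0)
    latencies = []
    # buffering=0 disables Python-level buffering only; OS caching still applies.
    with open(path, "rb", buffering=0) as f:
        for _ in range(samples):
            offset = rng.randint(0, max_block) * block_size
            start = time.perf_counter()
            f.seek(offset)
            f.read(block_size)
            latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies):
    """Return median, p95, p99 and max latency in milliseconds."""
    ordered = sorted(latencies)
    n = len(ordered)

    def pick(q):
        # Nearest-rank style percentile, clamped to the last element.
        return ordered[min(n - 1, int(q * n))] * 1000.0

    return {
        "median_ms": pick(0.50),
        "p95_ms": pick(0.95),
        "p99_ms": pick(0.99),
        "max_ms": ordered[-1] * 1000.0,
    }


if __name__ == "__main__":
    # Pass the share to test as the first argument (for example a UNC path
    # to your own NAS or SAN share); defaults to the local temp directory.
    target_dir = sys.argv[1] if len(sys.argv) > 1 else tempfile.gettempdir()
    test_file = make_test_file(target_dir)
    try:
        for name, value in summarize(random_read_latencies(test_file)).items():
            print(f"{name}: {value:.3f}")
    finally:
        os.remove(test_file)
```

Running it against both the candidate share and a local disk at different times of day gives a first impression: a p99 or maximum far above the median suggests the jittery path that hurt our HA deployment.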

After consultation with Esri, we are considering trying HA again, but only after increasing our ArcGIS Enterprise resource footprint to include (1) two web servers (VMs or physical machines), so we can move the Web Adaptors off our ArcGIS physical servers; and either (2) dedicated low-latency NAS on our own subnet or (3) direct-attached storage on each of our physical machines.

To put things in perspective, our organization has about 13,000 employees. Our WAN and SAN resources are shared among the users in our environment, mostly knowledge workers. If you can ensure your ArcGIS Enterprise deployment has completely dedicated resources from end to end, then HA just might work out well. If you are constrained by available resources, as I am, it is probably better to steer clear of HA and deploy more than one ArcGIS Enterprise base deployment with some type of replication schedule.

Todd

raja-gs
New Contributor III

Thank you, Todd. I appreciate the detailed answer. Dedicated NAS is definitely something to discuss with our infra team. I'm planning to exclude Web Adaptors, since we'll use SAML and a load balancer for routing traffic. Right now our focus is on HA, since we really need the scalability as well. Hope it works out!

What I've noticed in general is that many users are using NAS for shared storage. Since I'm not a network expert, I'd love to get details on why NAS is preferred. Is it because of better random IOPS? Is it preferred because it works as a file server and is cheaper?

I'm also hoping to get some recommendations or benchmarks on NAS specs from Esri. Maybe only the Esri business consulting team can help in that regard.

raja-gs
New Contributor III

It's good to know that Esri recommended NAS to you! After discussion with our network team, we decided to go with our enterprise NAS, which is clustered. Given the high performance they have promised, we're confident going forward with our setup. Only time will tell if the setup functions without major headaches.

Appreciate you detailing your experience!

Todd_Metzler
Occasional Contributor III

P.S. Check the ArcGIS Enterprise deployment guide. With high confidence, you'll need at least one Web Adaptor for the Portal component to get full Portal functionality.
