Optimal number of Service Instances?

09-05-2018 02:58 PM
AllenScully
Occasional Contributor II

We have an interesting/frustrating case happening. It will likely result in a support call, but I wanted to see if more experienced minds have any thoughts.

We have 2 feature classes updated every minute via GeoEvent (Fire Dept unit and call location/status). These feature classes are published in a standard map service and used in a web map that feeds an Operations Dashboard. The layers are in the map multiple times, with different filters/symbology for use in the dashboard. The refresh interval for these layers is set to 1 minute, so this is a lot of calls to the service. Multiply this by the number of users viewing the dashboard and things get tricky (up to 20 users at a time; currently testing with 5 people).

We initially had the map service published on a general-use ArcServer machine (a VM with 4 CPUs and 16 GB), with the max instances bumped up to 15. This caused all kinds of problems with the ArcServer service and the server in general. Typically the whole ArcServer instance seemed to crash: nothing was available in maps and ArcServer Manager was not available on that machine, yet the Windows ArcServer service never stopped and RAM/CPU never spiked. We were also sure that the total max # of instances per ArcServer installation was not exceeded (200, I believe, is the functional cap).

We then put the service on a brand new ArcServer VM (2 CPU, 8 GB) and ramped the max # instances up to 30. All is fine until we get 3+ people using the dashboard, then ArcServer goes haywire. We never see all 30 instances 'in use', and the # running varies between 2 and 25 or so. The only other service running is the Publishing service.

The logs show a few items which don't point in any single direction (to my eyes at least):

SEVERE: MapServerObjectFactory failed to create an instance of MapServer.

SEVERE: Geodatabase error: Out of server memory  (however could still connect to the SDE database in question via ArcMap/Catalog)

WARNING: A connection with the server could not be established - WinINet Error while using HTTPS security, 12029)

     - this may be a symptom after the service starts to misbehave

Again - no overall CPU/RAM spiking or sitting at terribly high levels on the server.

So my question is, how do we handle this? Even more instances of the service? Fewer? Other service properties to consider? More power for the VM? SQL Server connection issues (can the database not handle the traffic)?

We are using Server 10.4.1 and SDE 10.1 (I know, soon to be upgraded).

Thanks - sorry for the lengthy post

Allen

6 Replies
AllenScully
Occasional Contributor II

Quick update - 

Took the max # instances down to 15 (from 30) - had a successful day of load-testing, with about 7-8 people interacting with the dashboard.  Occasionally there would be 15 instances in use according to Server Manager, but that was brief and the service did not crash.  

I suspected a service with 30 instances was not a great idea, but doing the calculations of #layers + refresh interval + #users, I thought it was worth a shot.
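For anyone wanting to redo that calculation, here is a minimal sketch of one way to frame it. It is not Esri's sizing method: it uses Little's law (average busy instances = request rate x average service time) and a crude worst case in which every dashboard refreshes at the same moment. The layer count and the per-request service time are made-up inputs, since neither is stated in this thread.

```python
import math


def average_load(layer_entries, users, refresh_seconds, avg_service_seconds):
    """Return (requests per second, average number of busy instances)."""
    requests_per_second = layer_entries * users / refresh_seconds
    # Little's law: mean concurrency = arrival rate * mean service time
    busy_instances = requests_per_second * avg_service_seconds
    return requests_per_second, busy_instances


def burst_drain_seconds(layer_entries, users, instances, avg_service_seconds):
    """Worst case: all users refresh at once and requests queue for free instances."""
    burst = layer_entries * users
    rounds = math.ceil(burst / instances)
    return burst, rounds * avg_service_seconds


# Hypothetical inputs: 8 layer entries in the web map, 0.5 s per request.
# From the thread: 20 users, 60 s refresh interval, 15 max instances.
rate, busy = average_load(8, 20, 60, 0.5)
burst, drain = burst_drain_seconds(8, 20, 15, 0.5)
print(f"{rate:.2f} requests/s, {busy:.2f} instances busy on average")
print(f"synchronized burst of {burst} requests drains in about {drain:.1f} s")
```

With these made-up inputs the average is only about 1.3 busy instances, while a synchronized refresh queues 160 requests and takes roughly 5.5 s to drain through 15 instances. That gap between average and burst load may be why the instance count looks fine most of the time and is occasionally maxed out.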

Still have many questions about the optimal configuration for this use case - real(ish) time data in a webmap with a large user base.  

JonathanQuinn
Esri Frequent Contributor

What you may want to do is some performance benchmarking using System Test or JMeter.

https://www.arcgis.com/home/item.html?id=e8bac3559fd64352b799b6adf5721d81 

Apache JMeter™

That way, you can simulate load and requests per second to identify how your Server will respond without affecting users.
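If a dedicated tool is not at hand, a rough stand-in can be scripted. The sketch below is not System Test or JMeter; it is a plain Python thread pool that fires one request per layer per simulated user at the same moment and reports timings. The host and service names in the usage comment are placeholders, and a request counts as successful whenever the HTTP call completes, so an error payload returned with a 200 status would not be detected.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def timed_get(url, timeout=30):
    """Issue one GET; return (succeeded, elapsed_seconds)."""
    start = time.perf_counter()
    try:
        with urlopen(url, timeout=timeout) as resp:
            resp.read()
        ok = True
    except Exception:  # count any failure (timeout, HTTP error, refused connection)
        ok = False
    return ok, time.perf_counter() - start


def simulate_refresh(urls, users, fetch_fn=timed_get):
    """Fire every URL once per simulated user, all at the same time."""
    jobs = [url for _ in range(users) for url in urls]
    if not jobs:
        raise ValueError("need at least one URL and one user")
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        results = list(pool.map(fetch_fn, jobs))
    elapsed = sorted(seconds for _, seconds in results)
    return {
        "requests": len(results),
        "failures": sum(1 for ok, _ in results if not ok),
        "median_s": elapsed[len(elapsed) // 2],
        "max_s": elapsed[-1],
    }


# Hypothetical usage (host and service names are placeholders):
# urls = ["https://gis.example.com/arcgis/rest/services/Fire/Units/MapServer/0/query?where=1%3D1&outFields=*&f=json"]
# print(simulate_refresh(urls, users=5))
```

Running it against a test copy of the service while stepping `users` upward gives a repeatable way to find the point where failures or long response times begin.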

What are your min instances set to? Another idea is to set your min instances equal to your max instances so you don't incur startup costs when lots of requests come through. This would mean that you're consuming the equivalent of 15 running instances at all times, though. I'd also suggest increasing the number of cores on the machine, if your license allows it. If you're dealing with performance-related issues, then decreasing the available resources won't help.
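As a rough illustration of the min = max idea, the sketch below edits a service definition held as a Python dict. It assumes the definition uses the `minInstancesPerNode` and `maxInstancesPerNode` keys, which is how I understand the ArcGIS Server service JSON to be laid out; verify that against your version. The sample definition is invented, and sending the result back to the server is deliberately left out.

```python
import copy
import json


def with_instance_limits(service_def, min_instances, max_instances):
    """Return a copy of a service definition with new instance limits."""
    if not 0 <= min_instances <= max_instances:
        raise ValueError("expected 0 <= min_instances <= max_instances")
    updated = copy.deepcopy(service_def)  # leave the caller's dict untouched
    # Key names are an assumption about the service JSON; check your server version.
    updated["minInstancesPerNode"] = min_instances
    updated["maxInstancesPerNode"] = max_instances
    return updated


# Invented definition standing in for what the server would return.
current = {
    "serviceName": "FireUnits",
    "type": "MapServer",
    "minInstancesPerNode": 2,
    "maxInstancesPerNode": 15,
}
pinned = with_instance_limits(current, 15, 15)  # min equal to max
print(json.dumps(pinned, indent=2))
```

Keeping the change as a pure function makes it easy to review the JSON before applying it through Server Manager or whatever admin tooling you use.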

Christopher28
Occasional Contributor

We have found in tests that it is better not to set the min number too high. Apparently, active but unused instances are put into some kind of sleep mode (the RAM consumption of the machine went down). When they are reactivated, they crash more often according to the log.

 

In principle I see it similarly and would rather use fewer max instances. That saves the time spent starting instances, and because the usual requests are then processed faster, the waiting time is shorter.

 

I would also recommend evaluating the server statistics to see whether they indicate problems with the services.

AllenScully
Occasional Contributor II

Thanks Christopher - 

That's helpful.  Mentioned above, but the min is set low - 2 - however most of the time I see all 15 are running, at least during regular 7am - 5pm work day hours.  I almost never see the # running go below 15 even though I would expect it to drop at least some just based on usage patterns.  

We have ArcGIS Monitor, which is a handy tool I'm still learning - it monitors both ArcServer performance/traffic as well as overall server use (CPU and Memory) - not showing much stress on the server as a whole, but the specific Map Service does get up to all instances in use at times.  

So far the 2 min/15 max configuration has not outright crashed, but there are times when performance lags noticeably. 

Christopher28
Occasional Contributor

Which processors are in use, and what is their effective performance in terms of SPECrate per core? Maybe more powerful processors (not more cores, but more GHz) would be an option. I like to compare customer servers with the information from here:

http://wiki.gis.com/wiki/index.php/Capacity_Planning_Tool_updates

(Hope the insertion of the link at this point is ok)

MichaelMiller2
Regular Contributor

I was told in an Esri architecture class that the guideline is 3-5 instances per core for map services. In my experience this guideline works pretty well for stability and usability.
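Applied to the machines described earlier in this thread, that rule of thumb works out as in the sketch below. The core counts and configured maximums come from the original posts; the 3-5 multiplier is only the guideline quoted above, not a hard limit.

```python
def instance_range(cores, per_core_low=3, per_core_high=5):
    """Return the (low, high) instance range for the 3-5 per core guideline."""
    return cores * per_core_low, cores * per_core_high


# (label, cores, configured max instances) as described in the thread
setups = [
    ("general-use VM", 4, 15),
    ("new VM, first attempt", 2, 30),
    ("new VM, reduced", 2, 15),
]

for label, cores, configured in setups:
    low, high = instance_range(cores)
    verdict = "within" if low <= configured <= high else "outside"
    print(f"{label}: {cores} cores -> {low}-{high} instances; "
          f"configured {configured} is {verdict} the guideline")
```

By this measure the 2-core VM tops out at 6-10 instances, so both 30 and 15 exceed the guideline, while 15 on the 4-core machine falls inside the 12-20 range.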
