The "Building a GIS" book states that the primary advantage of using a low isolation map service is to reduce the server platform memory footprint. The only trade-off mentioned is reduced service stability (e.g. if one thread dies in a low isolation SOC, the whole SOC dies and with it all of its threads).
Esri's ArcGIS Server 9.3.1 web help suggests the chief advantage of low isolation is that one can support many more concurrent users given the same system resources:
[INDENT]Low isolation allows multiple instances of a service configuration to share a single process, thus allowing the execution of four concurrent, independent requests. This is often referred to as multi-threading. [/INDENT]
So my questions are:
Q1) Will converting my dynamic map services from High to Low Isolation allow me to serve 4X the number of concurrent users at the same response times (without having to upgrade our SOC hardware)?
Q2) Or will converting from High to Low Isolation only allow me to serve 4X the number of concurrent users but at significantly worse response times (without having to upgrade our SOC hardware)?
Q3) Or will converting from High to Low Isolation only allow me to publish 4X the number of services (if physical memory is limited)?
Q4) How well do Low Isolation services work in practice? Esri mentions sacrificing service stability if you switch to Low Isolation, but a colleague believes the impact would be negligible.
Q5) Really confused about this one... in the same web help referenced above, *after* stating that a Low Isolation SOC process can support "four concurrent, independent requests", it goes on to say:
[INDENT]"With low isolation, a default of 8 instances and a maximum of 256 instances of the same service configuration can share a process" [/INDENT]
Is this in conflict with the prior "four" statement? Also, no guidance is provided for how one would determine how many instances to allocate to a single low isolation SOC (and 8 to 256 is quite a spread). Finally, the System Design Strategies Wiki mentions that a single processor core can support between 3 and 5 SOC processes. I'm wondering if this conflicts with any of the above info or if I'm simply confusing concepts.
Due to how poorly both Windows and Linux evidently handle context switches, I can see that it should be more efficient in principle to manage > 1 service instance per SOC process (Low Isolation services) vs. the same number of services published using High Isolation (1 service instance = 1 SOC process). But how far one can take this, I don't know. Anyone here done much experimentation with these settings? If so, what have you found to be true for your particular system?
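To make the process-count comparison concrete, here is a minimal sketch of the arithmetic, assuming High Isolation means 1 instance per SOC process and Low Isolation packs up to 8 (the quoted default) instances of the same service configuration into one process. The function name `soc_process_count` is made up for illustration and is not part of any ArcGIS Server API; it says nothing about response times, only about how many processes the OS has to manage.

```python
import math


def soc_process_count(instances: int, instances_per_process: int = 1) -> int:
    """Processes needed to host `instances` of ONE service configuration.

    instances_per_process=1 models High Isolation; larger values model
    Low Isolation (assumption: instances fill processes completely).
    """
    if instances < 0 or instances_per_process < 1:
        raise ValueError("instances must be >= 0 and instances_per_process >= 1")
    return math.ceil(instances / instances_per_process)


high = soc_process_count(16)     # High Isolation: one process per instance
low = soc_process_count(16, 8)   # Low Isolation at the default of 8 per process
print(high, low)
```

So 16 instances of a service would need 16 processes under High Isolation but only 2 under Low Isolation at the default setting, which is where the memory saving (and the shared-fate stability risk) comes from.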
"... Performance tests run by Esri indicate that, generally, the maximum number of simultaneously processing instances that can be supported on a CPU core varies by service type, but rarely exceeds 4. This means that even if the ArcGIS Server SOC machine is very powerful, it can only really support 4 times the CPU cores' service instances. For example, a machine with 16 cores and 32GB of RAM can support 4*16 = 64 simultaneously executing instances. This does not mean only 64 users, because users are not normally using a service instance simultaneously and repeatedly with no rest between requests.
This means that there is little reason to have so many service instances configured for even a very powerful machine. Instead, the ArcGIS Server administrator should configure the services to run with Minimum Instances equal to zero, Capacity on the machine set to a number less than 60 or 120 on Windows XP and Windows Server 2003, respectively, and allow the internal pool-shrinking algorithms to optimize instance availability. With minimum instances set to 0 and Capacity properly set, the ArcGIS Server can have many (hundreds) services efficiently running simultaneously without running into the Windows process limit described in this article. ..."
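The rule of thumb in the quoted article can be sketched as a couple of lines of arithmetic. This is only an illustration under the article's stated numbers (at most about 4 simultaneously processing instances per core; Capacity kept below 60 on Windows XP and below 120 on Windows Server 2003). The function and dictionary names are hypothetical and not part of ArcGIS Server.

```python
# Limits quoted in the KB article; other operating systems are not covered there.
CAPACITY_LIMITS = {"Windows XP": 60, "Windows Server 2003": 120}


def max_busy_instances(cores: int, per_core: int = 4) -> int:
    """Upper bound on simultaneously executing instances (article's rule of thumb)."""
    return cores * per_core


def capacity_within_limit(capacity: int, os_name: str) -> bool:
    """True if the machine Capacity is strictly below the limit quoted for os_name."""
    return capacity < CAPACITY_LIMITS[os_name]


print(max_busy_instances(16))                              # the article's 16-core example
print(capacity_within_limit(100, "Windows Server 2003"))   # below 120
print(capacity_within_limit(60, "Windows XP"))             # not strictly below 60
```

For the article's 16-core machine this gives 4*16 = 64, matching the quoted figure. The per-core value of 4 is described as a rarely exceeded maximum that varies by service type, so treat it as a ceiling rather than a target.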
Low isolation can reduce RAM consumption, but it has drawbacks: pool shrinking cannot optimize as well, performance is generally slightly lower than with high isolation services, and if one service instance fails, it takes down the other instances in the same process.
Thanks for the link, Domenico. The KB article you referenced mentions XP and Windows 2003 Server having a problem managing more than 60 and 120 service instances respectively. Our SOCs run on Windows Server 2008, so at least we don't have the particular limitation addressed.

...
> This means that there is little reason to have so many service instances configured for even a
> very powerful machine. Instead, the ArcGIS Server administrator should configure the services to
> run with Minimum Instances equal to zero, Capacity on the machine set to a number less than 60
> or 120 on Windows XP and Windows Server 2003, respectively, and allow the internal
> pool-shrinking algorithms to optimize instance availability. With minimum instances set to 0 and
> Capacity properly set, the ArcGIS Server can have many (hundreds) services efficiently running
> simultaneously without running into the Windows process limit described in this article. ..."
Although it sounds interesting and I'll have to give it more thought, I see a few problems with the KB article author's approach:
1) Elsewhere, Esri says one of the worst things you can do is have SOC processes spin-up during peak load times. Evidently this can kill performance. The advice here doesn't account for that cost.
2) While the article recommends a service instance min of 0 (which may be appropriate for rarely used services), the advice is incomplete--they do not provide guidance on what the max value should be and thus make pool shrinking sound too good to be true. While I don't fully understand pool shrinking, I find it hard to believe it's ever appropriate (for high throughput systems) to "set it and forget it", e.g. to set all service instance mins to zero, set appropriate SOC capacities, then let an algorithm take over--unless min and max are now considered to be legacy parameters.
Does anyone here with a high throughput system and many dozens of services use the KB article-suggested approach? How well does it work under peak loads?