
ArcGIS Server 11.4 CPU Usage

3 weeks ago
MatthewRyan2
Regular Contributor

Hey All, 

I have a problem with my brand new ArcGIS Server 11.4 installation (a fresh Windows Server 2019 machine).

I've published all my new APRX-based services, which were previously MXDs. Then we started using the server as we did before and discovered that "normal usage" will occasionally max out the CPU, and the server then stays maxed out even after the users have logged off and gone home. At that point ArcGIS Server becomes unresponsive in all respects.

Even when we manage the SOCs and move some into shared instances where appropriate, there is still a chance that this issue will occur.

On the topic of ArcGIS Server/Enterprise maxing out its CPU and crashing, it seems to be a common occurrence. I've just read these two threads on it:

I have two issues with the above behaviour of the server and the explanations found in those threads:

  1. This continues to occur after ALL USERS are off the system (i.e. nobody is sending any requests to the server)
  2. This behaviour wasn't occurring with the ArcMap (.MXD) runtime (the APRX runtime is completely different code)

#1 is the main concern: this can't be "too many SOCs" because nobody is sending requests to the server. These are actually "runaway processes" which ArcGIS Server has lost track of, and they just keep churning away.
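To gather evidence for the runaway-process theory, I've been sampling the process list after hours, when no requests should be arriving. A minimal sketch of the check (the snapshot data structure and the 80% threshold are my own assumptions, not anything ArcGIS exposes — feed it numbers collected with perfmon or Get-Process):

```python
def find_runaways(samples, name="ArcSOC.exe", cpu_threshold=80.0):
    """samples: list of snapshots; each snapshot maps pid -> (process name, cpu %).
    A process is a runaway candidate only if it stays at or above
    cpu_threshold in EVERY snapshot, i.e. sustained load, not a momentary
    spike while servicing a request."""
    candidates = None
    for snap in samples:
        hot = {pid for pid, (n, cpu) in snap.items()
               if n == name and cpu >= cpu_threshold}
        # intersect across snapshots so only persistently-hot PIDs survive
        candidates = hot if candidates is None else candidates & hot
    return sorted(candidates or [])
```

With two snapshots taken a few minutes apart, only a PID that is pegged in both gets flagged — those are the ones worth killing and reporting.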

A "runaway process" causing an outage would normally be considered a bug in a software system. I'm wondering why there hasn't been more of a spotlight here.

My questions for this are:

  • Has anyone else gone through the process of discussing these points with ESRI?
  • Do we know whether ESRI is considering logging this as a bug?
  • Are there any patches out that fix this problem?

 

NOTE: I've tested this with 10.9 and 11.3 as well, on Windows Server 2016 and 2012, with both in-place upgrades and brand-new machines.

11 Replies
Joey_NL
Occasional Contributor

Very interesting that you say this! This matches our experience with 11.2 as well.

We have a lot of services which we migrated from a 10.6 server / ArcMap runtime, and we are now facing incredibly poor performance on an 11.2 server.

We have been trying with ESRI since last year to progress this as a bug, and the answer was just: more RAM, more cores.

I'd be incredibly interested in hearing if there is a bug or a patch for this. We've wasted so much time.

JoshuaBixby
MVP Esteemed Contributor

The organization I used to work for initially had instability problems with ArcGIS Enterprise 11.x until we analyzed memory utilization of the ArcGIS Enterprise processes and subprocesses.  It turns out the ArcGIS Pro runtime used in ArcGIS Enterprise 11.x results in ArcSOC.exe processes having about 50% increase in total memory footprint.  I say total memory footprint because it wasn't more RAM being used but more virtual address space by the processes.  It turns out we were hitting or nearly hitting the committed memory limit of the server, and the entire ArcGIS Enterprise framework would become unstable at that point.  Once it became unstable, the only option to stabilize it was to stop the ArcGIS Server Windows service and restart it.

I suggest you monitor the Committed memory usage of the server, and not just RAM.  If you are getting over 90% committed memory for any period of time, I would update the virtual memory (page file) settings on the server. 
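As a trivial illustration of that 90% rule of thumb (the helper function and numbers below are mine, not an Esri or Windows API — feed it the "Committed Bytes" and "Commit Limit" counters from perfmon):

```python
def commit_pressure(committed_bytes, commit_limit_bytes, threshold=0.90):
    """Compare Windows "Committed Bytes" against the "Commit Limit"
    (physical RAM + page file).  Returns (ratio, needs_attention).
    RAM usage alone can look fine while commit is nearly exhausted."""
    ratio = committed_bytes / commit_limit_bytes
    return round(ratio, 3), ratio >= threshold
```

For example, 58 GB committed against a 64 GB commit limit is already over the line even if physical RAM still shows headroom.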

berniejconnors
Frequent Contributor

@JoshuaBixby wrote:

I suggest you monitor the Committed memory usage of the server, and not just RAM.  If you are getting over 90% committed memory for any period of time, I would update the virtual memory (page file) settings on the server.


Just the other day I had one of our server admins tell me:

"With the latest versions of Windows Server we prefer to let the OS manage the size of the page file". 

That was a general comment on Windows Server 2022, not specific to ArcGIS Enterprise.  I had asked the server admin to collect information on the paging file size from our three existing ArcGIS Server machines, which run on older Windows Server 2016 VMs.

@JoshuaBixby is your recommendation to increase the page file size?

 

Bernie.

JoshuaBixby
MVP Esteemed Contributor

Memory configurations are very specific to machines and applications, so it is hard to give recommendations without knowing quite a bit more information.  While allowing the OS to wholly manage the virtual memory settings is convenient, and maybe sufficient for some groups/organizations given their situations, I doubt you see it being used in many enterprise IT operations.

What I can say at this point is that ArcGIS Server 11.x uses more memory than ArcGIS Server 10.x given the exact same services and settings.  Whether that increase requires more RAM or different virtual memory settings on a specific server depends on the amount of existing RAM and existing virtual memory settings.

I would encourage your IT folks to monitor various memory counters over time; simply pulling up Task Manager and taking a look a few times during the day is not sufficient.  A spike in traffic can cause a spin-up of more ArcSOC.exe processes, which can lead to a momentary spike in memory usage.  If the spike is high enough, it can cause lasting instability in the ArcGIS Server framework processes.  Memory spikes can also be brief enough that casually looking at Task Manager throughout the day won't necessarily catch them.
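To make the point concrete, here is a sketch of the kind of post-processing I mean, applied to a series of committed-bytes readings logged at a fixed interval (the sampling interval and threshold are illustrative assumptions):

```python
def spike_indices(samples, commit_limit, threshold=0.90):
    """samples: committed-bytes readings taken at a fixed interval
    (e.g. every 15 s from a perfmon data collector set).
    Returns the indices of readings at or above the threshold share of
    the commit limit -- exactly the brief spikes a casual Task Manager
    glance during the day is likely to miss."""
    return [i for i, s in enumerate(samples) if s / commit_limit >= threshold]
```

If a logged series like [50, 60, 93, 61, 95] (GB, against a 100 GB commit limit) comes back with any indices at all, the server spiked past the line even though spot checks between those samples would have looked healthy.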

berniejconnors
Frequent Contributor

Thanks Joshua.  That is great info for the planning of our 11.5 upgrade.

berniejconnors
Frequent Contributor

@JoshuaBixby , Is there any documentation or warnings from Esri that ArcGIS Server 11.x requires more RAM to run the same number of services as ArcGIS Server 10.x?  This thread is the first time I have come across this information.

JoshuaBixby
MVP Esteemed Contributor

I have never come across any documentation related to this subject, but I can't say I would expect to either.  Part of the trouble with sharing information on this topic is the number of variables/factors involved.  It isn't as simple as "ArcSOC processes use 40% more virtual address space."  We saw some services use nearly the same; probably 90% used noticeably more, but even then some were 10% more and others were 60% more.  The average probably came out to around 25%-30%.
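Because the increase varies that much per service, a flat multiplier on total memory is misleading; you have to project per service. A sketch of that arithmetic (the service names and percentages below are hypothetical examples, not our real measurements):

```python
def projected_commit_gb(baseline_gb, increase_pct):
    """baseline_gb: per-service committed memory under 10.x, in GB.
    increase_pct: observed per-service % increase under 11.x.
    Returns the projected total commit under 11.x."""
    return sum(gb * (1 + increase_pct[svc] / 100.0)
               for svc, gb in baseline_gb.items())
```

So two services at 1 GB and 2 GB with 10% and 60% increases respectively project to 4.3 GB total, not the 3.75 GB a blended 25% figure would suggest.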

Even without changes in software versions, changes in service configurations or service usage can significantly change memory utilization on the server.  The expectation should never be set-it-and-forget-it.  Just like any other enterprise IT application, the OS and application should be constantly monitored to determine if resource or configuration changes are needed.

Joey_NL
Occasional Contributor

Interesting to hear! Feedback we've had on a support ticket was that ArcGIS GIS Server doesn't make use of page space, and that matches what we've seen in production. The server will run itself into the ground rather than use any of the swap file. Interesting to see that the experience with ArcGIS Enterprise seems to be different.

JoshuaBixby
MVP Esteemed Contributor

Although having insufficient paging space can cause memory issues, so too can having too little physical RAM.  The page file and physical RAM work together, and neither one can be a complete substitute for the other.  (I am dismissing the old "I have so much RAM I disabled my page file" argument because that is a terrible idea for many reasons, especially in production server environments.)  There is a limit to how much of a process address space can be paged, i.e., some portion of the process address space always has to remain in physical memory.  Therefore, someone can have a 10 TB page file and still crash an application or system due to too little physical memory. 
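That last point can be illustrated with a toy model (the 20% resident fraction below is entirely made up for illustration; the real figure depends on locked pages, kernel structures, and hot working sets):

```python
def workload_fits(total_commit_gb, ram_gb, pagefile_gb, min_resident_frac=0.2):
    """Toy model: min_resident_frac is a made-up stand-in for the share of
    committed memory that must stay in physical RAM and cannot be paged."""
    if total_commit_gb > ram_gb + pagefile_gb:
        return False  # exceeds the commit limit outright
    # even under the commit limit, some fraction cannot be paged out
    return total_commit_gb * min_resident_frac <= ram_gb
```

Under this model, 100 GB of commit on a 10 GB RAM box fails even with a 10 TB page file, because 20 GB would need to stay resident — which is the "10 TB page file and still crash" scenario above.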
