Optimizing ArcSOC Availability and Utilization

05-17-2023 10:29 AM
AaronLopez
Esri Contributor

ArcSOC Availability and Utilization

Optimizing the ArcSOC instance availability and utilization for your service is a good strategy for helping users obtain fast response times and lower wait times from their dynamic requests to your Site. It can also benefit server resource utilization like memory as the service is not running a lot of instances that it will never use.

But optimizing the minimum and maximum number of instances for your dedicated services is not a one-time job. Usage patterns of your services can change over time, so collecting this information is a task that you, as a GIS administrator, will want to revisit periodically.

Before diving into how to observe ArcSOC instance activity statistics, let’s review some of the key details of the two ArcSOC-based service types in ArcGIS Server and how they play into this discussion:

  • Dedicated
  • Shared

Note: Wait time is the duration the request spends in a “queue” on the server until an ArcSOC instance is available to start working on it.

Dedicated Instance Pool Services

Dedicated services (i.e., non-hosted and non-shared services) are a mainstay ArcGIS resource in deployments, as many applications depend on capabilities such as geoprocessing, Branch Version editing, and Utility Network workflows (all of which require dedicated services).

While such services are very versatile and are a major pillar in ArcGIS because of the functionality they provide, as a GIS administrator you need to periodically examine, adjust, and configure the number of ArcSOC instances (minimum and maximum) to take full advantage of your available resources as your Site and users grow.

ArcGISServer_Manager_Dedicated_Instances1.png

To learn more about ArcSOC instances see: Understand service instances

Limitations of Shared Instance Pool Services

Shared service instances are great! They are truly a game changer for helping admins manage the demand for many services with finite resources. However, their restrictions and requirements limit which service capabilities can be used with them. Geoprocessing, Branch Version editing, and Utility Network, for example, are currently not supported through shared services (or through hosted services). This leaves dedicated services as the only choice for such functionality.

Configured ArcSOC Instance Availability vs Instance Demand

For services that are very popular, critical and/or required to run under the dedicated service type, understanding the optimal instance setting is important for several reasons. If the maximum number of active instances is too high, memory (and with it, cost) is wasted. The over-allocation of ArcSOC instances is an important but unique case, as its impact would not show up in an analysis of response times alone.

Alternatively, having the maximum too low can impact performance (in the form of higher response times and longer wait times) as users might be frequently waiting for an already busy ArcSOC instance to become free.

Of course, setting the instance minimum and maximum to different values has a trade-off too. For critical services where performance is paramount, having the user’s request wait while an instance needs to be started can be time consuming and will affect performance. So, for predictable performance on essential services, it is recommended to set the minimum and maximum number of instances to the same value.

As listed on Introduction to service instances:

Accordingly, it's important for ArcGIS Server administrators to monitor the number of instances their site is running, and to limit running instances when performance is inhibited by memory usage.

Configuring the service’s availability (through its instance minimums and maximums) and understanding how those settings hold up against the user’s demand on the service is key to an optimally running Site. The instance minimums and maximums have a direct effect on how requests coming in to a dedicated service are handled. While finding the optimal setting is an ongoing task, there are some tools and resources to help administrators tackle the challenge.

ArcGIS Server Service Report

The ArcGIS Server Service Report (introduced in 10.1) is one of those lesser-known REST Admin API gems. This resource can help monitor a Site by providing a configurable summary of all the services in a folder. It is generally a fast-performing request (depending on the number of services in the folder being requested).

The instance service statistics section of the returned response is extremely valuable as it lists details on the ArcSOC instances (min, max, busy) across the whole deployment (e.g., the ArcGIS Server Site).

By periodically polling this endpoint, one can get up-to-the-second insight into the service instance configuration vs the demand…as it happens. With such information, better decisions can be made about optimizing machine and service resources. In turn, this can help improve response times and lower wait times.

Note: In ArcGIS Server, instance service statistics information and the Statistics page in Manager are actually different resources though they are providing similar views of the same data. The service statistics provide raw access to the instance values (Site-wide and per machine) as well as more detail. The Statistics page is an interface for creating reports from some of that information.
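As a rough illustration of polling the Service Report, the Python sketch below builds the report URL for a folder and pulls the instance counts out of a parsed response. The URL pattern (admin/services/<folder>/report), the parameters value, and the JSON field names (reports, instances, max, busy, free) are assumptions based on the REST Admin API and should be checked against your ArcGIS Server version. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_report_url(server_url, folder, token):
    """Build the Service Report URL for one folder ("" or "/" means the root folder)."""
    folder_part = folder.strip("/")
    path = "/admin/services/" + (folder_part + "/" if folder_part else "") + "report"
    query = urllib.parse.urlencode({
        "f": "json",
        # Assumed report sections; limits the response to what is needed here.
        "parameters": json.dumps(["instances", "status"]),
        "token": token,
    })
    return server_url.rstrip("/") + path + "?" + query


def summarize_instances(report_json):
    """Return (serviceName, type, max, busy, free) for each service in a parsed response."""
    rows = []
    for report in report_json.get("reports", []):
        inst = report.get("instances", {})  # assumed field name
        rows.append((report.get("serviceName"), report.get("type"),
                     inst.get("max"), inst.get("busy"), inst.get("free")))
    return rows


def fetch_report(server_url, folder, token, timeout=30):
    """Request the report once and parse the JSON body."""
    url = build_report_url(server_url, folder, token)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Calling fetch_report in a loop with a sleep between requests, and passing each result to summarize_instances, would give a basic version of the periodic collection described above.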

Automating the Service Report Collection with Soccer

Any script, program or tool that regularly observes the Service Report endpoint of the folder of interest would suffice. However, if you are looking for a free, existing tool then Soccer is recommended.

(Arc)SOC ScannER or Soccer is a utility for scanning and reading the services' statistics on a specific ArcGIS Server folder. It parses the collected data and writes it out to a CSV file for additional post-capture analysis (e.g., creating charts in a spreadsheet to visualize the usage). It takes advantage of the REST Admin's Service Report resource in ArcGIS Server to gather this info. The original goal of soccer was to capture the report endpoint output of a specific folder in ArcGIS Server and save the ArcSOC instance statistics (e.g., Running, Busy, Maximum, etc...) for each service.

Currently, soccer is a command-line only utility. It is made available in the .NET 6.0 portable runtime for Windows (win-x64), Linux (linux-x64) and macOS (osx-x64).

For simplicity, running soccer only requires 3 inputs (other parameters can be passed in to extend functionality):

soccer.exe -s "[https://ArcGISServer/ServerWebAdaptor]" -f [FolderToScan] -t "[PreGeneratedArcGISToken]"

For example:

soccer.exe -s "https://gisserver.domain.com/server" -f "Gas" -t "APLeyWOcKZp9stZ_C01DQ.."

Note: A pre-generated ArcGIS Server token can be obtained from Portal. Typically, the generateToken URL is: https://gisserver.domain.com/portal/sharing/rest/generateToken and the Webapp URL would then be: https://gisserver.domain.com/server/admin. Set the Expiration value to something appropriate for your expected monitoring duration.
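For those who prefer to script the token step, the sketch below prepares a request to the generateToken URL mentioned in the note. It is a minimal example, not the only way to obtain a token. The form fields (username, password, client, referer, expiration, f) are assumed from the ArcGIS REST API and should be verified for your Portal version. The function name is hypothetical, and the request is only built here, not sent.

```python
import urllib.parse
import urllib.request


def build_token_request(portal_url, username, password, referer, expiration_minutes=60):
    """Prepare (but do not send) a POST request to the Portal generateToken endpoint."""
    form = {
        "username": username,
        "password": password,
        "client": "referer",      # ties the token to the referer (Webapp URL) below
        "referer": referer,
        "expiration": str(expiration_minutes),  # assumed to be expressed in minutes
        "f": "json",
    }
    data = urllib.parse.urlencode(form).encode("utf-8")
    url = portal_url.rstrip("/") + "/sharing/rest/generateToken"
    return urllib.request.Request(url, data=data, method="POST")
```

Sending the request with urllib.request.urlopen and parsing the JSON body should yield the token value to pass to soccer with -t. Avoid hard-coding the password in the script itself.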

Standard Out while running from a command-window:

Connected to: "https://gisserver.domain.com/server " (Gas)

Press Ctrl-C twice to stop...

Sleeping 5 seconds...

When running, soccer connects to the Service Report endpoint of the ArcGIS Server folder specified, collects the data, writes it to a local CSV file, then sleeps. After the sleep duration has elapsed, it repeats the process. Soccer will keep collecting until the process has been manually stopped (Ctrl-C).

Note: To collect on the root ArcGIS Server folder, use either: -f "/" or -f ""

Analyzing the CSV File

Sample contents as seen from a simple text viewer:

DateTime,Epoch,IntervalSeconds,Host,Folder,ServiceName,Type,Provider,Running,Busy,Maximum,Free,NotCreated,Initializing,Transactions,TotalBusyTime,ServicesCollected,ResponseTimeMilliseconds,ContentLength,ConfiguredState,RealTimeState,Message
5/2/2023 1:12:45 AM,1682989965310,5,gisserver.domain.com,Gas,Gas_Utility_Network,MapServer,ArcObjects11,32,0,32,32,0,0,0,0,2,121.6224,6065,STARTED,STARTED,success
5/2/2023 1:12:45 AM,1682989965310,5,gisserver.domain.com,Gas,Landbase_PostgreSQL,MapServer,ArcObjects11,32,0,32,32,0,0,0,0,2,121.6224,6065,STARTED,STARTED,success
5/2/2023 1:12:50 AM,1682989970469,10,gisserver.domain.com,Gas,Gas_Utility_Network,MapServer,ArcObjects11,32,0,32,32,0,0,0,0,2,117.901,6065,STARTED,STARTED,success
5/2/2023 1:12:50 AM,1682989970469,10,gisserver.domain.com,Gas,Landbase_PostgreSQL,MapServer,ArcObjects11,32,0,32,32,0,0,0,0,2,117.901,6065,STARTED,STARTED,success
5/2/2023 1:12:55 AM,1682989975606,15,gisserver.domain.com,Gas,Gas_Utility_Network,MapServer,ArcObjects11,32,0,32,32,0,0,0,0,2,109.6187,6065,STARTED,STARTED,success
5/2/2023 1:12:55 AM,1682989975606,15,gisserver.domain.com,Gas,Landbase_PostgreSQL,MapServer,ArcObjects11,32,0,32,32,0,0,0,0,2,109.6187,6065,STARTED,STARTED,success
5/2/2023 1:13:00 AM,1682989980733,20,gisserver.domain.com,Gas,Gas_Utility_Network,MapServer,ArcObjects11,32,0,32,32,0,0,0,0,2,125.1686,6065,STARTED,STARTED,success
5/2/2023 1:13:00 AM,1682989980733,20,gisserver.domain.com,Gas,Landbase_PostgreSQL,MapServer,ArcObjects11,32,0,32,32,0,0,0,0,2,125.1686,6065,STARTED,STARTED,success
5/2/2023 1:13:05 AM,1682989985874,25,gisserver.domain.com,Gas,Gas_Utility_Network,MapServer,ArcObjects11,32,0,32,32,0,0,0,0,2,116.3949,6065,STARTED,STARTED,success
5/2/2023 1:13:05 AM,1682989985874,25,gisserver.domain.com,Gas,Landbase_PostgreSQL,MapServer,ArcObjects11,32,0,32,32,0,0,0,0,2,116.3949,6065,STARTED,STARTED,success

Note: By design, columns such as IntervalSeconds and ResponseTimeMilliseconds will show duplicate values if more than one service exists in the folder being observed.

The collected data is a typical CSV, but there are some important fields which will be helpful for quickly analyzing the ArcSOC activity from our service(s) of interest:

  • IntervalSeconds
  • ServiceName
  • Running
  • Busy
  • Maximum

Soccer_CSV_Output_FilteringByService1.png

By opening the CSV file in a spreadsheet, other services residing in the same folder can be easily filtered out through the ServiceName column. The IntervalSeconds, Running, Busy, and Maximum values can then be plotted (e.g., through the Scatter with Smooth Line chart) to visualize the instance configuration versus incoming demand. In this case, the Gas_Utility_Network service was set to a min/max instance configuration of 32/32.
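The same filtering can be done without a spreadsheet. The Python sketch below reads the CSV text, keeps only the rows for one service, and returns the four columns used for charting. The sample text is trimmed to just those columns for brevity; the column names come from the CSV header shown earlier, while the function name is hypothetical.

```python
import csv
import io

# Trimmed sample: only the columns used here, with values from the rows shown above.
SAMPLE = """IntervalSeconds,ServiceName,Running,Busy,Maximum
5,Gas_Utility_Network,32,0,32
5,Landbase_PostgreSQL,32,0,32
10,Gas_Utility_Network,32,0,32
10,Landbase_PostgreSQL,32,0,32
"""


def load_service_series(csv_text, service_name):
    """Return the IntervalSeconds, Running, Busy and Maximum series for one service."""
    series = {"IntervalSeconds": [], "Running": [], "Busy": [], "Maximum": []}
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["ServiceName"] != service_name:
            continue  # skip other services in the same folder
        for column in series:
            series[column].append(int(row[column]))
    return series


series = load_service_series(SAMPLE, "Gas_Utility_Network")
print(series["IntervalSeconds"], series["Busy"], series["Maximum"])
```

Because csv.DictReader looks columns up by name, the same function should also work on the full CSV with all of its columns, read from the file that soccer wrote.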

Soccer_CSV_Output_Chart_ScatterWithSmoothLines1.png

The "polished" chart below:

ArcSOC_Utilization_Chart1.png

Note: The Maximum and Running represent the ArcSOC service configuration instance maximum and minimum, respectively. In this case, Running has the same values and is plotted “behind” Maximum. Busy represents the number of instances that were active (due to requests from users) across all the machines in the Site.

The “goal” of tuning and optimization in this case would be to keep the Busy values from constantly reaching the Maximum. If this were to happen, it would mean there were not enough ArcSOCs available for the service to meet the user demand, as the instances were always busy. User requests would then most likely encounter increased response times and wait times in the process. If the requests wait too long in the system, they time out (typically after 60 seconds). Having a lot of service request timeouts would negatively impact the user experience.
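One simple way to quantify "constantly reaching the Maximum" is the share of collected samples in which Busy was equal to (or above) Maximum. The sketch below is only one possible measure, and both the function name and the input values are hypothetical, not taken from the capture above.

```python
def saturation_ratio(busy, maximum):
    """Fraction of samples in which every available instance was busy."""
    if not busy:
        return 0.0
    saturated = sum(1 for b, m in zip(busy, maximum) if m > 0 and b >= m)
    return saturated / len(busy)


# Hypothetical samples: two of the four readings hit the configured maximum.
print(saturation_ratio([0, 12, 32, 32], [32, 32, 32, 32]))
```

A ratio near zero suggests the maximum was rarely or never the limiting factor, while a ratio that stays high over the monitored period points to requests queuing for a free instance.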

Based on the observations from this monitored duration, the rise and fall of the Busy column values did not come anywhere close to the maximum number of instances that were available…which from one perspective was good. However, this also indicated that there was a measurable number of instances running and taking up memory but not being used…which was not ideal.

To optimize system resources going forward, the instance configuration of this service could have been set to a lower value that is closer to the (expected) peak usage (e.g., somewhere between 18 – 24).

  • Use 18 to conserve memory, with the potential of encountering several occasions where users wait longer for their requests to be fulfilled
  • Use 24 to favor performance over memory usage
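To put numbers behind a choice like this, the peak Busy value and the count of instances that were never used can be read from the collected series. The sketch below is a minimal illustration with hypothetical values and a hypothetical function name. It does not choose a setting for you; it only reports what the capture showed.

```python
def peak_and_idle(busy, maximum):
    """Return (peak busy instances, configured instances that were never busy)."""
    peak = max(busy)
    configured = max(maximum)
    return peak, configured - peak


# Hypothetical Busy readings against a 32-instance configuration.
peak, idle = peak_and_idle([3, 11, 18, 9], [32, 32, 32, 32])
print(peak, idle)
```

How much headroom to leave above the observed peak remains a judgment call between memory and performance, as the two options above describe.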

Final Thoughts

Having an “optimized” service setting for the minimum and maximum number of instances does not guarantee that users will never encounter slow performance. User needs and habits change over time, so monitoring this information is something that will need to be done periodically.

In the real world, there are conditions where service wait times can still be encountered in a deployment…even with ample instances available and plenty of system resources. It is not realistic (or possible) to eliminate service wait time completely; it is more practical to try to reduce it where possible. Optimizing the service instances is something admins can directly control that has an impact on wait times.

Understanding and periodically evaluating the ArcSOC instance configuration (for dedicated services) and its relationship to the user demand is key to helping GIS administrators better plan and manage their Site by optimizing performance and efficiently utilizing resources.

2 Comments
Jen_Zumbado-Hannibal
Occasional Contributor

Please be gentle as I'm no programmer, yet I'm trying to learn and understand. 

Question: How is this tool different from the script I found, other than the obvious fact it was meant to work with ArcMap 10.8?

https://enterprise.arcgis.com/en/server/10.8/administer/windows/example-export-service-statistics-to...

Can't we just convert this to Python 3.x and run it on Task Scheduler? I would like to import keyring module to load profiles in order to protect passwords from being included in the script. I have no idea how to do this yet, but I'm open to suggestions. 

Thanks. 

AaronLopez
Esri Contributor

Hi @Jen_Zumbado-Hannibal,

Both tools are similar but also different.

The 10.8.1 Python script listed on the arcgis.com page focuses on retrieving usage statistics (total number of requests, maximum and average response time, and total timed-out requests).

Soccer captures some of that info, but its primary purpose is to help the GIS analyst or administrator understand the optimal instance configuration of a service. In other words, the max instance of a dedicated service was set to 12...but was the system able to reach this maximum for the duration of interest? If the captured busy instances data constantly matches the maximum, or is drastically less than it, you can then choose a more optimal configuration to improve scalability or save memory.

Hope that helps.

Aaron