Testing Fundamentals, Meanings and How They Are Used

AaronLopez · ‎07-16-2021

Request
Also known as a sampler

An HTTP request is the "smallest" unit of work you can define a test to perform. Generally, when testing ArcGIS Enterprise, it can be a URL for a resource like a map service, feature service or route solve but it can also be a call for a static object like a *.css or *.js file.
The protocol can be HTTP (plain text) or HTTPS (secured) and the method can be one of many, although GET, POST and HEAD are typically the most common.

A dynamic map service request would resemble the following form:

https://yourwebadaptor.domain.com/server/rest/services/NaturalEarth/MapServer/export?bbox=-130.9656801129776%2C18.608785315857112%2C-57.52504741730332%2C52.34557596043248&bboxSR=4326&imageSR=4326&size=1920%2C882&dpi=96&format=png32&transparent=true&layers=show%3A15%2C16%2C17%2C19%2C20%2C21%2C22%2C23%2C24%2C25%2C26%2C27%2C28%2C29%2C30%2C31%2C32%2C33%2C34%2C35&f=image
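The query string above is easier to read when it is assembled from its individual parameters. The following is a minimal Python sketch, not part of any testing framework; the function name build_export_url is made up for illustration, and the host name is the same placeholder used in the example URL.

```python
from urllib.parse import urlencode

# Endpoint taken from the example above; the host name is a placeholder.
EXPORT_URL = (
    "https://yourwebadaptor.domain.com/server/rest/services/"
    "NaturalEarth/MapServer/export"
)


def build_export_url(bbox, size, layer_ids):
    """Assemble a dynamic map export URL from its query parameters."""
    params = {
        "bbox": ",".join(bbox),
        "bboxSR": 4326,
        "imageSR": 4326,
        "size": ",".join(str(v) for v in size),
        "dpi": 96,
        "format": "png32",
        "transparent": "true",
        "layers": "show:" + ",".join(str(i) for i in layer_ids),
        "f": "image",
    }
    # urlencode percent-encodes the commas (%2C) and the colon (%3A).
    return EXPORT_URL + "?" + urlencode(params)


url = build_export_url(
    # Coordinates are kept as strings so they are sent exactly as written.
    bbox=("-130.9656801129776", "18.608785315857112",
          "-57.52504741730332", "52.34557596043248"),
    size=(1920, 882),
    layer_ids=[15, 16, 17] + list(range(19, 36)),
)
print(url)
```

Changing the bbox or size values per request is a common way to make each request in a test ask for a different map extent.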

The same URL as an Apache JMeter HTTP request:

What a static request would look like:

https://yourwebadaptor.domain.com/portal/home/10.9.0/js/jsapi/dojo/dojo.js

Apache JMeter also makes the distinction between a request and a sampler, though both define an action to perform. A sampler can be the execution of a process at the operating system level that performs some type of action, like running a geoprocessing tool to create a file geodatabase or to create a new SDE version in an enterprise geodatabase. Every test has to have at least one request or sampler.

Another type of sampler is a web socket. While it resembles an HTTP request in that it makes a call over the "web" and can be secured, it uses a different protocol for communicating with the remote server and different parameters for specifying its options.

Transaction

A transaction is a logical grouping of one or more HTTP requests. The requests can be dynamic and/or static. Together, these requests typically make up one user operation, for example:

The loading of a web app
A navigation action like a pan or zoom
A search function
Creation of a new SDE version within an enterprise geodatabase

It is not a technical requirement to use transactions in a test, but doing so can greatly enhance the analysis because individual operations (i.e. transactions) can then be isolated to show their respective performance behaviors throughout the run. This can be very informative.
For example, understanding that only requests for map scale 1:72,224 had performance problems is very useful from a tuning perspective, as you would know exactly which areas of the map document or project need to be adjusted. Transactions can help you accomplish this.
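To illustrate how grouping requests into transactions isolates a slow operation, here is a small Python sketch. The sample names and timings are invented for illustration, and summing request times per transaction is a simplifying assumption rather than a description of how any particular framework reports them.

```python
from collections import defaultdict

# Hypothetical samples: (transaction name, request label, elapsed milliseconds).
samples = [
    ("Load web app", "dojo.js", 40),
    ("Load web app", "MapServer/export", 310),
    ("Zoom 1:72,224", "MapServer/export", 920),
    ("Zoom 1:72,224", "MapServer/export", 880),
    ("Search", "FeatureServer/query", 150),
]


def transaction_times(rows):
    """Sum the elapsed time of every request belonging to a transaction."""
    totals = defaultdict(int)
    for transaction, _label, elapsed_ms in rows:
        totals[transaction] += elapsed_ms
    return dict(totals)


totals = transaction_times(samples)
slowest = max(totals, key=totals.get)
print(totals)
print("Slowest transaction:", slowest)
```

With the requests grouped this way, the zoom operation at 1:72,224 stands out even though every request in it targets the same export endpoint as the faster operations.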

Apache JMeter Transaction containing three requests from one operation:

Test
Also known as test plan or test project

The term "test" is rather generic and is often used as both a noun (I created a test to call the resource) and verb (I am going to test the service). Transactions and requests are usually defined in a test. The test will have additional options to configure, such as how long the test will run, where the results go and whether metrics on the remote servers should be collected.
Different frameworks use slightly different terminology for describing a test. In Apache JMeter's case, a test or test project is called a Test Plan and is designated with a *.jmx file extension.

Step Load
Also known as load

The step load is a characteristic that defines how many concurrent test threads to apply during the test, and for how long, through even, incrementing pressure (similar to a staircase). Configuring the test for a step load is helpful for understanding how a map service performs or scales, or how deployment resources behave, as more and more requests are thrown at it. The defined pressure can also decrease toward the end of the test, but it does not have to.
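The staircase shape of a step load can be written out as a simple schedule. The Python sketch below is only an illustration of the idea; the function name step_load and its parameters are assumptions and do not correspond to the settings of any specific thread group.

```python
def step_load(start_threads, step_threads, step_seconds, max_threads):
    """Return (start_second, threads) pairs describing a staircase load."""
    if step_threads <= 0 or step_seconds <= 0:
        raise ValueError("step size and step duration must be positive")
    schedule = []
    threads = start_threads
    second = 0
    while threads <= max_threads:
        schedule.append((second, threads))
        threads += step_threads
        second += step_seconds
    return schedule


# Start at 10 threads and add 10 more every 60 seconds, up to 50.
profile = step_load(start_threads=10, step_threads=10,
                    step_seconds=60, max_threads=50)
for second, threads in profile:
    print(f"t={second:>3}s  threads={threads}")
```

Each step holds the pressure steady long enough for throughput and response time to settle before the next increment is applied.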

Apache JMeter Thread Group (bzm - Concurrency) specifying and visualizing a specific step load:

Constant Load

A constant load also defines how long and how many test threads to apply but is usually set for a steady rate over long periods of time. Instead of focusing on performance and scalability, this configuration is typically used for understanding durability and stability.

Apache JMeter Thread Group (bzm - Concurrency) specifying and visualizing a specific constant load:

Test Threads
Also known as threads

This is the mechanism responsible for applying load: it takes the work defined in the test, such as the transactions and/or requests, and executes it repeatedly.

Test threads typically behave in a serial fashion where each thread starts by reading the first request defined in the test, sends it to the server and then awaits the response. The next request in the test will not be issued until a response comes back from the server or a timeout has elapsed. Once one of these conditions is met, the thread moves to the next request. Most tests are configured to have each test thread repeat this process continuously for the duration of the run.
Various technologies often refer to test threads as virtual users but this can be misleading. The test threads of a test are just the means (pressure) to an end (delivered throughput).
In other words, the execution of a test that is configured with a step load that reaches 100 test threads does not mean the environment is supporting 100 concurrent, virtual users. In this case, the number of users would be calculated from the test's throughput (transactions/sec, for example).
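The serial behavior of a single test thread can be sketched in a few lines of Python. This is a simplified illustration only: run_test_thread and fake_send are hypothetical names, and fake_send stands in for a real blocking HTTP call with a timeout.

```python
def run_test_thread(requests, send, iterations):
    """Issue each request in order, waiting for a result before the next."""
    log = []
    for _ in range(iterations):
        for request in requests:
            # send() is expected to block until a response arrives
            # or a timeout elapses; only then is the next request issued.
            response = send(request)
            log.append((request, response))
    return log


def fake_send(request):
    """Stand-in for a real HTTP call; always reports success."""
    return "200 OK"


log = run_test_thread(["export map", "query features"], fake_send, iterations=3)
print(len(log), "requests completed by one thread")
```

Because each thread waits on the server, the work a thread completes depends on response time, which is why thread count alone does not equal a number of users.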

Apache JMeter Thread Group (bzm - Concurrency) defining the step load via (test) threads:

Users
Also known as virtual users

The number of supported users is one of the most requested items to determine from a load test and usually takes the form of:

How many users will this specific service or application support?
Will a particular service or application support at least X users?

The calculation of users is closely tied to think time as well as measured test artifacts such as throughput and response time.
Using Little's Law with these inputs can provide a theoretical estimate of the number of users an environment can support.

Think Time
Also known as workflow pacing

Think time is a duration (defined as seconds or milliseconds) that is added into a test to simulate the delays of human behavior that would occur from a person naturally interacting with the map service or web application.
Think time delays can be added to transactions (e.g. an operation) or requests or even to the test itself (which is then referred to as workflow pacing). How they are added can vary based on the testing framework involved.
In Apache JMeter's case, there are several different timers available that can be added to the test to simulate various types of delays.
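As a rough illustration of a think time delay, the Python sketch below combines a fixed pause with a random component. The function name think_time_ms and the chosen values are assumptions for illustration; this is not the implementation of any particular timer.

```python
import random


def think_time_ms(constant_ms, random_max_ms, rng):
    """Return a delay made of a fixed part plus a random part, in ms."""
    return constant_ms + rng.uniform(0, random_max_ms)


# A seeded generator keeps the sketch repeatable between runs.
rng = random.Random(7)
delays = [think_time_ms(2000, 3000, rng) for _ in range(5)]
for delay in delays:
    print(f"pause for {delay:.0f} ms before the next operation")
```

In a real test the thread would sleep for the returned duration between transactions, which lowers the throughput each thread generates.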

Key Performance Indicators (KPIs)

KPIs are test metrics that assist with the analysis of a load test. Some of the most popular ones are associated with measuring the response time and throughput of the test. However, they also extend to items that count the number of failed requests, report the average content length (per request) or collect information on hardware utilization (such as CPU, memory, network and disk).
Although the ability to capture hardware utilization often requires additional test configuration and permissions within the environment, this information is one of the most important artifacts captured from a load test.

Note: Captured hardware utilization is one of the most important artifacts captured from a load test.

Response Time

Response time is a common metric that is used to measure the performance of a request, transaction or test. Simply put, it provides an understanding of how fast an operation is performing.
The value is typically presented in seconds or milliseconds. Faster performance means lower response times which translates to a more favorable user experience. Response time can be plotted over the duration of the test to understand how performance scaled or listed together with throughput for a particular point in the test (e.g. where throughput peaked).

Note: Response times are one of the most important artifacts captured from a load test.
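Response time samples are usually summarized before they are plotted or compared. The Python sketch below uses made-up timings, and the nearest-rank percentile method is one of several possible definitions, chosen here only for simplicity.

```python
import math
import statistics

# Made-up response times in milliseconds for one request type.
elapsed_ms = [180, 210, 250, 400, 390, 220, 1200, 260]


def percentile(values, pct):
    """Nearest-rank percentile: smallest value covering pct% of samples."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


average_ms = statistics.mean(elapsed_ms)
median_ms = statistics.median(elapsed_ms)
p90_ms = percentile(elapsed_ms, 90)
print(f"average={average_ms} ms, median={median_ms} ms, 90th={p90_ms} ms")
```

A single slow sample pulls the average well above the median, which is why percentiles are often reported alongside averages.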

Ideally, the performance of the item being tested will take on the following curve, where response times begin to climb more quickly around the point of peak throughput. In the following example, the average request response time at peak throughput was about 0.4 seconds.

Throughput

Throughput is a common metric that is used to measure the scalability of a map service, web application or hardware infrastructure. Essentially, it provides an understanding of the rate at which an operation can be conducted over a duration of time.
The value is usually captured as requests/sec, transactions/sec (i.e. operations/sec) or tests/sec, though it is often expressed over the duration of an hour (the rate in seconds multiplied by 3600).
Higher scalability means more throughput which translates to support for more users.
Some test analysis will focus on the average throughput of all transactions for a test, while others might examine the average throughput for each individual operation.

Note: Throughput is one of the most important artifacts captured from a load test.
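The arithmetic behind throughput is simple enough to show directly. In this Python sketch the function names and the request count are illustrative assumptions; the numbers were picked to give a rate of 24 requests/sec.

```python
def throughput_per_second(completed, duration_seconds):
    """Completed requests (or transactions) divided by elapsed seconds."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return completed / duration_seconds


def per_hour(rate_per_second):
    """Express a per-second rate over the duration of an hour."""
    return rate_per_second * 3600


# 7,200 requests completed during a 300-second step.
rate = throughput_per_second(completed=7200, duration_seconds=300)
hourly = per_hour(rate)
print(f"{rate} requests/sec = {hourly} requests/hour")
```

The same conversion applies whether the unit being counted is a request, a transaction or a full pass through the test.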

Ideally, the throughput of the item being tested will resemble the following curve where it reaches a peak then plateaus. When throughput peaks and/or plateaus it suggests that the test has encountered some form of a bottleneck. In the following example, the average request throughput at peak was about 24 requests/second (or 86,400 requests/hour).

Bottleneck

A bottleneck is a condition of a deployment where one of its components or tiers is limiting the rate at which it can respond to incoming requests. A bottleneck can take the form of:

Hardware Examples
- All of the CPU cores of ArcGIS Server are fully utilized
- Available Memory is exhausted
- Storage disk I/O of the Database server is fully utilized
- Network card is saturated
  - Due to Send or Received traffic
Software Examples
- The database was configured to only allow 25 concurrent connections despite having ample hardware resources available
- Throughput for consuming a map service plateaus but ArcGIS Server CPU utilization does not increase above 25%

A bottleneck will always exist in a deployment, and determining which component restricts throughput first is part of the analysis. It will often take a load test to expose where the first bottleneck will occur since it may only be observed under a large amount of pressure. While server resources and settings are typically the focus of bottleneck analysis, test client resources (CPU, memory, network, disk and in some cases the testing license) can also be a factor. Reaching a bottleneck is not necessarily a problem; it just lets you know where the first weakness or limitation is within the system. Sometimes a bottleneck is considered a “good thing”. For example, when running a large ArcGIS caching process, it is desirable for the CPU to become the first bottleneck because it is doing the work to create the map tiles. If the CPU can only reach 50% because of another bottleneck (e.g., disk I/O), it will take twice as long for the job to finish relative to 100% CPU utilization.

Note: A bottleneck always exists in a deployment.
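The caching example can be expressed as a small calculation. This Python sketch assumes the job is purely CPU-bound and that run time scales inversely with CPU utilization, which is a simplification; the function name and the 4-hour figure are made up for illustration.

```python
def estimated_job_hours(hours_at_full_cpu, cpu_utilization):
    """Estimate run time when another bottleneck caps CPU utilization."""
    if not 0 < cpu_utilization <= 1:
        raise ValueError("cpu_utilization must be in the range (0, 1]")
    return hours_at_full_cpu / cpu_utilization


# A job that would take 4 hours at 100% CPU, capped at 50% by disk I/O.
capped_hours = estimated_job_hours(hours_at_full_cpu=4, cpu_utilization=0.5)
print(f"Estimated run time at 50% CPU: {capped_hours} hours")
```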

Test Type
Also known as a performance test, load test, stress test, endurance test, benchmark test

Many organizations use different categories to classify the testing being carried out.

A performance test is typically utilized to troubleshoot issues with a service or application when it is behaving slowly or producing longer than expected response times. It does not need to involve a step load and can be conveniently executed as a single user interacting directly from a web browser with the endpoint of interest.

The term load test is often used to describe a step load test whose goal is to meet a particular throughput and response time target. For example, X transactions/sec with a response time under Y seconds and no failures. This might result in the exhaustion of one of the server hardware resources but that is usually not the goal. A load test can also be referred to as a scalability test.

A stress test is a similar test but is frequently focused on reaching a pressure that is a multiple of the load test's goal. In other words, if the load test was trying to reach X transactions/sec, the stress test might try to reach X * 5 transactions/sec without encountering a significant number of failures.
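The load test and stress test goals described above can be written as simple checks. This Python sketch is only an illustration; the function names, the target values and the multiplier of 5 are assumptions taken from the wording of the examples, not fixed rules.

```python
def meets_load_goal(throughput_tps, avg_response_s, failures,
                    target_tps, max_response_s):
    """True when X transactions/sec is reached under Y seconds, no failures."""
    return (throughput_tps >= target_tps
            and avg_response_s < max_response_s
            and failures == 0)


def stress_target(load_target_tps, multiple=5):
    """A stress test aims for a multiple of the load test's throughput goal."""
    return load_target_tps * multiple


passed = meets_load_goal(throughput_tps=24, avg_response_s=0.4, failures=0,
                         target_tps=20, max_response_s=1.0)
print("Load goal met:", passed)
print("Stress test target:", stress_target(20), "transactions/sec")
```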

An endurance test has the distinction of trying to break components of the system. Its applied load can be a multiple of the stress test's where the goal is to encounter significant errors and observe the throughput and response time when they occur. An endurance test can also be referred to as a durability test where the applied load is constant for a very long duration and hardware utilization and reclamation patterns are observed.

Test Plan

In the general sense, a test plan is a document, table or list which defines the specific tests that will be executed as well as their respective goals. These goals are the reason and purpose of each test. The analysis of the results (by hand or from generated test reports) should help you determine whether or not the goals of each test were achieved.

Testing Framework

The testing framework is the tool or technology, in the form of libraries, APIs and often a graphical user interface (GUI), used for assembling the requests and the test as well as defining the load to be applied.
There are many great testing frameworks out there and Apache JMeter is just one of them. While they are all similar in purpose, many of them take different approaches to the vocabulary of certain components and how they create a test and apply load. Some put the definition of the requests and transactions into their own files with the step load configuration in another.
With Apache JMeter, all of the test objects are defined in the Test Plan and are logically separated within the tree.

Some load testing framework examples:

Some performance testing framework examples:

wget
- A command line tool for retrieving one or more URLs
- Can provide a high level of detail on each request and response
curl
- A command line tool for retrieving one or more URLs
- Can provide a high level of detail on each request and response
Fiddler
- GUI-based HTTP Debugger that can be used alone or with a web browser
- Can provide a high level of detail on each request and response

Testing Framework Architecture

When testing ArcGIS Enterprise, most of the architectural attention centers on the scalability of the deployment tiers: Load Balancer, Web Adaptor, Portal for ArcGIS, ArcGIS DataStore, ArcGIS Server, Enterprise Geodatabase and Network Storage. While one 8-core test machine can usually send enough requests to satisfy the typical test, sometimes multiple machines are needed if the load to apply requires serious horsepower.

Depending on the testing framework involved, several of the testing components can be separated out to different machines to improve the scalability of the test client.

Common components to scale out are:

Test Controller
- As the name implies, the main focus of the controller is to stop and start the test as well as coordinate the collection of test metrics from one or more Test Agents
- In Apache JMeter's case, the controller is integrated right into the GUI but is also running when the test is executed from the command line
  - Other testing frameworks may have a web-based Test Controller frontend
- Typically, only one Test Controller is needed for any given test environment, but it can run on dedicated hardware that is separate from the Test Agents
Test Agent
- The primary job of the Test Agent is to send requests and receive responses from the server
  - This component performs most of the work and would require the most CPU resources
- For big jobs, multiple Test Agent machines might be needed
- In Apache JMeter's case, by default, the Test Agent runs on the same machine as the Test Controller
Test Repository
- A machine dedicated to storing the load test results
  - This can include test metrics like response time, throughput and hardware utilization
- In Apache JMeter's case, the results are stored on the controller in text (*.JTL) files
  - It is possible to send the results to a database, but this is not the default
Test Visualization
- A machine used to visualize the test metrics and hardware utilization in real-time
- In Apache JMeter's case, the GUI is not recommended for data visualization during a production test run; running from the command line is recommended instead
  - If results are sent to a database, additional software can connect to the Test Repository to visualize the information
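Results stored as delimited text can be summarized with a short script. The Python sketch below assumes a comma-separated results file with a header row containing the columns elapsed and success; the rows are made up, and the columns actually written depend on how result saving is configured.

```python
import csv
import io

# Made-up rows; a real results file may contain additional columns.
SAMPLE_JTL = """timeStamp,elapsed,label,responseCode,success
1626400000000,310,Export Map,200,true
1626400000400,290,Export Map,200,true
1626400000900,1200,Export Map,500,false
1626400001500,40,dojo.js,200,true
"""


def summarize(jtl_text):
    """Count samples and failures and average the elapsed time (ms)."""
    rows = list(csv.DictReader(io.StringIO(jtl_text)))
    elapsed = [int(row["elapsed"]) for row in rows]
    failures = sum(1 for row in rows if row["success"].lower() != "true")
    return {
        "samples": len(rows),
        "average_ms": sum(elapsed) / len(elapsed),
        "failures": failures,
    }


summary = summarize(SAMPLE_JTL)
print(summary)
```

For a real file, the same function could be fed the text read from disk instead of the inline sample.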

Interactive Response Time Law

The Interactive Response Time Law is a formula that defines the relationship between key performance factors, namely users, throughput, response time and user think time. The calculation can be arranged to determine the parameter of interest as long as you know the other three. For example, if the number of users utilizing the system, the average response time for requests and the average user think time are known, we can derive the estimated throughput demand on the system. This law is very useful for converting users to throughput and throughput to users, among other use cases, and is foundational to testing-related areas such as capacity planning.

Given the following formula:
N = X * (R + Z)

N = Number of jobs or concurrent users
X = Throughput per second in the system
R = Response time, or average time a job spends in the system
Z = Think time
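The formula and its rearrangement for throughput can be tried out with a few lines of Python. The function names are made up for illustration, the throughput and response time reuse the example figures from earlier in the article (24 requests/sec and 0.4 seconds), and the 5-second think time is an assumption.

```python
def concurrent_users(throughput_per_s, response_s, think_s):
    """N = X * (R + Z)"""
    return throughput_per_s * (response_s + think_s)


def throughput_demand(users, response_s, think_s):
    """X = N / (R + Z), the same law solved for throughput."""
    return users / (response_s + think_s)


# X = 24 requests/sec, R = 0.4 s, Z = 5 s (assumed think time).
n = concurrent_users(throughput_per_s=24, response_s=0.4, think_s=5)
x = throughput_demand(users=n, response_s=0.4, think_s=5)
print(f"Estimated users: {n:.1f}")
print(f"Throughput recovered from users: {x:.1f} requests/sec")
```

Under these inputs the estimate works out to roughly 130 users, and it changes with the assumed think time: a longer think time means the same throughput represents more users.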

For more information on the Interactive Response Time Law see:

Apache JMeter released under the Apache License 2.0. Apache, Apache JMeter, JMeter, the Apache feather, and the Apache JMeter logo are trademarks of the Apache Software Foundation.

AaronLopez · ‎07-18-2021

Updates to the following sections:

Testing Framework
Bottleneck
Interactive Response Time Law

Additions of the following sections:

Testing Framework Architecture