ArcGIS Monitor for Administrators – Diagnosing System Health

Blog Post created by SRajagopalan-esristaff on Apr 3, 2019

Multiple metrics define the health of your enterprise GIS. ArcGIS Monitor was created so that administrators can quickly identify and deal with problems in an ArcGIS deployment, whether they are in the software or elsewhere in the stack. To properly diagnose whether GIS systems are healthy, administrators need to know when alerts happen, how critical they are, and what their root cause is.

 

Alerts

When the system is not healthy, there might be several hundred alerts in your dashboard. To help you analyze them, ArcGIS Monitor divides alerts into three categories: critical, warning, and info. Addressing critical alerts is the top priority for every administrator, as they impact the availability and smooth running of your enterprise GIS. Warning alerts indicate resources running low, such as memory, disk, CPU, or network bandwidth. Info alerts are log entries that are informational for the administrator.
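The priority order described above can be sketched in a few lines of Python. This is only an illustration of the triage idea, not ArcGIS Monitor's API: the `Alert` class, its fields, the `SEVERITY_RANK` table, and the sample alerts are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical ranking: a lower number means "handle first".
SEVERITY_RANK = {"critical": 0, "warning": 1, "info": 2}


@dataclass
class Alert:
    severity: str  # "critical", "warning", or "info"
    source: str    # hypothetical name of the monitored component
    message: str


def triage(alerts):
    """Return alerts ordered critical first, then warning, then info."""
    # sorted() is stable, so alerts of equal severity keep their original order.
    return sorted(alerts, key=lambda alert: SEVERITY_RANK[alert.severity])


def count_by_severity(alerts):
    """Count how many alerts fall into each category."""
    return Counter(alert.severity for alert in alerts)


# Made-up sample data, for illustration only.
alerts = [
    Alert("info", "portal", "Log entry recorded"),
    Alert("warning", "server-1", "Free memory is running low"),
    Alert("critical", "data-store", "Data store is not reachable"),
]

for alert in triage(alerts):
    print(alert.severity, alert.source, alert.message)
```

With several hundred alerts on a dashboard, a per-category count such as `count_by_severity(alerts)` gives a quick sense of how much needs immediate attention.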

 

There are several ways to investigate an error: you can view when it occurred, parse the log entries for that time, or click the included log error links and admin URLs to check site details.

 

Root Cause Analysis

While administrators need to know when alerts happen, they also need to understand the root cause of a problem: its source and its impact. For example, an outage of an ArcGIS Data Store impacts all of the tiers above it. The source, in this case, is the data store, and the impact is the ArcGIS Server and portal sites affected by the outage.
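One way to reason about source and impact is to walk a dependency map upward from the failed component. The sketch below assumes a hypothetical three-tier deployment; the component names and the `DEPENDS_ON` map are invented for illustration and are not read from ArcGIS Monitor.

```python
from collections import deque

# Hypothetical deployment: each component lists what it depends on.
DEPENDS_ON = {
    "portal": ["arcgis-server"],
    "arcgis-server": ["data-store"],
    "data-store": [],
}


def impacted_by(source, depends_on):
    """Return every component that directly or indirectly depends on source."""
    # Invert the map so each component points at the components that rely on it.
    dependents = {name: [] for name in depends_on}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(name)

    # Breadth-first walk upward through the tiers, starting at the source.
    impacted = []
    seen = {source}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for name in dependents.get(current, []):
            if name not in seen:
                seen.add(name)
                impacted.append(name)
                queue.append(name)
    return impacted


print(impacted_by("data-store", DEPENDS_ON))  # ['arcgis-server', 'portal']
```

In this made-up map, the data store is the source and the two tiers above it are the impact, which mirrors the example in the text.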

 

One of the most common root causes is system overload. When the system receives load exceeding its capacity, the result is excessive resource utilization, such as 100% CPU, zero free memory, or zero idle disk. This, in turn, lowers performance, causes timeouts, and impacts the overall stability of the enterprise implementation.

 

Another common root cause is a system bottleneck, which degrades performance and stability even while resource utilization is low. Bottlenecks manifest under increased user load, like the load described in the overload case above.
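The difference between overload and a bottleneck can be expressed as a simple rule: exhausted resources point to overload, while slow responses with plenty of spare resources point to a bottleneck. The following is a rough sketch under stated assumptions. The `Sample` fields and all three thresholds are hypothetical values chosen for illustration, not ArcGIS Monitor defaults.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    cpu_pct: float          # CPU utilization, 0-100
    free_memory_pct: float  # free memory as a share of total, 0-100
    response_time_s: float  # average response time in seconds


# Illustrative thresholds only; tune them for your own system.
HIGH_CPU_PCT = 90.0
LOW_FREE_MEMORY_PCT = 5.0
SLOW_RESPONSE_S = 2.0


def classify(sample):
    """Label a sample as 'overload', 'possible bottleneck', or 'healthy'."""
    saturated = (
        sample.cpu_pct >= HIGH_CPU_PCT
        or sample.free_memory_pct <= LOW_FREE_MEMORY_PCT
    )
    slow = sample.response_time_s >= SLOW_RESPONSE_S

    if saturated:
        # Resources are exhausted: load exceeds capacity.
        return "overload"
    if slow:
        # Slow responses while resources sit idle suggest a bottleneck.
        return "possible bottleneck"
    return "healthy"


print(classify(Sample(cpu_pct=100.0, free_memory_pct=1.0, response_time_s=5.0)))
print(classify(Sample(cpu_pct=20.0, free_memory_pct=60.0, response_time_s=5.0)))
```

A rule like this only narrows the search. Confirming a bottleneck still means finding the constrained component, which the sketch does not attempt.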

 

Lastly, unstable infrastructure is another cause to look for. Service restarts, permission changes, expired passwords, or virtualization overallocation can impact system stability. Examples include unexpected processes consuming memory, CPU usage spikes, stopped ArcGIS Server services, reboot conditions, and databases that are not running.

 

ArcGIS Monitor provides reports that speak the language of administrators, making it easier to diagnose the health of your enterprise GIS and to manage GIS hardware and infrastructure needs. Monitor shows you where the issues are through quantifiable key performance indicators and metrics.

 

This video demonstrates the key features mentioned above.
