
Return on Investment, commonly abbreviated as ROI, is an important goal for any GIS project that should be calculated, documented, and shared.  Check out this Story Map to learn more about GIS ROI and view many real-world examples.

 

https://arcg.is/1Wi0CO 

If you're headed to the 2019 Esri International User Conference and are interested in sessions for GIS Managers, here is a link to the GIS Manager Track:

 

https://userconference2019.schedule.esri.com/schedule?filters=1964269831

Are you a GIS manager, leader or other executive headed to the 2019 Esri International User Conference (UC)?  I know it can be a challenge creating your personal agenda for the world's largest GIS conference, so I created this flier to assist.  It covers suggested events and activities you should consider when deciding how to spend your valuable time at UC.  I hope you have a productive UC experience, and I hope to see you there!

 

UPDATED 6/24/2019 - Added GIS Manager Track and Get Advice from Esri Services section.

 

UPDATED 6/13/2019 - Corrected the name of the Implementing ArcGIS area in the Expo to “Guiding your Geospatial Journey”

 

FYI there are other Esri UC fliers here: https://community.esri.com/community/events/user-conference/content?filterID=contentstatus%5Bpublished%5D~category%5B2019-uc-industry-fliers%5D

Introduction

To cloud or not to cloud is the question that many organizations are currently facing. While on-premises data center technology is not necessarily on the verge of extinction, cloud computing is an option with many benefits, including scalability, agility, and cost efficiency.

 

The ArcGIS platform is supported both on premises and in cloud environments such as Microsoft Azure and Amazon Web Services (AWS). By leveraging these components, you can expose GIS content and capabilities as web services and consume those services in your apps. This enables users to access and apply useful GIS resources in their work.

Moving to the cloud is more than just upgrading your servers and software. It represents a radical change in the way that you manage technology. This shift gives you a unique opportunity to align technology with your overall business vision, which in turn can help you grow your business, improve productivity, and reduce costs.

 

Before starting, document your current infrastructure. Compile a list of your servers, storage systems, network components, off-the-shelf software, bespoke software, and subscriptions to gain a full picture of your current technology. This information is critical for helping you determine the best path forward.

 

The next step is to analyze your current infrastructure against your user workflows. Your desktop workflows, integrations, security policies, SLAs, data storage requirements, and authorization and authentication providers will all play a large part in designing your cloud infrastructure. We will discuss these topics briefly in this article and go into the specifics of each in future blog posts.

 

General Database Considerations

Moving your Geodatabase to the cloud is a big step. You need to consider where the clients accessing the data are located (in the cloud or on premises) and what type of work is being performed (read-only, frequent edits, etc.).

 

General Network Bandwidth and Speed

Moving all of your infrastructure to the cloud requires a resilient, reliable Internet connection and, most importantly, low latency. Low bandwidth can be a serious problem when a large number of employees share the same connection, and some cloud solutions don't offer enough bandwidth, especially for uploading larger files such as raw satellite imagery. Latency, however, is the primary concern, because network latency is a key factor in cloud architecture decisions; for example, because of latency it is usually not practical to keep desktops on premises while moving the database into the cloud.
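If you want a rough baseline before committing to a design, a quick script run from an on-premises workstation can sample round-trip latency to a candidate cloud endpoint. This is a minimal sketch; the endpoint URL is a placeholder, not a real host.

```python
import statistics
import time
import urllib.request

ENDPOINT = "https://my-cloud-host.example.com/arcgis/rest/info?f=json"  # placeholder URL

samples = []
for _ in range(10):
    start = time.perf_counter()
    try:
        urllib.request.urlopen(ENDPOINT, timeout=10).read()
    except OSError as exc:
        print(f"Request failed: {exc}")
        continue
    samples.append((time.perf_counter() - start) * 1000.0)  # milliseconds

if samples:
    print(f"Median round trip: {statistics.median(samples):.1f} ms "
          f"(min {min(samples):.1f}, max {max(samples):.1f})")
```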

 

Assessing ArcGIS Desktop Workflows

First and foremost, do you need to provide this functionality through ArcGIS Pro or ArcGIS Desktop, or can it be replaced with a web-based app? Replacing it would save you from setting up a dedicated machine or application instance in the cloud.

Is the data you are processing 2D or 3D? ArcGIS Pro requires a GPU, and the resource demands are even higher when processing 3D data. GPU-enabled machines in the cloud cost considerably more than machines without a GPU. Do you have advanced geoprocessing workflows that require hefty computing power, and do those workflows run overnight? Check out https://pro.arcgis.com/en/pro-app/get-started/virtualization-overview.htm for more information about running ArcGIS Pro in the cloud.

 

ArcGIS Server Sites and Portal for ArcGIS Sites

The least difficult way to start using the cloud to host your own software is to perform a lift-and-shift migration, where you move software currently running on-premises to the cloud host. The software behaves exactly the same after the move; however, it now runs with the benefits of the underlying cloud infrastructure, such as affordable, reliable virtual machines that require neither high upfront purchasing costs nor much ongoing maintenance of the infrastructure.

 

Using specialized deployment tools can make it easier to install and configure the software on certain cloud platforms. You can also create your own machine images to host Esri software in cloud environments. These deployment tools don’t just install and configure the software; they also provision and set up the underlying infrastructure, including the virtual machines, load balancers, networking, and storage.
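As an illustration of what "provision and set up the underlying infrastructure" can look like in practice, the sketch below launches a deployment template with the AWS SDK for Python (boto3). The template URL and parameter names are placeholders, not the schema of any particular Esri-provided template.

```python
import boto3

cloudformation = boto3.client("cloudformation", region_name="us-east-1")

response = cloudformation.create_stack(
    StackName="arcgis-enterprise-base",
    # Placeholder template location; an actual deployment would point at the
    # template supplied by your chosen deployment tooling.
    TemplateURL="https://example-bucket.s3.amazonaws.com/arcgis-template.json",
    Parameters=[
        {"ParameterKey": "InstanceType", "ParameterValue": "m5.2xlarge"},  # hypothetical keys
        {"ParameterKey": "KeyPairName", "ParameterValue": "my-keypair"},
    ],
    Capabilities=["CAPABILITY_IAM"],
)
print("Stack creation started:", response["StackId"])
```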

 

Bespoke GIS Applications

Whether you have geocentric applications, geo-enabled applications, or composite applications, you will need to reassess their architecture and ensure compatibility with the new cloud environment. Check out developers.arcgis.com for information about our wide coverage of SDKs and APIs.

No single application integration pattern fits all situations. You can use the application pattern that best combines capabilities from ArcGIS and your business system to deliver the greatest impact.

 

Integration with Other Systems

Application integration lets you deliver solutions that combine data and tools from different systems—including your GIS as well as business systems like permitting, licensing, and asset management systems. With integrated solutions, you can improve cross-functional business processes and provide decision-makers with integrated views of your organization’s information. Ensuring that these systems integrate prevents duplication of information and keeps the right data available when you need it. You need the option of deploying your integration technology so that it works in both environments. To decide whether your integration technology will reside in the cloud or in a rack in your data center, start with the specific requirements of your business.

 

Security

Security is not only an important decision to avoid becoming a statistic, but also one that may be government regulated. Depending on your industry, market, and location, you may have to abide by an array of rules determining how you use and store sensitive data. Cloud security has come a long way, though, and is arguably as secure as most private data centers. In either case, the security posture and regulations of your industry are something to think about when weighing cloud against on-premises. With either option, security is granularly configurable to meet your needs. Consider Esri best practices for configuring secure ArcGIS Server and Portal for ArcGIS environments, as well as your organization's needs and policies, when designing your ArcGIS Enterprise site security.

 

High Availability and Redundancy

Designing HA environments in Azure or AWS follows much the same design principles as on-premises; however, cloud providers offer additional functionality such as on-demand scalability. You can configure your site so that ArcGIS Server machines are added in response to certain triggers, such as CPU usage. New servers can be created in a matter of minutes, allowing your site to gracefully respond to abrupt spikes in traffic. When you no longer need the instances, you can destroy them and incur no further infrastructure charges for them. Installation is also easier because machine images are already provided for faster installation times.
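For a sense of what a CPU-based trigger can look like on AWS, here is a minimal sketch using boto3. The Auto Scaling group name and thresholds are illustrative assumptions; Esri's cloud deployment tooling may configure equivalent rules for you.

```python
import boto3

autoscaling = boto3.client("autoscaling")
cloudwatch = boto3.client("cloudwatch")

# Scale-out policy that adds one machine to a hypothetical ArcGIS Server group.
policy = autoscaling.put_scaling_policy(
    AutoScalingGroupName="arcgis-server-site",   # hypothetical group name
    PolicyName="scale-out-on-cpu",
    AdjustmentType="ChangeInCapacity",
    ScalingAdjustment=1,
    Cooldown=300,
)

# Alarm that fires the policy when average CPU stays above 70% for 10 minutes.
cloudwatch.put_metric_alarm(
    AlarmName="arcgis-server-high-cpu",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Statistic="Average",
    Period=300,
    EvaluationPeriods=2,
    Threshold=70.0,
    ComparisonOperator="GreaterThanThreshold",
    Dimensions=[{"Name": "AutoScalingGroupName", "Value": "arcgis-server-site"}],
    AlarmActions=[policy["PolicyARN"]],
)
```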

 

Backups and Disaster Recovery

Cloud computing has led to a new way of preparing for IT disasters by providing secondary environments for backing up and restoring data and failing over business applications. These disaster recovery services are cost-effective and straightforward to set up. ArcGIS Server and Portal for ArcGIS include utilities that you can use to create backups and restore your sites on Azure and AWS.

 

Access Management Providers

When implementing a cloud-based application, it can be a challenge to ensure that information security and access management controls are applied to it.

 

Organizations should ensure that the single sign-on (SSO) access management is implemented in the cloud. In particular, organizations need to understand how employees can seamlessly access the various cloud-based applications using their existing access management protocol.

 

Portal for ArcGIS is compliant with SAML 2.0 and integrates with identity providers that support SAML 2 Web Single Sign-On. The advantage of setting up SAML is that you do not need to create additional logins for users to access your ArcGIS Enterprise portal; instead, they use the login that is already set up in an enterprise identity store. For example, Microsoft Azure Active Directory is a SAML-compliant identity provider. You can configure it as your identity provider for enterprise logins in Portal for ArcGIS on-premises and in the cloud.

 

I hope you find this helpful. Do not hesitate to post your questions here: ArcGIS Architecture Series: Moving to the Cloud

If you want to learn how to redefine yourself from mapmaker to solution provider, I will be presenting The Underutilization of GIS & How To Cure It in a webinar on Tuesday, June 18 at 10 AM Pacific:

 

https://www.esri.com/en-us/landing-page/industry/government/2019/underutilization-of-gis-and-how-to-cure-it

Note: This blog post is the third in a series of three planned posts about egdbhealth.  The first introduced the tool and described how to install it and how to run it.  The second in the series addressed how to use the tool to evaluate the health of an Enterprise Geodatabase, its primary purpose.  This article addresses using egdbhealth in a system design context.

Introduction

Egdbhealth is a tool for reporting on various characteristics of Enterprise Geodatabases (eGDBes).  The primary purpose of the tool is to evaluate the “health” of eGDBes.  However, the output can also be used in a system design context.  This article addresses the system design use case. 

 

For information about installing and running the tool, please refer to the first blog post in this series "What is Egdbhealth?"

 

Information Objectives for System Design

The Esri system design practice focuses on planning the hardware, software, and network characteristics for the future state of systems based on new or changing requirements. 

 

The current health of an existing system will not necessarily have a strong relationship to a future system that has different requirements.  However, depending on the design objectives, information about the current system can be relevant. 

 

For example, in the case of a planned migration from an on-premises system to a cloud platform, it would be quite useful to describe the current system such that it can be faithfully rendered on a cloud platform.  Or, requirements driving a design may indicate a need to exchange (replicate, move, synchronize, transform, copy, etc.) large portions of a Geodatabase to another repository.  In that case, an inventory of the Geodatabase content can be useful to establish important details about the nature and quantity of data that would need to be exchanged and the optimal ways for that data exchange to occur. 

 

Thus, at a high level, it is primarily the descriptive information from egdbhealth, rather than the evaluative information, that is pertinent to system design.

 

This article discusses examples for a SQL Server backed eGDB.  However, the principles are the same for eGDBes that are backed by Oracle and PostgreSQL.

 

Machine Resources and Utilization

For system design cases where the eGDB machine platform will change, it can be useful to understand the current machine resources that support the eGDB and their degree of utilization.  For example, if you are migrating an eGDB to a cloud platform, the number of processor cores that the system has on premises has some relevance to the number you might deploy on the cloud.

 

The Machine

The HTML metadata file produced by egdbhealth will provide resource information about the machine platform on which the RDBMS runs. 

 

In the example below, the SQL Server instance runs on a machine with one processor which presents 8 logical processor cores. 

 

Machine resources in HTML metadata

 

This information from SQL Server leaves some questions open.  For example, SQL Server is not able to report on whether the logical cores are a result of hyperthreading.  If they are, then the physical cores would be half of the logical cores (4 physical cores).  And, it is the number of physical cores that is most useful for capacity planning in system design.  So, this information is good but imperfect.
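If you can run a small script on the database machine itself, the hyperthreading question is easy to settle.  This is a minimal sketch that assumes the psutil package is available; it is not part of egdbhealth's output.

```python
import psutil

logical = psutil.cpu_count(logical=True)    # what SQL Server sees
physical = psutil.cpu_count(logical=False)  # what capacity planning wants
print(f"Logical cores: {logical}, physical cores: {physical}")

if logical and physical and logical > physical:
    print("Hyperthreading appears to be enabled.")
```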

 

The HTML also reports the physical memory on the machine (16GiB) and the minimum and maximum memory configured for the SQL Server instance (0 and 3GB).  Here we see that the configuration of the SQL Server instance is quite relevant to understanding the memory resources available for capacity planning.  Although the machine has 16GiB of memory, this SQL Server instance (and the eGDB it supports) does not have access to all of that memory. 
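If you want to confirm those instance limits directly against SQL Server, they can be read from the sys.configurations view.  A hedged sketch using pyodbc follows; the connection string values are placeholders.

```python
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;DATABASE=master;"
    "UID=sde;PWD=***"  # placeholder credentials
)
cursor = conn.cursor()
cursor.execute(
    "SELECT name, value_in_use FROM sys.configurations "
    "WHERE name IN ('min server memory (MB)', 'max server memory (MB)')"
)
for name, value in cursor.fetchall():
    print(f"{name}: {value} MB")
```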

 

The Utilization

The characteristics of the machine and the configuration of the instance offer incomplete insight into the degree to which the machine resources are utilized and what resources are truly needed for the current workload. 

We can be certain that SQL Server does not use more than 3GB of memory.  But, does it use that memory thoroughly?  And, what about the processor resources; are they busy or idle?

 

Much of the utilization information appears in the Expert Excel file because it is relatively volatile.

 

Processor

In the case of SQL Server, there is an evaluation in the Expert Excel file called “ProcessorUtil” (Processor Utilization; the Category is “Instance”).  It provides a sample of the recent per-minute processor utilization by SQL Server on the machine.  In this case, we can see that, for the sampling period, SQL Server uses almost none of the total processor resources of the machine on which it runs.

 

Processor utilization

 

From this, we infer that, if this period is typical, the existing machine has many more processor resources than it needs.  If all else is equal, the future system design can specify fewer processor cores to support the eGDB workload.  Naturally, for this inference to have any integrity, you must run the tool during typical and/or peak workload periods.

 

There is another evaluation, CpuByDb (Instance category), which reports which databases in the instance consume the most processor resources.  So, in the case that the instance did have significant processor utilization, this information could be used to determine whether that processor utilization related to the eGDB of interest or some other workload in the same SQL Server instance.

 

Processor utilization by database

 

Memory

In the case of SQL Server, there are several evaluations which provide insight into the memory utilization inside the instance.  Most of these are in the Instance category of the Expert Excel file.

 

MemoryByDb allows you to see the amount of memory (in this case, rows of data as opposed to SQL statements or other memory consumption) used by each database.

 

Memory (database pages) utilization by database

 

In this case, we see that the “GIS” eGDB is the primary memory consumer in the SQL Server instance.  So, it appears reasonable to understand that the eGDB can make use of the better part of the 3GB of instance memory.  But, is that too much or not enough for good performance and system health?

 

The MemoryMetrics evaluation reports on several memory metrics that can offer hints about how in-demand the instance memory is.  In this case, we see that there are warnings suggesting that the memory available to the instance may be too low for the workload.  Thus, we do not know if 16GiB of machine memory is too much or too little.  But, we do have reason to believe that 3GB of instance memory is too little.

 

Memory metrics

 

There are several other memory evaluations in the Expert Excel file.  It takes experience and judgement to integrate these observations into an informed conclusion about the true memory needs of the system.  In this case, as in many cases, the inferences that one can make are incomplete.  But, they are better than nothing.

 

Storage

SQL Server does not provide complete information about the local storage on the machine on which it runs.  However, it does provide information about the size of the files in which the eGDB is stored.  Much of this information is available in the “Database” category of the Expert Excel file.

 

The FileSpace evaluation sheet reports on the files in which the eGDB is stored.  In this example, we see that there is a single data file (“ROWS”) which is about 6GB in size and a single transaction log file (“LOG”) which has used about 650MB in the past.  You can also see the maximum file limits established by the DBA, presumably being an indication of the maximum expected storage for the eGDB.

 

Database file sizes

 

These data are good indications of the amount of storage that would be needed for a “lift and shift” scenario where the entire eGDB will need to move to a different platform.

 

Data Inventory and Data Exchange

Another system design case is one where eGDB data needs to be exchanged with a different repository, perhaps another Geodatabase in a different data center.  In these cases, it may be relevant to establish the storage size and other characteristics of the data that will be exchanged, as opposed to the storage size of the entire eGDB.

 

Record Counts and Storage By ObjectClass

If you know the identity of the ObjectClasses to be exchanged, there is information in the Context Excel file to report on the storage size and record counts of those items. 

 

In the Storage category, there is a sheet called “ObjClass” (ObjectClasses).  It reports, in considerable detail, the tables from which various eGDB ObjectClasses are composed, the number of rows that they contain, and the amount of storage that is allocated to each of them. 

 

In the example below, there are three ObjectClasses of interest (filtered in the first column): BUILDINGS_NV, COUNTRIES, and SSFITTING.  BUILDINGS_NV has about 4,400 records and consumes just over 1.25MB of space.  COUNTRIES is versioned, with 249 records in the base table, 66 adds, and 5 deletes (the TableName column calls out the “A” and “D” table names for versioned ObjectClasses).

 

ObjectClass records and storage

 

Thus, we can establish the record counts and approximate storage characteristics of ObjectClasses of interest in the eGDB.  At the same time, we can also see which ObjectClasses are registered as versioned.  This is critical information for establishing the range of options available for data exchange.

 

Versioning, Archiving, and Replication Status

In the same Context Excel file, the ObjClassOverview sheet (Inventory category) provides an overview of the versioning, archiving, and replication statuses of all of the ObjectClasses in the eGDB.  This provides a convenient way to plan the range of data interchange mechanisms that would be appropriate for different ObjectClasses.

 

ObjectClass overview

 

Distributed Data Design Consideration

The ArcGIS documentation refers to this thematic area as “distributed data” and offers general guidance about the various strategies that can be used to move data around: http://desktop.arcgis.com/en/arcmap/latest/manage-data/geodatabases/understanding-distributed-data.htm.

 

Versioned Data and Geodatabase Replication

If an ObjectClass is not already registered as versioned it may or may not be appropriate to do so for the purposes of making use of Geodatabase Replication.  This relates to how the data is updated.  Most data that is registered as versioned is updated “transactionally”, meaning that only a small percentage of the total records are changed at any given time.  This kind of updating is compatible both with versioning and Geodatabase Replication.  Other data is updated with methods like “truncate and add”, “delete and replace”, or “copy and paste”.  In these cases, most of the records are updated, physically (even if the majority of record values happen to be the same, the physical records have been replaced).  That kind of updating is not particularly well-suited to Geodatabase Replication.  The information in egdbhealth can tell you about the versioning and replication status of ObjectClasses.  But, it does not know how those ObjectClasses are actually updated.  Hopefully, those that are registered as versioned are transactionally updated and therefore suitable candidates for Geodatabase Replication.

 

Archiving Data and Data Exchange

If an ObjectClass has archiving enabled, there may be additional considerations.  For example, if the planned data exchange might increase the rate of updates, this could cause the quantity of archive information to grow in ways that are undesirable. 

 

For versioned ObjectClasses that are archive-enabled, the archive information is stored in a separate “history” table.  The size of that table does not impact the query performance of the versioned data.  In this case, the main design consideration is whether it is practical to persist and use all of the archiving information.  In many cases, it will be.  But, making the determination requires knowing the archiving status of the ObjectClass and the planned rate of updates.

 

In the case of non-versioned ObjectClasses that are archive-enabled (and branch-versioned ObjectClasses), the archive information is in the main table.  Thus, rapid growth of archive information can change the performance tuning needs of ObjectClasses and/or the eGDB.  Egdbhealth cannot know the future update needs for the data.  However, it can report on the historical patterns of updates to non-versioned, archive-enabled ObjectClasses.  The Context Excel file contains a “NvArchEditDatesSql” sheet that contains SQL statements for each non-versioned, archive-enabled ObjectClass in the eGDB.  These statements are not run automatically because they are resource-intensive queries that do not provide information which is relevant to all circumstances. 

 

Additional SQL statements you can run

 

In this system design case, however, establishing the rate at which records have accumulated over time may be quite pertinent to planning a data exchange strategy.  Use a SQL client such as SQL Server Management Studio to run the queries for the ObjectClasses in question to get a sense of the rate of data update.  In this example, we can see a pattern for record updates where the typical update is small (1 record), but there was at least one occasion where there was a large update relative to the size of the ObjectClass (289 records on 8 April 2018):

 

Executed SQL statement
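If you would rather script these statements than paste them into a SQL client one at a time, the sketch below loops over the sheet with pandas and pyodbc.  The workbook name, the column holding the SQL text, and the connection string are assumptions, so adjust them to match your output.

```python
import pandas as pd
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;DATABASE=GIS;"
    "UID=sde;PWD=***"  # placeholder credentials
)

# Read the pre-built statements from the Context Excel output.
sheet = pd.read_excel("SQL_GIS_Context.xlsx", sheet_name="NvArchEditDatesSql")

for sql in sheet["SqlStatement"]:  # hypothetical column name holding the SQL text
    result = pd.read_sql(sql, conn)
    print(result.head())
```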

 

Attachment Size and Mobile Synchronization

Another design case which relates to distributed data is field-based data collection and attachments.  Collector for ArcGIS makes use of efficient mechanisms that are similar to Geodatabase Replication to synchronize “deltas” between client and server.  However, the ability to collect large quantities of data in attachments (such as images) and the often limited network conditions in which the data can be synchronized can raise system design issues.

 

The Expert Excel file has an evaluation “LargeAttachments” that will automatically bring cases of likely concern to your attention.  In the example below, the threshold values were not violated, so there was no negative evaluation:

 

There may be no records when there is no negative evaluation

 

However, it may still be useful to know something about the nature of the existing attachments.  The Context Excel file contains a sheet “AttachmentSizeSql” (“Inventory” category) that contains several SQL statements for each ObjectClass that has attachments. 

 

Continue to explore with additional SQL statements

 

Running these SQL statements in a client such as SQL Server Management Studio allows you to statistically characterize the attachments in various ways: aggregate statistics, top 100 by size, and percentile distribution. 

 

Additional SQL statements executed

 

With these statistics, you can understand not only the total amount of information, but whether it is due to a large number of small attachments or a small number of large attachments.  Large attachments might be a greater risk for synchronizing over poor network conditions and might lead you to make a recommendation that data collection operate in a disconnected mode and synchronization occur under specific network conditions only.

 

Attachment size considerations might also lead you to offer recommendations about whether and how it would be practical to handle two-way synchronization (as opposed to one-way).  While field-based users might have sufficient network circumstances to upload the attachments that they collect individually, two-way synchronization would mean that all of the attachments collected by other individuals (in the area of synchronization) would have to be downloaded.  In some cases, this could be many times the amount of information that would need to be uploaded.

 

Summary

As a designer of GIS systems, it is not practical for you to have profound expertise in all of the technology areas that pertain to the system.  For this reason, your understanding of egdbhealth outputs and what you can do with them has limits.  The purpose of this article is to identify some of the most common use cases for egdbhealth outputs for system design.  The main areas of interest are (1) machine sizing / resource utilization and (2) data characteristics relevant to data interchange.  Following the examples in this article, and generalizing from them for the Oracle and PostgreSQL databases, will allow you to extract useful information from the egdbhealth tool for your design work.

 

I hope you find this helpful. Do not hesitate to post your questions here: https://community.esri.com/thread/231451-arcgis-architecture-series-tools-of-an-architect

 

Note: The contents presented above are recommendations that will typically improve performance for many scenarios. However, in some cases, these recommendations may not produce better performance results, in which case, additional performance testing and system configuration modifications may be needed.

Note: This blog post is the second in a series of three planned posts about egdbhealth.  The first in the series described what the tool is, how to install it, and how to execute it.  The third in the series will address using egdbhealth in a system design context.

Introduction

Egdbhealth is a tool for reporting on various characteristics of Enterprise Geodatabases (eGDBes).  This article discusses how to use the outputs of egdbhealth to evaluate the health of an eGDB.  All of the examples use a SQL Server-backed eGDB.  However, similar principles apply to using the tool with Oracle- and PostgreSQL-backed eGDBes.

 

For information about installing and running the tool (i.e. creating the outputs), please refer to the first blog post in this series, "What is Egdbhealth?"

 

Viewing and Understanding Findings

The Expert Excel file contains an “OVERVIEW_EXPERT” sheet that allows you to see the evaluations at a high-level and prioritize your review of the findings.

 

Expert Excel file Overview sheet

 

This article will not describe all of the evaluations and their various meanings.  There are too many for that to be practical.  Instead, the article describes the process and provides specific examples to illustrate the kinds of benefits that can be gained.

 

Criticals

The red-filled cells in the “Criticals” column should be viewed first.  These findings are highlighted as top concerns in their respective Categories. 

 

For example, in the screen capture above, “Backups” is flagged as a critical concern.  Click on the hyperlinked cell to view the sheet with the detailed information.

 

Critical: no backups exist

 

In this case, the worksheet has a single record that says that the database has never been backed up.  This is a critical concern because if the system fails, data can be lost.  There is also a hyperlink back to the “OVERVIEW_EXPERT” worksheet.  This “Back to OVERVIEW” link appears in every detail worksheet to ease navigation.

 

In the example below, “Memory Pressure”, the detail worksheet displays memory pressure warnings reported by SQL Server.  When the RDBMS reports that there is memory pressure, it is an indication that there is, or soon will be, performance and/or scalability problems.

 

The Comments column (always found on the far right) describes the issue and the recommended course of action at a high level.  Note that the amount of information reported is much greater than the “BackUp” example (more columns) and that the information is of a highly technical nature, requiring specialized knowledge to understand. 

 

The Comments column is egdbhealth’s best effort to make the detail digestible and actionable with incomplete knowledge of the domain.  In some cases, the Comments column will provide links to Internet resources that offer more information to support a deeper understanding.

 

Here is another example that identifies tables that have geometry columns that do not have spatial indices:

 

Critical: missing spatial indices

 

The absence of spatial indices on geometry columns will degrade the performance of spatial queries.  In this case, the “Comment” column recommends that spatial indices be created (or rebuilt) to heal the problem.
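One way to act on that recommendation is with the Add Spatial Index geoprocessing tool from Python.  This is a minimal sketch; the connection file path and feature class name are placeholders for the tables flagged in the finding.

```python
import arcpy

# Placeholder path to a feature class flagged as missing a spatial index.
feature_class = r"C:\connections\gis.sde\GIS.DBO.PARCELS"

arcpy.AddSpatialIndex_management(feature_class)
print(arcpy.GetMessages())
```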

 

In this next example, the problem is that the release of the eGDB is quite old, indicating that it should be upgraded.  Note that the “Comments” column provides a link to more information (online):

 

Critical: egdb release support will expire soon

 

Warnings

“Warnings” follow the same pattern as “Criticals”.  However, as the name implies, they are a lower priority for review.  Note that a given evaluation may have both critical and warning findings.

 

In the example below, egdbhealth is reporting that there are stale or missing statistics on a variety of objects in the eGDB:

 

Warning: stale and missing RDBMS statistics

 

Depending on the details of the specific statistics, the finding is flagged as “Warning” or “Critical” in the Comment column (always at the far right). 

 

Here, in cases where no statistics information is available, the record is treated as a “Warning” because of the uncertainty.  Statistics that have information indicating that they have not been updated recently, or there have been a lot of changes since the last update, are flagged as “Critical”. 

 

The RDBMS’ cost-based optimizer uses these statistics to determine the best query execution plans.  Thus, if the statistics are not current with respect to the state of the data, the optimizer may not make good choices and the performance of the system will be sub-optimal.  
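A common first response to stale statistics on SQL Server is to refresh them database-wide.  The sketch below uses sp_updatestats via pyodbc with placeholder connection values; your DBA may prefer targeted UPDATE STATISTICS commands on the specific objects flagged in the finding.

```python
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;DATABASE=GIS;"
    "UID=sde;PWD=***",  # placeholder credentials
    autocommit=True,
)
# Refresh statistics for all objects in the current database.
conn.cursor().execute("EXEC sp_updatestats;")
```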

 

In the example below, most of the records are “Informationals”, simply reporting facts about the system.  But, there are a few rows that have “Warnings”. 

 

Warning: non-RDBMS processor utilization

 

The Warnings are noting that, for a short period of time, the machine running SQL Server had more than 10% of its processor capacity used by a process other than SQL Server itself.  This is not a condition that causes a performance or scalability problem.  However, as most RDBMS systems are intended to run on dedicated machines, this may be an indication that there are other processes that do not belong or need special attention in the administration of the system. 

 

Informationals

“Informationals” follow the same pattern as the other types of findings.  However, the information is not evaluative in nature.  As it is essentially descriptive, it could be placed in the Context Excel file.  There are a few reasons why it is in the Expert file instead:

 

  1. The findings may not always be Informational … depending on the conditions encountered.
  2. The information is relatively volatile (i.e. changes over time). The Context Excel file is designed to provide information that is relatively static in nature.

 

The example below illustrates this first case:

 

Informational: egdb license expiration

 

The licensing of this eGDB will not be a concern for many months.  But, in about six months, if the license has not been updated, this message will no longer be informational.

 

Similarly, the finding below about the underlying database file sizes could change at any time:

 

Informational: database file sizes

 

Thus, these descriptive pieces of information are reported in the Expert Excel, even though they are not currently reporting an evaluative finding that is negative.

 

Taking Action to Improve Health

Just as it is impractical to describe all of the individual evaluations in this document, it is impractical to provide action instructions for each one.  Instead, this article discusses the process of understanding and acting on the evaluative information, along with specific examples.

 

The process involves the following steps:

 

  1. Understand the evaluation
  2. Validate the evaluation
  3. Try to resolve the evaluation
  4. Validate the resolution

 

Understand the Evaluation

Some evaluations are easier to understand than others.  In those fortunate cases where the “Comments” column adequately communicates the concern, this step happens automatically.  In other cases, some research may be appropriate.

 

For example, the findings below report that Checkdb has never been run on the databases in this SQL Server instance (it flags the eGDB as critical, whereas the other databases are warnings):

 

Checkdb warnings

 

If you are not already familiar with Checkdb, an Internet search for “SQL Server Checkdb” will return results to help you understand.  In many cases, a modest research effort such as this will be all that is necessary to understand an evaluation which is in a topic that is unfamiliar to you.

 

In this case, an Internet search would likely surface the following links, offering more information and suggested actions: https://docs.microsoft.com/en-us/sql/t-sql/database-console-commands/dbcc-checkdb-transact-sql?view=sql-server-2017, https://www.mssqltips.com/sqlservertip/4381/sql-server-dbcc-checkdb-overview/, and https://www.brentozar.com/blitz/dbcc-checkdb-not-run-recently/.  In short, Checkdb runs a variety of internal checks on the database to identify possible corruption and other issues.  So, it is good to run it once in a while to avoid such problems.
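If you decide to act on the finding, CHECKDB itself is a single T-SQL command.  A minimal sketch from Python follows (connection values and the database name are placeholders); most teams would instead schedule this in a SQL Server Agent maintenance job.

```python
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;DATABASE=master;"
    "UID=sa;PWD=***",  # placeholder credentials
    autocommit=True,
)
# Severe consistency errors surface as exceptions; otherwise the command
# completes quietly because informational messages are suppressed.
conn.cursor().execute("DBCC CHECKDB ('GIS') WITH NO_INFOMSGS;")
```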

 

Validate the Evaluations

It is useful to validate evaluations before taking action because, for a variety of reasons, the information returned may have imperfections or require some judgement. 

 

For example, in the “Instance” category below, there are 2 “Critical” Memory Pressure Warnings evaluations, but the Memory Pressure evaluation is only reporting “Informationals”, not “Warnings” or “Criticals”.

 

Various memory pressure indicators

In this case, the situation is explained by the fact that there are many different indicators of memory pressure.  At any given time, and over time, they do not necessarily all point to the same conclusion.  Thus, you must weigh the related information before concluding that action is warranted (and what action is warranted).

 

In other cases, the evaluations may benefit from your judgement about the detailed information provided in the findings sheet.  For example, this detail about “Long Elapsed Time Queries” has surfaced that there are some queries that spend a very long time in SQL Server.

 

Queries with long elapsed times

 

In the first row, there is a query which has an average duration of 72 seconds (third column).  However, it has only been executed 6 times in the period that these statistics cover. 

 

Egdbhealth does not know the period of the statistics (perhaps they were just flushed a few moments ago).  And, egdbhealth does not know if 6 executions is a lot or a little.  Here, it is more than other queries, but it is not many in absolute terms.  Finally, egdbhealth does not really know what “slow” is for this particular query.  Perhaps this supports a “batch” process that is expected to take a long time.  To make this determination, you would scroll over to the right (not in this screen capture) to view the SQL statement to see what the query is doing.  Then, you can make an informed judgement, based on how your system is used, and the reasonable expectations that users have for its performance, about whether or not these queries with “long elapsed times” are ones that should be actionable for you.

 

Try to Resolve the Evaluation

Your understanding of the evaluation will guide your efforts to address the problem.  In some cases, such as the one below, egdbhealth will point to Internet-based resources that will help you plan and carry out the actions.

 

Some comments provide hyperlinks to additional information

 

In this case, egdbhealth recognized that the SQL Server instance is running on virtual hardware.  In the case of VMware (and perhaps other platforms), best practice advice suggests that the minimum and maximum server memory should be set to the same value.  Once you understand it, this change is relatively straightforward to make and may require only a brief consultation with the virtual machine platform team to confirm that it corresponds with best practices in their minds also.
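For reference, the change itself is a pair of sp_configure calls.  The sketch below sets both limits to 8GB; the value and connection string are illustrative only, and the right number depends on your virtual machine's memory reservation.

```python
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;DATABASE=master;"
    "UID=sa;PWD=***",  # placeholder credentials
    autocommit=True,
)
cursor = conn.cursor()
cursor.execute("EXEC sp_configure 'show advanced options', 1; RECONFIGURE;")
cursor.execute("EXEC sp_configure 'min server memory (MB)', 8192;")  # illustrative value
cursor.execute("EXEC sp_configure 'max server memory (MB)', 8192;")
cursor.execute("RECONFIGURE;")
```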

 

In other cases, egdbhealth’s guidance will be more oblique and you will need to rely upon specialists within your organization, Esri Technical Support, or your own Internet research to come up with an action plan.  

 

Sometimes actions will involve a considerable amount of organizational and/or system change.  In the example below, egdbhealth is suggesting that the performance of the versioning system could be improved by having less hierarchy in the version tree.  Changing the way versioning is used by an organization is a major undertaking that requires planning and time.  In this case, you can expect to spend time planning changes, socializing them within your organization, and then carrying them out.

 

Version tree hierarchy refactoring advice

 

Validate the Resolution

Running egdbhealth again after your initial efforts to resolve the evaluation(s) will effectively validate whether or not your efforts succeeded.  Note that, when you run egdbhealth again on the same eGDB, the prior Expert Excel file is placed in the “archive” subdirectory for your reference.  (The Context Excel file is not re-created, because its information is less volatile.)

 

Naturally, you hope to find all of the “Criticals” or “Warnings” that you addressed have disappeared in the new Expert Excel output.  And, this can be expected where you have correctly understood the problem and taken effective action.

 

For example, a finding such as the one below (that the most recent compress failed) will be resolved in the “OVERVIEW_EXPERT” sheet as soon as you address the problem.  In this case, as soon as you successfully compress and re-run egdbhealth, this evaluation will be resolved.

 

Failure of recent compress

 

In a few cases, however, the “Critical” or “Warning” classifications will not fully resolve themselves even though the current condition is no longer the same.   For example, the “Compress During Business Hours” evaluation reports on the recent history of compresses, not just the most recent compress.  You can expect the evaluations to remain unchanged in the “OVERVIEW_EXPERT” sheet for some time. 

 

History of compresses during business hours

 

The detail sheet and other sheets in the Versioning category will illustrate that your recent compress did not occur during business hours (if that is the case).  Thus, you have resolved the evaluation.  And, over time, egdbhealth will allow itself to agree.

 

Finally, you will find that some evaluations are volatile.  In repeated runs of egdbhealth, they will seem to be present or absent without relationship to your specific actions.  For example, the evaluation below reports on the percentage of base table records that are in the delta tables (“A” and “D” tables).  Where those percentages are high, it offers a negative evaluation.

 

Base and delta table record counts

 

The action you may have taken in response is to compress the eGDB.  The effectiveness of that action, however, would depend upon the reconciling and posting that is occurring on the system.  So, if there had been no new reconcile and post activity, the compress would not have changed the evaluation.  On the other hand, if there had been reconcile and post activity, or if a very stale version had been deleted, the compress may have resolved many of the findings.  It is also true, however, that even with the ideal reconciles, posts, and compresses, editors might be generating more updates which are populating the delta tables at the same time as you are de-populating them.

 

The “Memory Metrics” example discussed earlier in this article is another case where you can expect volatility in evaluations.  This is because memory pressure indicators will be triggered by different conditions in the database.  Your informed judgment will be required to determine whether the recurring evaluations indicate a problem that needs further action.

 

The point is that the goal of taking action is not necessarily to achieve a “clean report card” with no negative evaluations.  The goal should be to have only the evaluations that are appropriate to your system.  In the process, you will have deepened your understanding of your eGDB system and offered many tangible improvements to the users of that eGDB.

 

Summary

The primary purpose of egdbhealth is to help administrators understand and improve the health characteristics of eGDBes.  Focusing on the Expert Excel file output, and prioritizing your analysis based on the Critical/Warning/Informational classification scheme, you can address the aspects of an eGDB which are most in need of investigation.  Some of the evaluations offered by egdbhealth may require various kinds of research to understand and determine a course of action.  Colleagues, Esri Technical Support, and Internet resources can be used to build your knowledge.  When you do take action to improve the health of your eGDB, be sure to run egdbhealth again to validate and document your progress.

 

I hope you find this helpful. Do not hesitate to post your questions here: https://community.esri.com/thread/231451-arcgis-architecture-series-tools-of-an-architect

 

Note: The contents presented above are recommendations that will typically improve performance for many scenarios. However, in some cases, these recommendations may not produce better performance results, in which case, additional performance testing and system configuration modifications may be needed.

These eight videos cover the GIS Manager Track sessions from the 2018 Esri International User Conference presented July 11-12, 2018 in San Diego, CA.  They are:

 

  • Enterprise GIS: Strategic Planning for Success
  • Communicating the Value of GIS
  • Architecting the ArcGIS Platform: Best Practices
  • Increase GIS Adoption by Integrating Change Management
  • Governance for GIS
  • Moving Beyond Anecdotal GIS Success: An ROI Conversation
  • Workforce Development Planning: A People Strategy for Organizations
  • Supporting Government Transformation & Innovation

A special thanks to those that helped present these sessions: Clinton Johnson, Michael Green, Matthew Lewin, Wade Kloos, Justin Kruizenga, Andrew Sharer, and Eric_Apple

 

https://www.youtube.com/playlist?list=PLaPDDLTCmy4auFPPuXEzGYkQUQi8AG_uh

I'm proud to announce the agenda for the 2019 GIS Managers' Open Summit (GISMOS) at the Esri International User Conference on Tuesday, July 9th.  This is the 10th annual GISMOS and looks to be one of the best.  A huge shout out to the presenters: Eric John Abrams, Brandi Rank and Marvin Davis.

 

If you are headed to the User Conference and want to learn proven, real-world strategies focusing on the people/culture/business side of GIS, then please consider attending & participating in GISMOS.  Attendees need to register, but there is no additional cost.  I hope to see you there!

 

UPDATE - 6/24/2019 - Added link to GIS Manager Track, updated schedule times, added Project Management in GIS Special Interest Group, and added Cobb Co., GA Executive/Elected Official Panel Discussion.

 

UPDATE - 6/12/2019 - The location of GISMOS has changed.  It is now in the Indigo Ballroom in the Hilton Bayfront at 1 Park Blvd., San Diego, CA 92101 (adjacent to the San Diego Convention Center). Please see attached updated agenda.

 



What is Egdbhealth?

Posted by dkrouk-esristaff Employee Apr 26, 2019

 

Note: This blog post is the first in a series of three planned posts about egdbhealth.  The second in the series will address how to use the tool to evaluate the health of an Enterprise Geodatabase.  The third in the series will address using egdbhealth in a system design context.

Introduction

Egdbhealth is a tool for reporting on various characteristics of Enterprise Geodatabases (eGDBes).  It provides descriptive information about the content of the eGDB and it provides evaluative information about the eGDB.  The evaluative information is the primary purpose of the tool; to surface existing or latent problems/challenges with the eGDB.  The tool works with eGDBes backed by Oracle, PostgreSQL, and SQL Server.

 

Installation and Execution

Although it is not an extension to ArcGIS Monitor, egdbhealth is available for download from the ArcGIS Monitor Gallery: https://www.arcgis.com/home/item.html?id=ea4bbf9b46084dc49efae9889832aa22

 

Installation and Pre-Requisites

To install, simply download the .zip archive and extract the files to a directory of your choosing. 

There are some pre-requisites for running egdbhealth.  Those considerations are discussed in the documentation in the .zip archive (egdbhealth_README.docx).  But, at a high level, the pre-requisites are:

 

  1. You need a database client for the type of database (Oracle, PostgreSQL, or SQL Server) to which you are connecting (http://desktop.arcgis.com/en/arcmap/latest/manage-data/databases/database-clients.htm).
  2. If you are using “Operating System Authentication” with SQL Server, and your eGDB is owned by “dbo” (instead of “sde”), you need to run egdbhealth as a Windows user that is dbo.  In other words, you must connect as the “Geodatabase Administrator” (http://desktop.arcgis.com/en/arcmap/latest/manage-data/gdbs-in-sql-server/geodatabase-administrator-sqlserver.htm).  You must connect as the Geodatabase Administrator with Oracle or PostgreSQL, but that usually means connecting as the user called “sde”.
  3. If you are connecting to Oracle or PostgreSQL, you must have enabled the ST_Geometry type library (http://desktop.arcgis.com/en/arcmap/latest/manage-data/databases/add-the-st-geometry-type-to-an-oracle-database.htm or http://desktop.arcgis.com/en/arcmap/latest/manage-data/databases/add-the-st-geometry-type-to-a-postgresql-database.htm).

 

Execution

Double-clicking egdbhealth.exe will launch a command window and a Windows form. 

 

Follow the prompts on the form to fill out the connection information for your database.  Tooltips provide hints about the nature of the information required.  But, in principle, the information is the same as what you would provide to ArcGIS to connect to the database as the “Geodatabase Administrator”.

 

Graphical User Interface

 

The “Test” button will attempt to confirm that you can connect, with the required privileges, to the eGDB.  If the test is successful, the “Test” button will become a “Run” button.  Clicking that will close the form and begin executing the tool in the command window.

 

When the tool completes, it will open the “output” subdirectory which will contain five output files:

 

  1. An HTML file containing metadata about the queried eGDB and RDBMS target and how the connection is made.
  2. A “Context” Excel file that contains descriptive information about the eGDB and RDBMS target
  3. An “Expert” Excel file that contains evaluative information, classified as “Critical”, “Warning”, and “Information”.
  4. A png file that depicts the version tree.
  5. A png file that depicts the state tree.

 

Introducing the Output

An example of the output files is shown below:

 

Output files

 

The files will bear the “GDB Friendly Name” that you specified in the form (in this example, “SQL_GIS”). 

 

The Expert Excel File

The Expert Excel file is the main information artifact.  It provides the evaluative information about the target eGDB system.  For example, if the eGDB has not been compressed recently, there will be an evaluation that reports this as a concern.

 

The file has a summary sheet which provides an overview of the evaluations.  The evaluations are categorized by general topic (first column, “Category”) and classified (Critical, Warning, and Informational columns) such that you may prioritize your review of the information. 

 

The Description column briefly explains the purpose of the evaluation.  Where there is a red marker in the Description cell, there is a hover tip that provides yet more information about what is being evaluated.

 

Expert Excel file Overview sheet

 

Some findings have no records to report.  For example, if there is no problem with a given kind of Geodatabase Consistency, there will be no records.  However, many findings will have some number of records of various classifications. In those cases, the Name (second column) will have a hyperlink to the sheet in the workbook that has the detailed finding records.

 

The Context Excel File

The Context Excel file is similar in structure but lacks the expert (Critical-Warning-Informational) classification columns.  The information is more descriptive and less evaluative.  For example, in this file, you can find a listing of all of the Geodatabase Domains and the ObjectClasses to which they are related.  That information is neither good nor bad (i.e. not evaluative).  But, it may be useful to know.

 

Context Excel file Overview sheet

 

Records that are highlighted in green have SQL statements that you can run for additional information, as appropriate.  Usually, these SQL statements are too expensive to run on all of the content in the eGDB.  But, someone familiar with the eGDB and the issues it has may have ideas about which queries would be useful to run nonetheless.  For example, the “SqlGeomTypeSizeSql” sheet has a SQL query for each FeatureClass in the eGDB.  If you run one of these queries it will report the sizes of the geometries in one FeatureClass.  This is an expensive enough operation that it would not be appropriate to run it on all of the FeatureClasses by default.  But, if there is a FeatureClass that has a performance problem, it may be useful for you to run the query for that FeatureClass to examine the sizes of its geometries.

 

The PNGs

The PNG files (a version tree diagram and a state tree diagram) are typically of interest if your eGDB has data that has been registered as versioned.

 

The version tree graph illustrates the version tree hierarchy.  Color coding (red-yellow-green) indicates the relative degree of “staleness”, or how long it has been since the version has been edited or reconciled.

 

Version tree

 

The state tree schematic illustrates the depth and structure of the state tree (which is the detailed structure upon which the version tree relies).  The Default version is shown in red and “State Zero” is shown in green.  The further these nodes are separated, the more expensive it is for the database to return information about the Default version (the most commonly used version in most systems).

 

State tree

 

The HTML

The HTML provides some general information about the eGDB, RDBMS, and machine that is the target of the evaluation.  It may only be of passing interest in many cases.

 

HTML metadata

 

Summary

This article has described the purpose of egdbhealth, how to run it, and what its outputs are.  As the outputs of the tool contain quite a bit of technical information, other articles will address how to use the outputs for (a) understanding and improving eGDB health and (b) designing GIS systems.

 

I hope you find this helpful. Do not hesitate to post your questions here: https://community.esri.com/thread/231451-arcgis-architecture-series-tools-of-an-architect

 

Note: The contents presented above are recommendations that will typically improve performance for many scenarios. However, in some cases, these recommendations may not produce better performance results, in which case, additional performance testing and system configuration modifications may be needed.


What is VDI Anyway?

Posted by jdeweese-esristaff Employee Apr 25, 2019

Often the term "VDI" is used to describe ArcGIS Desktop/ArcGIS Pro deployed as a virtual application. The challenge is understanding what specific virtualization technology is actually being referenced when using this term, since VDI, or "Virtual Desktop Infrastructure", represents just one of several desktop virtualization options. So, the intent of this article is to define the options and differentiate what VDI truly means.

 

ArcGIS Desktop has been delivered virtually for over 20 years using what is referred to as "hosted virtual applications", which includes technologies such as Citrix XenApp (recently renamed Virtual Apps) and Microsoft Remote Desktop Services (RDS). This approach is referred to as "hosted" because the application is hosted by a single operating system which users share by initiating individual user sessions. This technology option represents a many-to-one relationship in terms of users and virtual machines. Further, the shared operating system is a server OS, such as Windows Server 2016, and not a desktop OS, such as Windows 10. Hosted virtual applications provide a means to share a single server with multiple users and are an attractive option since each user doesn't require their own dedicated virtual machine. For this approach, system resources are shared, including processors, memory, and GPU, and there isn't a practical way to assign resources at the individual user session level.

 

A more recent innovation is to provide individual virtual machines to users as "virtual desktops" where each user accesses a remote desktop deployed with a desktop operating system such as Windows 10. This includes technologies such as Citrix XenDesktop (recently renamed to Virtual Desktops) and VMware Horizon. This approach represents the true meaning of "VDI" as it is defined by a one-to-one relationship between users and virtual machines. Though this approach increases per-user deployment costs, it also provides a more isolated deployment in terms of resources since processors, memory, and GPU resources can be assigned accordingly. The ability to manage GPU resources for the virtual desktops has made this approach an attractive option for ArcGIS Pro which requires a GPU.

 

So, the next time you hear the term "VDI" used for delivering ArcGIS to users, know that it implies each user is presented with their own individual Windows desktop virtual machine and a set of assigned resources, as opposed to multiple users sharing a single server-based virtual machine and its assigned system resources.

Amazon and Esri recently published a whitepaper outlining the steps needed to set up and configure Amazon AppStream 2.0 and ArcGIS Pro.

 

Through testing, Esri and AWS outline the various classes of AppStream hosts:

 

ArcGIS 2D Workloads – stream.compute.large, stream.memory.large. Compute and Memory optimized instances are perfectly suited for ArcGIS Pro workloads that do not require a GPU.

 

ArcGIS 3D Workloads (Normal) – stream.graphics-design.xlarge. Graphics Design instances are ideal for delivering applications such as ArcGIS Pro that rely on hardware acceleration of DirectX, OpenGL, or OpenCL. Powered by AMD FirePro S7150x2 Server GPUs and equipped with AMD Multiuser GPU technology, instances range from 2 vCPUs, 7.5 GiB system memory, and 1 GiB graphics memory up to 16 vCPUs, 61 GiB system memory, and 8 GiB graphics memory.

 

ArcGIS 3D Workloads (High res) – stream.graphics-design.2xlarge or stream.graphics-pro.4xlarge. The Graphics Pro instance family offers three different instance types to support the most demanding graphics applications. Powered by NVIDIA Tesla M60 GPUs with 2,048 parallel processing cores, the three Graphics Pro instance types range from 16 vCPUs, 122 GiB system memory, and 8 GiB graphics memory up to 64 vCPUs, 488 GiB system memory, and 32 GiB graphics memory. These instance types are ideal for graphics workloads that need a massive amount of parallel processing power for 3D rendering, visualization, and video encoding, including applications such as ArcGIS Pro.
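
If you script your AppStream environment rather than building it through the console, the workload-to-instance-type mapping above can be captured directly in the provisioning code. The sketch below is a hedged example using boto3 to create a fleet for the normal 3D workload; the region, fleet name, and image name are placeholders, and the image-building, stack, and user-access steps are covered in the whitepaper linked below.

```python
# Hedged sketch: mapping the workload classes above to AppStream 2.0 instance
# types and creating a fleet with boto3. The region, fleet name, and image name
# are placeholders; building the ArcGIS Pro image, the stack, and user access
# are covered in the whitepaper linked below.
import boto3

INSTANCE_TYPE_BY_WORKLOAD = {
    "2d": "stream.compute.large",                 # 2D work, no GPU required
    "3d": "stream.graphics-design.xlarge",        # normal 3D work (AMD GPU)
    "3d_highres": "stream.graphics-pro.4xlarge",  # demanding 3D / high-res work
}

appstream = boto3.client("appstream", region_name="us-east-1")

appstream.create_fleet(
    Name="arcgis-pro-3d-fleet",                   # placeholder fleet name
    ImageName="ArcGIS-Pro-Base-Image",            # placeholder image name
    InstanceType=INSTANCE_TYPE_BY_WORKLOAD["3d"],
    ComputeCapacity={"DesiredInstances": 2},
    FleetType="ON_DEMAND",
)
```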

 

Please find the full whitepaper here: https://d1.awsstatic.com/product-marketing/AppStream2.0/Amazon%20AppStream%202.0%20ESRI%20ArcGIS%20Pro%20Deployment%20Gu…   

What is System Log Parser?

System Log Parser is an ArcGIS for Server (10.1+) log query and analysis tool to help you quickly quantify the "GIS" in your deployment. When run, it connects to an ArcGIS for Server instance on port 6080/6443/443 as a publisher (or an administrator), retrieves the logs for a specified time duration, analyzes the information, and then produces a spreadsheet that summarizes the service statistics. The command-line version of System Log Parser (slp.exe) is used by ArcGIS Monitor for data capture.
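
Under the hood, this kind of log retrieval relies on the standard ArcGIS Server Administrator API. The hedged sketch below shows roughly the calls involved (generate a token, then query log records); the server URL and credentials are placeholders, and System Log Parser performs these steps, plus all of the analysis, for you.

```python
# Hedged sketch of the Administrator API calls behind log retrieval: generate a
# token, then query log records. The server URL and credentials are placeholders.
import requests

SERVER = "https://gisserver.example.com:6443/arcgis"  # placeholder site URL

# 1. Get a token using a publisher or administrator account
token = requests.post(
    f"{SERVER}/admin/generateToken",
    data={"username": "siteadmin", "password": "********",
          "client": "requestip", "f": "json"},
).json()["token"]

# 2. Query FINE-level log records (most recent records by default)
logs = requests.post(
    f"{SERVER}/admin/logs/query",
    data={"level": "FINE", "pageSize": 1000, "f": "json", "token": token},
).json()

for record in logs.get("logMessages", []):
    print(record["time"], record["source"], record["message"])
```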

System Log Parser supports the following service types:

  • Feature Services
  • Geoprocessing Services
  • Network Analyst Services
  • Geocode Services
  • KML Services
  • Stream Services
  • GeoData Services
  • Map Services
  • Workflow Manager Services
  • Geometry Services
  • Image Services
  • Globe Services
  • Mobile Services

 

System Log Parser (https://arcg.is/0XLnfb), a free-standing application or Add-on for ArcGIS Monitor, is an effective tool for diagnosing and reviewing infrastructure functionality.

 

Getting Started

 

In this section, we’ll configure ArcGIS Server to collect logs at the level needed for the tool and set up System Log Parser to generate a report (MS Excel).

1.   Ensure the following conditions are met on the machine you’ll be running System Log Parser from:

  1. 64-bit Operating System:
    1. Windows 7 (64 bit), Windows 8.x, Windows 10
    2. Windows Server 2008 64 bit, Windows Server 2012, Windows Server 2016
  2. RAM: 4 GB
  3. Microsoft .NET Framework 4.5 or 4.6
  4. Microsoft Excel 2010 or newer (or appropriate .xlsx viewer).

2.   Set your ArcGIS Server logs to Fine on EACH server you’d like to get metrics on. Complete instructions on how to change ArcGIS Server log levels can be found here: Specify Server Log Settings

Note:    I recommend running the logging at FINE for AT LEAST one week prior to running System Log Parser. This should give you a fairly clear picture of a typical week's load.
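
If you prefer to script this step, the log level can also be changed through the ArcGIS Server Administrator API instead of Server Manager. The sketch below is one possible approach; the server URL and credentials are placeholders, and the current settings are read first so that only the log level changes.

```python
# Hedged sketch: setting the site log level to FINE through the Administrator
# API. The URL and credentials are placeholders.
import requests

SERVER = "https://gisserver.example.com:6443/arcgis"  # placeholder site URL

token = requests.post(
    f"{SERVER}/admin/generateToken",
    data={"username": "siteadmin", "password": "********",
          "client": "requestip", "f": "json"},
).json()["token"]

# Read the current log settings, then write them back with logLevel=FINE
current = requests.post(
    f"{SERVER}/admin/logs/settings",
    data={"f": "json", "token": token},
).json()["settings"]

requests.post(
    f"{SERVER}/admin/logs/settings/edit",
    data={"logLevel": "FINE",
          "logDir": current["logDir"],
          "maxLogFileAge": current["maxLogFileAge"],
          "maxErrorReportsCount": current["maxErrorReportsCount"],
          "f": "json", "token": token},
)
```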

3.   Download System Log Parser here: https://arcg.is/0XLnfb

4.   Extract the .zip file.

Note:    This is BOTH the user interface and the Add-on for ArcGIS Monitor. We will focus on the user interface version for this exercise.

5.   Launch System Log Parser:

6.   Browse to the location you extracted System Log Parser

7.   In the System Log Parser for ArcGIS folder, locate and launch SystemLogsGUI.exe

System Log Parser GUI

Note:    You may be prompted that Windows has protected your PC. If you do get this prompt, please click More info and then click Run Anyway.


 

Configuring System Log Parser

 

The following outlines the configuration required to set up System Log Parser to analyze a week's worth of logs.

Note:    System Log Parser will automatically access logging for all clusters that are part of an ArcGIS Server Site. If you have multiple ArcGIS Server Sites configured, run the tool against each Site separately.

Click the ArcGIS Server (Web) button to display the following:

Fill out the above form as indicated below:

1.   Enter the Server URL.

  1. The typical syntax with ArcGIS Server 10.2 or higher is: https://<host_name>:<port_number>/arcgis  
  2. The typical syntax with ArcGIS Server 10.1 is: https://<host_name>:<port_number>/ArcGIS
Note:    If your URL structure is different, enter it.

2.   Enter the ArcGIS Server Manager user name with publisher or better permissions. 

3.   Enter the user's password.

4.   Check this box if you are accessing a Site federated with Portal for ArcGIS.

Note:   Consider using a web adapter address for the Server URL:  https://<webadaptor_name>/server
Note:   If accessing over the internet, this assumes that the web adapter was registered with administrative access to ArcGIS Server

5.   Check this box if you use IWA (Integrated Windows Authentication).

6.   If needed, specify a token (advanced option).

7.   Select an End Time (Now)

8.   Select Start Time (1 week)

9.   Select Analysis Type (Complete)

  1. Simple: Provides only the Service Summary page data. 

    Note: This mode will also generate a list of the underlying data sources by service and by layer within the service.

  2. WithOverviewCharts: Provides the Service Summary page plus charts of Request Count, Average Request Response Time, and Max Request Response Time.

  3. Complete: Provides Service Summary page plus all data and charts in separate tabs for all services.

  4. ErrorsOnly: Provides a report of just the errors.
  5. VerboseMode: Provides full verbose log analysis (Limited to 12 hours).

10.   Select Report Type (Spreadsheet)

11.   Specify where to output the report (Default is your My Documents location)

 

Click Analyze Logs.

This process can take a few minutes or longer, depending on the number of transactions logged.

Review the System Log Parser report

 

When System Log Parser finishes running, it will open the report in Excel if Excel is present. If you ran this from a machine without Microsoft Excel, move the report to a computer with Excel and open it there.

 

You will see a Summary tab plus several other tabs listed across the bottom of the spreadsheet. We'll cover each tab in further detail below.

 

Summary

When the Excel report opens, you will see the Summary tab. The below screen grab shows what server this was run against and some summary statistics.

 

Summary

 

Statistics

On the bottom of the Excel report select the Statistics tab to view a table of all services by layer and service type. This is where we'll spend most of our time. Please read the rest of this post, then click here.

 

Resources

On the bottom of the Excel report select the Resources tab to view several charts:

  • Top 20 Resources by Count
  • Top 20 Resources by Average Response Time
  • Top 20 Resources by Maximum Response Time

 

Methods

On the bottom of the Excel report select the Methods tab to view several charts:

  • Top 20 Methods by Count
  • Top 20 Methods by Average Response Time
  • Top 20 Methods by Maximum Response Time

 

Queue Time

On the bottom of the Excel report select the Queue Time tab to view any services that had to wait for an ArcSOC to return a result. In an ideal setting, the values below are what you want to see:

 

Queue Time Stats

 

Users

On the bottom of the Excel report select the Users tab to view a chart of the top 20 users by request count.

 

Time

On the bottom of the Excel report select the Time tab to view a chart of requests by day.

 

Throughput per Minute

On the bottom of the Excel report select the Throughput per Minute tab to view a minute-by-minute breakdown of requests.

Below is a sample of what information can be found on the tab:

 

Throughput Per Minute

 

Elapsed Time of All Resources

On the bottom of the Excel report, select the Elapsed Time of All Resources tab to view a chronological listing of all requests from the time period covered by the System Log Parser report.

 

I'd also like to thank Aaron Lopez for his help and continued development of this invaluable tool. 

 

Note: The contents presented above are recommendations that will typically improve performance for many scenarios. However, in some cases, these recommendations may not produce better performance results, in which case, additional performance testing and system configuration modifications may be needed.

 

I hope you find this helpful, do not hesitate to post your questions here: ArcGIS Architecture Series: Tools of an Architect

What is System Log Parser?

System Log Parser is an ArcGIS for Server (10.1+) log query and analysis tool to help you quickly quantify the "GIS" in your deployment. When run, it connects to an ArcGIS for Server instance on port 6080/6443/443 as a publisher (or an administrator), retrieves the logs for a specified time duration, analyzes the information, and then produces a spreadsheet that summarizes the service statistics. The command-line version of System Log Parser (slp.exe) is used by ArcGIS Monitor for data capture.

 

Note:   This post is the second in a series on System Log Parser. Please see ArcGIS Server Tuning and Optimization with System Log Parser to learn how to set up your server for System Log Parser and for an overview of the report.

Introduction to Statistics Used In System Log Parser

 

There are several statistical terms you should be familiar with when using System Log Parser (definitions from Wikipedia).

 

Percentile (P) - a measure used in statistics indicating the value below which a given percentage of observations in a group of observations falls. For example, the 20th percentile is the value (or score) below which 20% of the observations may be found. 

 

Average (avg) - a single number taken as representative of a list of numbers. Different concepts of average are used in different contexts. Often "average" refers to the arithmetic mean, the sum of the numbers divided by how many numbers are being averaged. In statistics, mean, median, and mode are all known as measures of central tendency, and in colloquial usage any of these might be called an average value.

 

Maximum (Max) -   [L]argest value of the function within a given range.

 

Minimum (Min) -   [S]mallest value of the function within a given range.

 

Standard Deviation (Stdev) -   [A] measure that is used to quantify the amount of variation or dispersion of a set of data values. A low standard deviation indicates that the data points tend to be close to the mean (also called the expected value) of the set, while a high standard deviation indicates that the data points are spread out over a wider range of values.
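
To make these definitions concrete, the short sketch below computes each statistic for a sample list of response times in seconds. The exact percentile convention System Log Parser uses internally may differ slightly; this is only for illustration.

```python
# Illustrative only: compute the statistics above for a sample list of response
# times (seconds). System Log Parser's exact percentile convention may differ.
import math
import statistics

response_times = [0.08, 0.11, 0.12, 0.15, 0.18, 0.22, 0.35, 0.41, 0.90, 2.40]

def percentile(values, p):
    """Nearest-rank percentile: the value below which roughly p% of observations fall."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

print("Avg  :", round(statistics.mean(response_times), 3))
print("Min  :", min(response_times))
print("Max  :", max(response_times))
print("P95  :", percentile(response_times, 95))
print("Stdev:", round(statistics.stdev(response_times), 3))
```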

 

Fields of the Statistics Collected

 

  • Resource - Requested resource or service (service REST endpoint)
  • Capability - The ArcGIS capability of the resource
  • Method - The function performed by the resource (what was accessed)
  • Count - The number of requests for this resource
  • Count Pct - Count percentage based on total service requests
  • Avg - The average time (in seconds) spent processing a request
  • Min - The time (in seconds) of the shortest request
  • P5, P25, P50, P75 - The percentile groupings of request time (in seconds)
  • P95 - 95% of all responses occur between 0 seconds and the value displayed in this column, per service
  • P99 - 99% of all responses occur between 0 seconds and the value displayed in this column, per service
  • Max - The time (in seconds) of the longest request
  • Stdev - The standard deviation of request time (in seconds)
  • Sum - The total time (in seconds) spent processing requests for this resource
  • Sum Pct - The percentage of total processing time, across all resources, attributed to this resource

 

We're going to focus on two key statistics: P95 and Max. As we learned above, P95 signifies the time within which the fastest 95% of all requests complete, and Max signifies the longest draw time per request for each service and method.

 

Identifying Opportunities to Tune Service Performance

 

In the example below, I've sorted by P95 and Max and flagged values over 1/2 second. User experience degrades the longer your draw time takes.

 

I've highlighted any Max draw time over 1/2 second in red and any P95 draw time over 1/2 second in yellow. These are the services and layers I'd focus on cleaning up, starting with getting the P95 value below 1/2 second.

In the next section you'll find starting points to tune and optimize your services.

 

Another column worth reviewing is Sum Pct. This column factors in the number of requests for each service and its average time, then weighs that against all the other services.

 

Sum Pct

 

For example:   

  1. One service may have thousands more requests than all the others but respond quickly (its Sum Pct should be low).
  2. Another service may have only a small handful of requests but very slow times (its Sum Pct should be high). In this case, that service would be a good candidate for tuning.
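
If you prefer to script the triage described above rather than filter in Excel, the hedged sketch below reads the Statistics tab of a generated report and flags the same P95/Max thresholds, sorted by Sum Pct. The file, sheet, and column names follow the report described in this series but may vary between System Log Parser versions.

```python
# Hedged sketch: reproduce the P95 / Max / Sum Pct triage against the Statistics
# tab of a generated report. File, sheet, and column names are assumptions based
# on the report described in this series.
import pandas as pd

stats = pd.read_excel("SystemLogParser_report.xlsx", sheet_name="Statistics")

# Flag anything whose P95 or Max draw time exceeds half a second
slow = stats[(stats["P95"] > 0.5) | (stats["Max"] > 0.5)]

# Sorting by Sum Pct surfaces the services consuming the most total server time
print(slow.sort_values("Sum Pct", ascending=False)
          [["Resource", "Method", "Count", "P95", "Max", "Sum Pct"]])
```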

 

Best Practices for Services

 

Below are some links to get you started on service tuning and SOC management.

         

In addition to the above, data source performance should be reviewed if adjustments to the service do not help enough. You can look at:

 

I hope you find this helpful, do not hesitate to post your questions here: https://community.esri.com/thread/231451-arcgis-architecture-series-tools-of-an-architect

 

Note: The contents presented above are recommendations that will typically improve performance for many scenarios. However, in some cases, these recommendations may not produce better performance results, in which case, additional performance testing and system configuration modifications may be needed.

If you're headed to the Esri Federal GIS Conference next week and are a current or future leader, please consider attending this session that Gerry Clancy and I will be presenting on Wed. Jan. 30 from 5:15-6:15 PM in Room 209C:

 

GIS for Leaders: Seven Elements of a Successful Enterprise GIS

 

It takes more than technology for an enterprise GIS to be successful. It requires business and IT management skills. This session will review the seven elements of a successful enterprise GIS and provide strategies for how GIS Managers can implement them. The seven elements are:

  • Vision and Leadership
  • Understand how GIS can contribute to your organization’s success
  • Develop and maintain a GIS Strategic Plan
  • Implement effective governance
  • Implement evolutionary approaches (change management)
  • Deploy engaging apps
  • Recruit, develop and maintain good staff

 

https://fedgisdevdc2019.schedule.esri.com/schedule/1801978386