Skip navigation
All Places > Implementing ArcGIS > Blog > 2019 > May

Note: This blog post is the third in the series of three planned posts about egdbhealth.  The first described introduced the tool, how to install it, and how to run it.  The second in the series addressed how to use the tool to evaluate the health of an Enterprise Geodatabase; its primary purpose.  This article addresses using egdbhealth in a system design context.


Egdbhealth is a tool for reporting on various characteristics of Enterprise Geodatabases (eGDBes).  The primary purpose of the tool is to evaluate the “health” of eGDBes.  However, the output can also be used in a system design context.  This article addresses the system design use case. 


For information about installing and running the tool, please refer to the first blog post in this series "What is Egdbhealth?"


Information Objectives for System Design

The Esri system design practice focuses on planning the hardware, software, and network characteristics for the future state of systems based on new or changing requirements. 


The current health of an existing system will not necessarily have a strong relationship to a future system that has different requirements.  However, depending on the design objectives, information about the current system can be relevant. 


For example, in the case of a planned migration from an on-premises system to a cloud platform, it would be quite useful to describe the current system such that it can be faithfully rendered on a cloud platform.  Or, requirements driving a design may indicate a need to exchange (replicate, move, synchronize, transform, copy, etc.) large portions of a Geodatabase to another repository.  In that case, an inventory of the Geodatabase content can be useful to establish important details about the nature and quantity of data that would need to be exchanged and the optimal ways for that data exchange to occur. 


Thus, at a high level, it is the primarily the descriptive information from egdbhealth, rather than the evaluative information, which is pertinent to system design.


This article discusses examples for a SQL Server backed eGDB.  However, the principles are the same for eGDBes that are backed by Oracle and PostgreSQL.


Machine Resources and Utilization

For system design cases where the eGDB machine platform will change, it can be useful to understand the current machine resources that support the eGDB and their degree of utilization.  For example, if you are migrating an eGDB to a cloud platform, the number of processor cores that the system has on premises has some relevance to the number you might deploy on the cloud.


The Machine

The HTML metadata file produced by egdbhealth will provide resource information about the machine platform on which the RDBMS runs. 


In the example below, the SQL Server instance runs on a machine with one processor which presents 8 logical processor cores. 


Machine resources in HTML metadata


This information from SQL Server leaves some information unclear.  For example, SQL Server is not able to report on whether the logical cores are a result of hyperthreading.  If they are, then the physical cores would be half of the logical cores (4 physical cores).  And, it is the number of physical cores that is most useful for capacity planning in system design.  So, this information is good but imperfect.


The HTML also reports the physical memory on the machine (16GiB) and the minimum and maximum memory configured for the SQL Server instance (0 and 3GB).  Here we see that the configuration of the SQL Server instance is quite relevant to understanding the relevant memory resources for capacity planning.  Although the machine has 16GiB of memory, this SQL Server instance (and the eGDB it supports) does not have access to all of that memory. 


The Utilization

The characteristics of the machine, and the configuration of the instance, offers incomplete insight into the degree to which the machine resources are utilized and what resources are truly needed for the current workload. 

We can be certain that SQL Server does not use more than 3GB of memory.  But, does it use it thoroughly?  And, what about the processor resources; are the busy or idle?


Much of the utilization information appears in the Expert Excel file because it is relatively volatile.



In the case of SQL Server, there is an evaluation in the Expert Excel file called “ProcessorUtil” (Processor Utilization; the Category is “Instance”).  It provides a sample of the recent per-minute processor utilization by SQL Server on the machine.  In this case, we can see that, for the sampling period, SQL Server uses almost none of the total processor resources of the machine on which it runs.


Processor utilization


From this, we infer that, if this period is typical, the existing machine has many more processor resources than it needs.  If all else is equal, the future system design can specify fewer processor cores to support the eGDB workload.  Naturally, for this inference to have any integrity, you must run the tool during typical and/or peak workload periods.


There is another evaluation, CpuByDb (Instance category), which reports which databases in the instance consume the most processor resources.  So, in the case that the instance did have significant processor utilization, this information could be used to determine whether that processor utilization related to the eGDB of interest or some other workload in the same SQL Server instance.


Processor utilization by database



In the case of SQL Server, there are several evaluations which provide insight into the memory utilization inside the instance.  Most of these are in the Instance category of the Expert Excel file.


MemoryByDb allows you to see the amount of memory (in this case, rows of data as opposed to SQL statements or other memory consumption) used by each database.


Memory (database pages) utilization by database


In this case, we see that the “GIS” eGDB is the primary memory consumer in the SQL Server instance.  So, it appears reasonable to understand that the eGDB can make use of the better part of the 3GB of instance memory.  But, is that too much or not enough for good performance and system health?


The MemoryMetrics evaluation reports on several memory metrics that can offer hints about how much in-demand the instance memory is.  In this case, we see that there are warnings that suggest that the existing memory available the instance may be too low for the workload.  Thus, we do not know if 16GiB of machine memory is too much or too little.  But, we do have reason to believe that 3GB of instance memory is too little.


Memory metrics


There are several other memory evaluations in the Expert Excel file.  It takes experience and judgement to integrate these observations into an informed judgement about the true memory needs of the system.   In this case, as in many cases, the inferences that one can make are incomplete.  But, they are better than nothing.



SQL Server does not provide complete information about the local storage on the machine on which it runs.  However, it does provide information about the size of the files in which the eGDB is stored.  Much of this information is available in the “Database” category of the Expert Excel file.


The FileSpace evaluation sheet reports on the files in which the eGDB is stored.  In this example, we see that there is a single data file (“ROWS”) which is about 6GB in size and a single transaction log file (“LOG”) which has used about 650MB in the past.  You can also see the maximum file limits established by the DBA, presumably being an indication of the maximum expected storage for the eGDB.


Database file sizes


These data are good indications of the amount of storage that would be needed for a “lift and shift” scenario where the entire eGDB will need to move to a different platform.


Data Inventory and Data Exchange

Another system design case is eGDB data needs to be exchanged with a different repository, perhaps another Geodatabase in a different data center.  In these cases, it may be relevant to establish the storage size and other characteristics of the data that will be exchanged, as opposed to the storage size of the entire eGDB.


Record Counts and Storage By ObjectClass

If you know the identity of the ObjectClasses to be exchanged there is information in the Content Excel file to report on the storage size and record counts of those item. 


In the Storage category, there is a sheet called “ObjClass” (ObjectClasses).  It reports, in considerable detail, the tables from which various eGDB ObjectClasses are composed, the number of rows that they contain, and the amount of storage that is allocated to each of them. 


In the example below, there are three ObjectClasses of interest (filtered in the first column): BUILDINGS_NV, COUNTRIES, and SSFITTING.  BUILDINGS_NV has about 4,400 records and consumes just over 1.25MB of space.  COUNTRIES is versioned, with 249 records in the base table, 66 adds, and 5 deletes (the TableName column calls out the “A” and “D” table names for versioned ObjectClasses).


ObjectClass records and storage


Thus, we can establish the record counts and approximate storage characteristics of ObjectClasses of interest in the eGDB.  At the same time, we can also see which ObjectClasses are registered as versioned.  This is critical information for establishing the range of options available for data exchange.


Versioning, Archiving, and Replication Status

In the same Context Excel file, the ObjClassOverview (Inventory category) provides an overview of the versioning, archiving, and replication statuses of all of the ObjectClasses in the eGDB.  This provides a more convenient mechanism for planning the range of data interchange mechanisms that would be appropriate for different ObjectClasses.


ObjectClass overview


Distributed Data Design Consideration

The ArcGIS documentation refers to this thematic area as “distributed data” and offers general guidance about the various strategies that can be used to move data around:


Versioned Data and Geodatabase Replication

If an ObjectClass is not already registered as versioned it may or may not be appropriate to do so for the purposes of making use of Geodatabase Replication.  This relates to how the data is updated.  Most data that is registered as versioned is updated “transactionally”, meaning that only a small percentage of the total records are changed at any given time.  This kind of updating is compatible both with versioning and Geodatabase Replication.  Other data is updated with methods like “truncate and add”, “delete and replace”, or “copy and paste”.  In these cases, most of the records are updated, physically (even if the majority of record values happen to be the same, the physical records have been replaced).  That kind of updating is not particularly well-suited to Geodatabase Replication.  The information in egdbhealth can tell you about the versioning and replication status of ObjectClasses.  But, it does not know how those ObjectClasses are actually updated.  Hopefully, those that are registered as versioned are transactionally updated and therefore suitable candidates for Geodatabase Replication.


Archiving Data and Data Exchange

If an ObjectClass has archiving enabled, there may be additional considerations.  For example, if the planned data exchange might increase the rate of updates, this could cause the quantity of archive information to grow in ways that are undesirable. 


For versioned ObjectClasses that are archive-enabled, the archive information is stored in a separate “history” table.  The size of that table does not impact the query performance of the versioned data.  In this case, the main design consideration is whether it is practical to persist and use all of the archiving information.  In many cases, it will be.  But, making the determination requires knowing the archiving status of the ObjectClass and the planned rate of updates.


In the case of non-versioned ObjectClasses that are archive-enabled (and branch-versioned ObjectClasses), the archive information is in the main table.  Thus, rapid growth of archive information can change the performance tuning needs of ObjectClasses and/or eGDB.  Egdbhealth cannot know the future update needs for the data.  However, it can report on the historical patterns of updates to non-versioned, archive-enabled ObjectClasses.  The Context Excel file contains a “NvArchEditDatesSql” sheet that contains SQL statements for each non-versioned, archive-enabled ObjectClass in the eGDB.  These statements are not run automatically because they are resource-intensive queries that do not provide information which is relevant to all circumstance. 


Additional SQL statements you can run


In this system design case, however, establishing the rate at which record have accumulated over time may be quite pertinent to planning a data exchange strategy.  Use a SQL client such as SQL Server Management Studio to run the queries for the ObjectClasses in question to get a sense of the rate of data update.  In this example, we can see a pattern for record updates where the typical update is small (1 record), but there was at least one occasion where there was a large update relative to the size of the ObjectClass (289 records on 8 April 2018):


Executed SQL statement


Attachment Size and Mobile Synchronization

Another design case which relates to distributed data is field-based data collection and attachments.  Collector for ArcGIS makes use of efficient mechanisms that are similar to Geodatabase Replication to synchronize “deltas” between client and server.  However, the ability to collect large quantities of data in attachments (such as images) and the often limited network conditions in which the data can be synchronized can raise system design issues.


The Expert Excel file has an evaluation “LargeAttachments” that will automatically bring cases of likely concern to your attention.  In the example below, the threshold values were not violated, so there was no negative evaluation:


There may be no records when there is no negative evaluation


However, it may still be useful to know something about the nature of the existing attachments.  The Context Excel file contains a sheet “AttachmentSizeSql” (“Inventory” category) that contains several SQL statements for each ObjectClass that has attachments. 


Continue to explore with additional SQL statements


Running these SQL statements in a client such as SQL Server Management Studio allows you to statistically characterize the attachments in various ways: aggregate statistics, top 100 by size, and percentile distribution. 


Additional SQL statements executed


With these statistics, you can understand not only the total amount of information, but whether it is due to a large number of small attachments or a small number of large attachments.  Large attachments might be a greater risk for synchronizing over poor network conditions and might lead you to make a recommendation that data collection operate in a disconnected mode and synchronization occur under specific network conditions only.


Attachment size considerations might also lead you to offer recommendations about whether and how it would be practical to handle two-way synchronization (as opposed to one-way).  While field-based users might have sufficient network circumstances to upload the attachments that they collect individually, two-way synchronization would mean that all of the attachments collected by other individuals (in the area of synchronization) would have to be downloaded.  In some cases, this could be many times the amount of information that would need to be uploaded.



As a designer of GIS systems, it is not practical for you to have profound expertise in all of the technology areas that pertain to the system.  For this reason, your understanding of egdbhealth outputs and what you can do with them has limits.  The purpose of this article is to identify some of the most common use cases for egdbhealth outputs for system design.  The main areas of interest are (1) machine sizing / resource utilization and (2) data characteristics relevant to data interchange.  Following the examples in this article, and generalizing from them for the Oracle and PostgreSQL databases, will allow you to extract useful information from the egdbhealth tool for your design work.


I hope you find this helpful, do not hesitate to post your questions here:


Note: The contents presented above are recommendations that will typically improve performance for many scenarios. However, in some cases, these recommendations may not produce better performance results, in which case, additional performance testing and system configuration modifications may be needed.

Note: This blog post is the second in a series of three planned posts about egdbhealth.  The first in the series described what the tool is, how to install it, and how to execute it.  The third in the series will address using egdbhealth in a system design context.


Egdbhealth is a tool for reporting on various characteristics of Enterprise Geodatabases (eGDBes).  This article discusses how to use the outputs of egdbhealth to evaluate the health of an eGDB.  All of the examples use a SQL Server-backed eGDB.  However, similar principles apply to using the tool with Oracle- and PostgreSQL-backed eGDBes.


For information about installing and running the tool (i.e. creating the outputs), please refer to the first blog post in this series, "What is Egdbhealth?"


Viewing and Understanding Findings

The Expert Excel file contains an “OVERVIEW_EXPERT” sheet that allows you to see the evaluations at a high-level and prioritize your review of the findings.


Expert Excel file Overview sheet


This article will not describe all of the evaluations and their various meanings.  There are too many for that to be practical.  Instead, the article describes the process and provides specific examples to illustrate the kinds of benefits that can be gained.



The red-filled cells “Criticals” column should be viewed first.  These findings are highlighted as top concerns in their respective Categories. 


For example, in the screen capture above, “Backups” is flagged as a critical concern.  Click on the hyperlinked cell to view the sheet with the detailed information.


Critical: no backups exist


In this case, the worksheet has a single record that says that the database has never been backed-up.    This is a critical concern because if the system fails, data can be lost.  There is also a hyperlink back to the “OVERVIEW_EXPERT” worksheet.  This “Back to OVERVIEW” link appears in every detail worksheet to ease navigation.


In the example below, “Memory Pressure”, the detail worksheet displays memory pressure warnings reported by SQL Server.  When the RDBMS reports that there is memory pressure, it is an indication that there is, or soon will be, performance and/or scalability problems.


The Comments column (always found on the far right) describes the issue and the recommended course of action at a high level.  Note that the amount of information reported is much greater than the “BackUp” example (more columns) and that the information is of a highly technical nature, requiring specialized knowledge to understand. 


The Comments column is egdbhealth’s best effort to make the detail digestible and actionable with incomplete knowledge of the domain.  In some cases, the Comments column will provide links to Internet resources that offer more information to support a deeper understanding.


Here is another example that identifies tables that have geometry columns that do not have spatial indices:


Critical: missing spatial indices


The absence of spatial indices on geometry columns will degrade the performance of spatial queries.  In this case, the “Comment” column recommends that spatial indices be created (or rebuilt) to heal the problem.


In this next example, the problem is that the release of the eGDB is quite old, indicating that it should be upgraded.  Note that the “Comments” column provides a link to more information (online):


Critical: egdb release support will expire soon



“Warnings” follow the same pattern as “Criticals”.  However, as the name implies, they are a lower priority for review.  Note that a given evaluation may have both critical and warning findings.


In the example below, egdbhealth is reporting that there are stale or missing statistics on a variety of objects in the eGDB:


Warning: stale and missing RDBMS statistics


Depending on the details of the specific statistics, the finding is flagged as “Warning” or “Critical” in the Comment column (always at the far right). 


Here, in cases where no statistics information is available, the record is treated as a “Warning” because of the uncertainty.  Statistics that have information indicating that they have not been updated recently, or there have been a lot of changes since the last update, are flagged as “Critical”. 


The RDBMS’ cost-based optimizer uses these statistics to determine the best query execution plans.  Thus, if the statistics are not current with respect to the state of the data, the optimizer may not make good choices and the performance of the system will be sub-optimal.  


In the example below, most of the records are “Informationals”, simply reporting facts about the system.  But, there are a few rows that have “Warnings”. 


Warning: non-RDBMS processor utilization


The Warnings are noting that, for a short period of time, the machine running SQL Server had more than 10% of its processor capacity used by a process other than SQL Server itself.  This is not a condition that causes a performance or scalability problem.  However, as most RDBMS systems are intended to run on dedicated machines, this may be an indication that there are other processes that do not belong or need special attention in the administration of the system. 



“Informationals” follow the same pattern as the other types of findings.  However, the information is not evaluative in nature.  As it is essentially descriptive, it could be placed in the Context Excel file.  There are a few reasons why it is in the Expert file instead:


  1. The findings may not always be Informational … depending on the conditions encountered.
  2. The information is relatively volatile (i.e. changes over time). The Context Excel file is designed to provide information that is relatively static in nature.


The example below illustrates this first case:


Informational: egdb license expiration


The licensing of this eGDB will not be a concern for many months.  But, in about six months, if the license has not been updated, this message will no longer be informational.


Similarly, the finding below about the underlying database file sizes could change at any time:


Informational: database file sizes


Thus, these descriptive pieces of information are reported in the Expert Excel, even though they are not currently reporting an evaluative finding that is negative.


Taking Action to Improve Health

Just as it is impractical to describe all of the individual evaluations in this document, it is impractical to provide action instructions for each one.  Instead, this article discusses the process of understanding and acting on the evaluative information, along with specific examples.


The process involves the following steps:


  1. Understand the evaluation
  2. Validate the evaluation
  3. Try to resolve the evaluation
  4. Validate the resolution


Understand the Evaluation

Some evaluations are easier to understand than others.  In those fortunate cases where the “comments” column adequately communicates the concern, this step happens automatically.  In othere cases, some research may be appropriate.


For example, the findings below report that Checkdb has never been run on the databases in this SQL Server instance (it flags the eGDB as critical, whereas the other databases are warnings):


Checkdb warnings


If you are not already familiar with Checkdb, an Internet search for “SQL Server Checkdb” will return results to help you understand.  In many cases, a modest research effort such as this will be all that is necessary to understand an evaluation which is in a topic that is unfamiliar to you.


In this case, an Internet search would likely surface the following links, offering more information and suggested actions:,, and  In short, Checkdb runs a variety of internal checks on the database to identify possible corruption and other issues.  So, it is good to run it once in a while to avoid such problems.


Validate the Evaluations

It is useful to validate evaluations before taking action because, for a variety of reasons, the information returned may have imperfections or require some judgement. 


For example, in the “Instance” category below, there are 2 “Critical” Memory Pressure Warnings evaluations, but the Memory Pressure evaluation is only reporting “Informationals”, not “Warnings” or “Criticals”.


Various memory pressure indicators

In this case, the situation is explained by the fact that there many different indicators of memory pressure.  At any given time, and over time, they do not necessarily all point to the same conclusion.  Thus, you must weigh the related information before concluding that action is warranted (and what action is warranted).


In other cases, the evaluations may benefit from your judgement about the detailed information provided in the findings sheet.  For example, this detail about “Long Elapsed Time Queries” has surfaced that there are some queries that spend very long time in SQL Server.


Queries with long elapsed times


In the first row, there is a query which has an average duration of 72 seconds (third column).  However, it has only be executed 6 times in the period for which these statistics support. 


Egdbhealth does not know the period of the statistics (perhaps they were just flushed a few moments ago).  And, egdbhealth does not know if 6 executions is a lot or a little.  Here, it is more than other queries, but it is not many in absolute terms.  Finally, egdbhealth does not really know what “slow” is for this particular query.  Perhaps this supports a “batch” process that is expected to take a long time.  To make this determination, you would scroll over to the right (not in this screen capture) to view the SQL statement to see what the query is doing.  Then, you can make an informed judgement, based on how your system is used, and the reasonable expectations that users have for its performance, about whether or not these queries with “long elapsed times” are ones that should be actionable for you.


Try to Resolve the Evaluation

Your understanding of the evaluation will guide your efforts to address the problem.  In some cases, such as the one below, egdbhealth will point to Internet-based resources that will help you plan and carry-out the actions.


Some comments provide hyperlinks to additional information


In this case, egdbhealth recognized that the SQL Server instance is running on virtual hardware.  In the case of VMWare (and perhaps other platforms), best practice advice suggests that the minimum server memory and maximum should be set to the same value.  Once you understand it, this change is relatively straight-forward to make and may require only a brief consultation with the virtual machine platform team to confirm that it corresponds with best practices in their minds also.


In other cases, egdbhealth’s guidance will be more oblique and you will need to rely upon specialists within your organization, Esri Technical Support, or your own Internet research to come up with an action plan.  


Sometimes actions will involve changes that will take a considerable amount of organizational and/or system change.  In the example below, egdbhealth is suggesting that the performance of the versioning system could be improved by having less hierarchy in the version tree.  Changing the way versioning is used by an organization is a major undertaking that requires planning and time.  In this case, you can expect to spend time planning changes, socializing them within your organization, and then carrying it out.


Version tree hierarchy refactoring advice


Validate the Resolution

Running egdbhealth again, after your initial efforts to resolve the evaluation(s) will effectively validate whether or not your efforts succeeded.  Note that, when you run egdbhealth again on the same eGDB, the prior Expert Excel file is placed in the “archive” subdirectory for your reference.  (The Content Excel file is not re-created, because its information is less volatile.)


Naturally, you hope to find all of the “Criticals” or “Warnings” that you addressed have disappeared in the new Expert Excel output.  And, this can be expected where you have correctly understood the problem and taken effective action.


For example, a finding such as the one below (that the most recent compress failed) will be resolved in the “OVERVIEW_EXPERT” sheet as soon as you address the problem.  In this case, as soon as you successfully compress and re-run egdbhealth, this evaluation will be resolved.


Failure of recent compress


In a few cases, however, the “Critical” or “Warning” classifications will not fully resolve themselves even though the current condition is no longer the same.   For example, the “Compress During Business Hours” evaluation reports on the recent history of compresses, not just the most recent compress.  You can expect the evaluations to remain unchanged in the “OVERVIEW_EXPERT” sheet for some time. 


History of compresses during business hours


The detail sheet and other sheets in the Versioning category will illustrate that your recent compress did not occur during business hours (if that is the case).  Thus, you have resolved the evaluation.  And, over time, egdbhealth will allow itself to agree.


Finally, you will find that some evaluations are volatile.  In repeated runs of egdbhealth, they will seem to be present or absent without relationship to your specific actions.  For example, the evaluation below reports on the percentage of base table records that are in the delta tables (“A” and “D” tables).  Where those percentages are high, it offers a negative evaluation.


Base and delta table record counts


The action you may have taken in response is to compress the eGDB.  The effectiveness of that action, however, would depend upon the reconciling and posting that is occurring on the system.  So, if there had been no new reconcile and post activity, the compress would not have changed the evaluation.  On the other hand, if there had been reconcile and post activity, or if a very stale version had been deleted, the compress may have resolved many of the findings.  It is also true, however, that even with the ideal reconciles, posts, and compresses, editors might be generating more updates which are populating the delta tables at the same time as you are de-populating them.


The “Memory Metrics” example discussed earlier in this article are another case where you can expect volatility in evaluations.  This is because memory pressure indicators will be triggered by different conditions in the database.  Your informed judgment will be required to determine whether the recurring evaluations indicate a problem that needs further action.


The point is that the goal of taking action is not necessarily to achieve a “clean report card” with no negative evaluations.  The goal should be to have only the evaluations that are appropriate to your system.  In the process, you will have deepened your understanding of your eGDB system and offered many tangible improvements to the users of that eGDB.



The primary purpose of egdbhealth is to help administrators understand and improve the health characteristics of eGDBes.  Focusing on the Expert Excel file output, and prioritizing your analysis based on the Critical/Warning/Informational classification scheme, you can address the aspects of an eGDB which are most in need of investigation.  Some of the evaluations offered by egdbhealth may require various kinds of research to understand and determine a course of action.  Colleagues, Esri Technical Support, and Internet resources can be used to build your knowledge.  When you do take action to improve the health of your eGDB, be sure to run egdbhealth again to validate and document your progress.


I hope you find this helpful, do not hesitate to post your questions here:


Note: The contents presented above are recommendations that will typically improve performance for many scenarios. However, in some cases, these recommendations may not produce better performance results, in which case, additional performance testing and system configuration modifications may be needed.

Filter Blog

By date: By tag: