Note: This blog post is the first in a series of three planned posts about egdbhealth. The second in the series will address how to use the tool to evaluate the health of an Enterprise Geodatabase. The third in the series will address using egdbhealth in a system design context.
Egdbhealth is a tool for reporting on various characteristics of Enterprise Geodatabases (eGDBes). It provides descriptive information about the content of the eGDB and it provides evaluative information about the eGDB. The evaluative information is the primary purpose of the tool; to surface existing or latent problems/challenges with the eGDB. The tool works with eGDBes backed by Oracle, PostgreSQL, and SQL Server.
Installation and Execution
Although it is not an extension to ArcGIS Monitor, egdbhealth is available for download from the ArcGIS Monitor Gallery: https://www.arcgis.com/home/item.html?id=ea4bbf9b46084dc49efae9889832aa22.
Installation and Pre-Requisites
To install, simply download the .zip archive and extract the files to a directory of your choosing.
There are some pre-requisites for running egdbhealth. Those considerations are discussed in the documentation in the .zip archive (egdbhealth_README.docx). But, at a high level, the pre-requisites are:
- You need a database client for the type of database (Oracle, PostgreSQL, or SQL Server) to which you are connecting (http://desktop.arcgis.com/en/arcmap/latest/manage-data/databases/database-clients.htm).
- If you are using “Operating System Authentication” with SQL Server, and your eGDB is owned by “dbo” (instead of “sde”), you need to run egdbhealth as a Windows user that is dbo. In other words, you must connect as the “Geodatabase Administrator” (http://desktop.arcgis.com/en/arcmap/latest/manage-data/gdbs-in-sql-server/geodatabase-administrator-sqlserver.htm). You must connect as the Geodatabase Administrator with Oracle or PostgreSQL, but that usually means connecting as the user called “sde”.
- If you are connecting to Oracle or PostgreSQL, you must have enabled the ST_Geometry type library (http://desktop.arcgis.com/en/arcmap/latest/manage-data/databases/add-the-st-geometry-type-to-an-oracle-database.htm or http://desktop.arcgis.com/en/arcmap/latest/manage-data/databases/add-the-st-geometry-type-to-a-postgresql-database.htm).
Double-clicking egdbhealth.exe will launch a command window and a Windows form.
Follow the prompts on the form to fill out the connection information for your database. Tooltips provide hints about the nature of the information required. But, in principle, the information is the same as what you would provide to ArcGIS to connect to the database as the “Geodatabase Administrator.
The “Test” button will attempt to confirm that you can connect, with the required privileges, to the eGDB. If the test is successful, the “Test” button will become a “Run” button. Clicking that will close the form and begin executing the tool in the command window.
When the tool completes, it will open the “output” subdirectory which will contain five output files:
- An HTML file containing metadata about the queried eGDB and RDBMS target and how the connection is made.
- A “Context” Excel file that contains descriptive information about the eGDB and RDBMS target
- An “Expert” Excel file that contains evaluative information, classified as “Critical”, “Warning”, and “Information”.
- A png file that depicts the version tree.
- A png file that depicts the state tree.
Introducing the Output
An example of the output files is shown below:
The files will bear the “GDB Friendly Name” that you specified in the form (in this example, “SQL_GIS”).
The Expert Excel File
The Expert Excel file is the main information artifact. It provides the evaluative information about the target eGDB system. For example, if the eGDB has not been compressed recently, there will be an evaluation that reports this as a concern.
The file has a summary sheet which provides an overview of the evaluations. The evaluations are categorized by general topic (first column, “Category”) and classified (Critical, Warning, and Informational columns) such that you may prioritize your review of the information.
The Description column briefly explains the purpose of the evaluation. Where there is a red marker in the Description cell, there is a hover tip that provides yet more information about what is being evaluated.
Some findings have no records to report. For example, if there is no problem with a given kind of Geodatabase Consistency, there will be no records. However, many findings will have some number of records of various classifications. In those cases, the Name (second column) will have a hyperlink to the sheet in the workbook that has the detailed finding records.
The Context Excel File
The Context Excel file is similar in structure but lacks the expert (Critical-Warning-Informational) classification columns. The information is more descriptive and less evaluative. For example, in this file, you can find a listing of all of Geodatabase Domains and the ObjectClasses to which they are related. That information is neither good nor bad (i.e. not evaluative). But, it may be useful to know.
Records that are highlighted in green have SQL statements that you can run for additional information, as appropriate. Usually, these SQL statements are too expensive to run on all of the content in the eGDB. But, someone familiar with the eGDB and the issues it has may have ideas about which queries would be useful to run nonetheless. For example, the “SqlGeomTypeSizeSql” sheet has a SQL query for each FeatureClass in the eGDB. If you run one of these queries it will report the sizes of the geometries in one FeatureClass. This is an expensive enough operation that it would not be appropriate to run it on all of the FeatureClasses by default. But, if there is a FeatureClass that has a performance problem, it may be useful for you to run the query for that FeatureClass to examine the sizes of its geometries.
The PNG files (a version tree diagram and a state tree diagram) are typically of interest if your eGDB has data that has been registered as versioned.
The version tree graph illustrates the version tree hierarchy. Color coding (red-yellow-green) indicates the relative degree of “staleness”, or how long it has been since the version has been edited or reconciled.
The state tree schematic illustrates the depth and structure of the state tree (which is the detailed structure upon which the version tree relies). The Default version is shown in red and “State Zero” is shown in green. The further these nodes are separated, the more expensive it is for the database to return information about the Default version (the most commonly used version in most systems).
The HTML provides some general information about the eGDB, RDBMS, and machine that is the target of the evaluation. It may only be of passing interest in many cases.
This article as described the purpose of egdbhealth, how to run it, and what its outputs are. As the outputs of the tool contain quite a bit of technical information, other articles will address how to use the outputs for (a) understanding and improving eGDB health (b) designing GIS systems.
I hope you find this helpful, do not hesitate to post your questions here: https://community.esri.com/thread/231451-arcgis-architecture-series-tools-of-an-architect
Note: The contents presented above are recommendations that will typically improve performance for many scenarios. However, in some cases, these recommendations may not produce better performance results, in which case, additional performance testing and system configuration modifications may be needed.