Select to view content in your preferred language

What is Egdbhealth?

14685
17
04-26-2019 01:00 AM
DannyKrouk
Esri Contributor
9 17 14.7K

 

Note: This blog post is the first in a series of three planned posts about egdbhealth.  The second in the series will address how to use the tool to evaluate the health of an Enterprise Geodatabase.  The third in the series will address using egdbhealth in a system design context.

Introduction

Egdbhealth is a tool for reporting on various characteristics of Enterprise Geodatabases (eGDBes).  It provides descriptive information about the content of the eGDB and it provides evaluative information about the eGDB.  The evaluative information is the primary purpose of the tool; to surface existing or latent problems/challenges with the eGDB.  The tool works with eGDBes backed by Oracle, PostgreSQL, and SQL Server.

 

Installation and Execution

Although it is not an extension to ArcGIS Monitor, egdbhealth is available for download from the ArcGIS Monitor Gallery: https://arcgismonitor.maps.arcgis.com/home/item.html?id=f343f6b2bbcf434386f6dde1e468e7ab

 

Installation and Pre-Requisites

To install, simply download the .zip archive and extract the files to a directory of your choosing. 

There are some pre-requisites for running egdbhealth.  Those considerations are discussed in the documentation in the .zip archive (egdbhealth_README.docx).  But, at a high level, the pre-requisites are:

 

  1. You need a database client for the type of database (Oracle, PostgreSQL, or SQL Server) to which you are connecting (http://desktop.arcgis.com/en/arcmap/latest/manage-data/databases/database-clients.htm).
  2. If you are using “Operating System Authentication” with SQL Server, and your eGDB is owned by “dbo” (instead of “sde”), you need to run egdbhealth as a Windows user that is dbo.  In other words, you must connect as the “Geodatabase Administrator” (http://desktop.arcgis.com/en/arcmap/latest/manage-data/gdbs-in-sql-server/geodatabase-administrator-...).  You must connect as the Geodatabase Administrator with Oracle or PostgreSQL, but that usually means connecting as the user called “sde”.
  3. If you are connecting to Oracle or PostgreSQL, you must have enabled the ST_Geometry type library (http://desktop.arcgis.com/en/arcmap/latest/manage-data/databases/add-the-st-geometry-type-to-an-orac... or http://desktop.arcgis.com/en/arcmap/latest/manage-data/databases/add-the-st-geometry-type-to-a-postg...).

 

Execution

Double-clicking egdbhealth.exe will launch a command window and a Windows form. 

 

Follow the prompts on the form to fill out the connection information for your database.  Tooltips provide hints about the nature of the information required.  But, in principle, the information is the same as what you would provide to ArcGIS to connect to the database as the “Geodatabase Administrator.

 

Graphical User Interface

 

The “Test” button will attempt to confirm that you can connect, with the required privileges, to the eGDB.  If the test is successful, the “Test” button will become a “Run” button.  Clicking that will close the form and begin executing the tool in the command window.

 

When the tool completes, it will open the “output” subdirectory which will contain five output files:

 

  1. An HTML file containing metadata about the queried eGDB and RDBMS target and how the connection is made.
  2. A “Context” Excel file that contains descriptive information about the eGDB and RDBMS target
  3. An “Expert” Excel file that contains evaluative information, classified as “Critical”, “Warning”, and “Information”.
  4. A png file that depicts the version tree.
  5. A png file that depicts the state tree.

 

Introducing the Output

An example of the output files is shown below:

 

Output files

 

The files will bear the “GDB Friendly Name” that you specified in the form (in this example, “SQL_GIS”). 

 

The Expert Excel File

The Expert Excel file is the main information artifact.  It provides the evaluative information about the target eGDB system.  For example, if the eGDB has not been compressed recently, there will be an evaluation that reports this as a concern.

 

The file has a summary sheet which provides an overview of the evaluations.  The evaluations are categorized by general topic (first column, “Category”) and classified (Critical, Warning, and Informational columns) such that you may prioritize your review of the information. 

 

The Description column briefly explains the purpose of the evaluation.  Where there is a red marker in the Description cell, there is a hover tip that provides yet more information about what is being evaluated.

 

Expert Excel file Overview sheet

 

Some findings have no records to report.  For example, if there is no problem with a given kind of Geodatabase Consistency, there will be no records.  However, many findings will have some number of records of various classifications. In those cases, the Name (second column) will have a hyperlink to the sheet in the workbook that has the detailed finding records.

 

The Context Excel File

The Context Excel file is similar in structure but lacks the expert (Critical-Warning-Informational) classification columns.  The information is more descriptive and less evaluative.  For example, in this file, you can find a listing of all of Geodatabase Domains and the ObjectClasses to which they are related.  That information is neither good nor bad (i.e. not evaluative).  But, it may be useful to know.

 

Context Excel file Overview sheet

 

Records that are highlighted in green have SQL statements that you can run for additional information, as appropriate.  Usually, these SQL statements are too expensive to run on all of the content in the eGDB.  But, someone familiar with the eGDB and the issues it has may have ideas about which queries would be useful to run nonetheless.  For example, the “SqlGeomTypeSizeSql” sheet has a SQL query for each FeatureClass in the eGDB.  If you run one of these queries it will report the sizes of the geometries in one FeatureClass.  This is an expensive enough operation that it would not be appropriate to run it on all of the FeatureClasses by default.  But, if there is a FeatureClass that has a performance problem, it may be useful for you to run the query for that FeatureClass to examine the sizes of its geometries.

 

The PNGes

The PNG files (a version tree diagram and a state tree diagram) are typically of interest if your eGDB has data that has been registered as versioned.

 

The version tree graph illustrates the version tree hierarchy.  Color coding (red-yellow-green) indicates the relative degree of “staleness”, or how long it has been since the version has been edited or reconciled.

 

Version tree

 

The state tree schematic illustrates the depth and structure of the state tree (which is the detailed structure upon which the version tree relies).  The Default version is shown in red and “State Zero” is shown in green.  The further these nodes are separated, the more expensive it is for the database to return information about the Default version (the most commonly used version in most systems).

 

State tree

 

The HTML

The HTML provides some general information about the eGDB, RDBMS, and machine that is the target of the evaluation.  It may only be of passing interest in many cases.

 

HTML metadata

 

Summary

This article as described the purpose of egdbhealth, how to run it, and what its outputs are.  As the outputs of the tool contain quite a bit of technical information, other articles will address how to use the outputs for (a) understanding and improving eGDB health (b) designing GIS systems.

 

I hope you find this helpful, do not hesitate to post your questions here: https://community.esri.com/thread/231451-arcgis-architecture-series-tools-of-an-architect

 

Note: The contents presented above are recommendations that will typically improve performance for many scenarios. However, in some cases, these recommendations may not produce better performance results, in which case, additional performance testing and system configuration modifications may be needed.

17 Comments
by Anonymous User
Not applicable

@DannyKrouk The download link no longer works.

DannyKrouk1
New Explorer

Yes, sorry about that.  I don't know what is going on.  This link should work: https://arcgismonitor.maps.arcgis.com/home/item.html?id=f343f6b2bbcf434386f6dde1e468e7ab.  Now, I have to figure out how to edit this crazy blog when the system does not recognize my original account.  Sigh ...

 

DannyKrouk
Esri Contributor

The link is fixed now. 🙂

 

Sjavvaji
New Explorer

Hello, I am attempting to download the egdbhealth from the below-mentioned link, but I was unable to do so even after using multiple logins. I am attaching the error below; can anyone assist me in resolving this issue?

link: https://arcgismonitor.maps.arcgis.com/home/item.html?id=f343f6b2bbcf434386f6dde1e468e7ab

SaiKumar_0-1672145861385.png

 

DannyKrouk1
New Explorer
I'm sorry about the problem. I suffer the same problem myself. I'll contact the owner of the ArcGIS Monitor account and ask them to look into it. - Danny
DannyKrouk
Esri Contributor

I believe the download link issues have been resolved.

SGTomlins
Regular Contributor

@DannyKrouk  - I keep finding more fun tools that you guys put out there.  Makes it possible to have all the answers I need to better manage the Enterprise.  I love this stuff.  🙂

I reviewed the output expert xlsw file.  I discovered the reporting of my backups with a warning of "No recent backups" (db_Backups).  I reviewed my SQL Server logs and find the backups are working as expected and as recent as last night.

The report of the (inst_ErrorsInCurrentLog) does show successful recent backup information.

Just an FYI to consider.  Not a biggie to me.  I was nudged to review my settings and tasks.  Always good to periodically check stuff..  

All the best,

-Steve

DannyKrouk
Esri Contributor

@SGTomlins  Thanks for the kind words and the feedback.  

This is the query that supports the backup findings:

declare @min_backups int = 1
declare @max_minutes_since_backup int = 10080 -- one week
declare @backups int
select @backups = count(*) from msdb.dbo.backupset s where s.database_name = DB_NAME() 
 
if (@backups < @min_backups )
select cast(@backups as nvarchar(8)) as backups, 'Critical: Fewer than ' + cast(@min_backups as nvarchar(8)) +  ' backups exist' as comment
else
SELECT top 10
DATEDIFF(mi, s.backup_start_date, getdate()) minutes_since_backup,
s.database_name,
m.physical_device_name,
CAST(CAST(s.backup_size / 1000000 AS INT) AS VARCHAR(14)) + ' ' + 'MB' AS bkSize,
CAST(DATEDIFF(second, s.backup_start_date,
s.backup_finish_date) AS VARCHAR(4)) + ' ' + 'Seconds' TimeTaken,
s.backup_start_date,
CASE s.[type]
WHEN 'D' THEN 'Full'
WHEN 'I' THEN 'Differential'
WHEN 'L' THEN 'Transaction Log'
END AS BackupType,
s.server_name,
s.recovery_model,
'Warning: No recent backups.  Review back-up procedures relative to requirements. Age (in days): ' + cast( (DATEDIFF(mi, s.backup_start_date, getdate()) / 1440)   as nvarchar(8))   as comment
FROM msdb.dbo.backupset s
INNER JOIN msdb.dbo.backupmediafamily m ON s.media_set_id = m.media_set_id
WHERE 
s.database_name = DB_NAME() 
and
DATEDIFF(mi, s.backup_start_date, getdate()) > @max_minutes_since_backup
ORDER BY DATEDIFF(mi, s.backup_start_date, getdate()) ASC

It strikes me that there are a few possibilities that explain your outcome:

1. There is filter logic in my query which has a different idea than you about what "recent" is.
2. There is a flaw in the query logic relative to how (some more recent version of?) SQL Server records its back-up activity.
3. There is some kind of SQL Server back-up happening that would not normally be recorded in the system tables queried ... for some reason (really a variant of #2)

As you may be able to tell from the query text, what we are examining are tables like msdb.dbo.backupset and msdb.dbo.backupmediafamily.  Do the back-ups that you see in your error log also appear in these tables?
SGTomlins
Regular Contributor

@DannyKrouk :  I find in both tables the most current backup sets (today) listed. 

[name] 

SGTomlins_0-1721764953160.png

FROM [msdb].[dbo].[backupset]   

and 

[physical_device_name] 

SGTomlins_1-1721765041199.png

FROM 

[msdb].[dbo].[backupmediafamily]

______________________________

Yes, I do see the backups listed in the SQL tables.  Accompanied by files in their corresponding locations.

The message on the db_Backup tab is the "Warning" in your provided SQL for this backup set

SGTomlins_2-1721765209694.png

This [name] and [physical device name] both show up in their respective tables.  The backups are scheduled SQL Maintenance plans.

This is our SQL Server ver. 

SGTomlins_3-1721765722842.png

 

Thanks for your response.  Hope this helps if there is some logic that needs tinkering or otherwise...

-Steve

 

 

 

 

DannyKrouk
Esri Contributor

@SGTomlins If you would, please run the SQL statement against your system in SSMS and show me a screen capture of the result in an email (dkrouk@esri.com).  Thank you. 🙂

SGTomlins
Regular Contributor

Here ya go @DannyKrouk ...  Ran it this morning.  

SGTomlins_0-1721824877117.png

-Steve

DannyKrouk
Esri Contributor

@SGTomlins  Thanks.  The query is seeing your backups.  That's good. 

From appearances, the default definition of "recent" for the egdbhealth query is out of alignment with your default definition of recent.  It sees your back-ups.  They are all from a week or more ago.  If you run this tool right after you run take your backup, the results should be different.  

Do you see another interpretation of this data?

SGTomlins
Regular Contributor

@DannyKrouk  I am not certain about what you mean by my "default definition of recent".  I ran the egdb the morning after the normal scheduled backups.  It returned the day 7/16/24 for the top of the DESC ORDER BY, when I ran your snippet of SQL it reported 7/17/24.

I ran this today:

SELECT
bs.database_name,
bs.backup_start_date,
bs.backup_finish_date,
bs.

 

expiration_date,
CASE bs.type
WHEN 'D' THEN 'Database'
WHEN 'L' THEN 'Log'
WHEN 'I' THEN 'Differential'
WHEN 'F' THEN 'File or Filegroup'
WHEN 'G' THEN 'Differential File'
WHEN 'P' THEN 'Partial'
WHEN 'Q' THEN 'Differential Partial'
ELSE 'Unknown'
END AS backup_type,
bmf.physical_device_name
FROM
msdb.dbo.backupset bs
INNER JOIN
msdb.dbo.backupmediafamily bmf
ON bs.media_set_id = bmf.media_set_id
WHERE
bs.backup_start_date > DATEADD(day, -30, GETDATE()) -- Last 30 days
ORDER BY
bs.backup_start_date DESC;

 

It returns this:

 

SGTomlins_1-1721851038194.png

Most recent as of 7/24/24.

Maybe I am looking at this incorrectly.  Is the "gap" in dates that is reported actually showing backups that are "out of date" and toss your "Warning"?

-Steve

 

 

DannyKrouk
Esri Contributor

@SGTomlins  Could we please continue this conversation by email at dkrouk@esri.com?  I think you and I would benefit from an opportunity to coordinate our thinking and communication in details that I do not think are relevant to this forum.  Perhaps we need to graduate to a screen share to truly understand what is going on, I don't know.  But, this forum-post method, from my perspective, has no more gas in the tank for our purposes.

SGTomlins
Regular Contributor

@DannyKrouk   I will send you an email..  🙂

 

DannyKrouk
Esri Contributor

Thanks to @SGTomlins for working with me outside of this thread.  We successfully found the flaw in my query logic and resolved it.  The repaired query is in the online repo that egdbhealth checks on start-up.  So, the new query will be downloaded and run on everyone's systems the next time they run egdbhealth. 🙂