We have several Python scripts that perform daily geoprocessing tasks. Because geoprocessing history is logged automatically, the GDB_ITEMS table in our geodatabases bloats considerably and significantly degrades performance. I discovered the performance impact after taking the workaround code suggested by ESRI and modifying it to run at the geodatabase level, so that it affects all objects in the geodatabase (the original workaround would take a VERY long time if I had to go to each feature class individually). Once I had cleaned up over two years of geoprocessing history from the GDB_ITEMS table of all of my geodatabases, I found that:
1) The reserved disk space for all of the SDE databases on my SQL Server dropped by nearly 11 GB.
2) My automation scripts run MUCH faster. For instance, one job that performs a DB compression and updates statistics went from taking 4 hours and 13 minutes to taking 8 minutes! Another script that deletes several point feature classes and recreates them using X/Y coordinates downloaded from a business database went from taking 2 hours and 25 minutes to taking 15 minutes.
Having some method to manage logging to the GDB_ITEMS table would be an excellent enhancement!
Having realised a need for roughly the opposite case, as mentioned in Allow IN_MEMORY datasets to store metadata generated by geo-processing tools, I think there could be an intermediate option instead of an on/off switch, one that lets users control how much history is kept with each data item, say the last 20 geoprocessing actions. This history is just auxiliary most of the time, but sometimes it becomes critical, particularly when a data item's lineage depends on its source items and their locations. For example, after an intersect operation, the recorded history and the accompanying datasets let you check which source data versions were used, instead of rerunning a long intersect to make sure you used the correct source datasets.
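To make the "keep the last 20 actions" idea concrete, here is a minimal sketch in plain Python that trims the history inside a metadata XML string. It rests on assumptions that are not stated in this thread: that history entries are `Process` elements under `Esri/DataProperties/lineage`, and that they appear oldest first. The function name `trim_gp_history` is made up for illustration, and reading the XML from the geodatabase and writing it back are not shown.

```python
import xml.etree.ElementTree as ET

# Assumption: geoprocessing history lives in <Process> elements under this
# path, relative to the <metadata> root element.
LINEAGE_PATH = "Esri/DataProperties/lineage"


def trim_gp_history(metadata_xml: str, keep_last: int = 20) -> str:
    """Return the metadata XML with only the newest `keep_last` history entries.

    Hypothetical helper: assumes <Process> elements are stored oldest first.
    Passing keep_last=0 removes all history entries.
    """
    root = ET.fromstring(metadata_xml)
    lineage = root.find(LINEAGE_PATH)
    if lineage is None:
        # No history section at all, so return the input unchanged.
        return metadata_xml

    processes = lineage.findall("Process")
    excess = len(processes) - max(keep_last, 0)
    # Drop the oldest entries; the slice is empty when nothing is in excess.
    for old in processes[:max(excess, 0)]:
        lineage.remove(old)

    return ET.tostring(root, encoding="unicode")
```

With `keep_last=0` the same function behaves like a full cleanup, while a larger value keeps the most recent lineage available for checking which source datasets were used.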
Great idea. I believe the default should be on, though. I have figured out the what and where of hundreds of files just from the GP history. Most of the field calc history is a waste of time to read, but the appends are useful. None of these files had real metadata (documentation), and you can't count on most people to create any metadata, especially non-GIS, non-IT people.