I'm trying to build some SDE maintenance Python scripts and found this help document very informative.
ArcGIS Help 10.2 - Using Python scripting to batch reconcile and post versions
However, I noticed that the detailed section on Rebuild Indexes and Analyze Datasets does not use the NO_SYSTEM option for include_system, but the full example code at the bottom does. Could someone clarify exactly what the include_system option does and which setting is the recommended one to use? Maybe a Python expert can help?
I don't know if this helps or not
>>> import arcpy
>>> help(arcpy.RebuildIndexes_management)
Help on function RebuildIndexes in module arcpy.management:
RebuildIndexes(input_database=None, include_system=None, in_datasets=None, delta_only=None)
RebuildIndexes_management(input_database, include_system, {in_datasets;in_datasets...}, {delta_only})
Updates indexes of datasets and system tables stored in an enterprise
geodatabase. This tool is used in enterprise geodatabases to rebuild existing
attribute or spatial indexes. Out-of-date indexes can lead to poor geodatabase
performance.

INPUTS:
 input_database (Workspace):
     The enterprise database that contains the data to be updated.
 include_system (Boolean):
     Indicates whether indexes will be rebuilt on the states and state lineages
     tables. You must be the geodatabase administrator for this option to be
     executed successfully. This option only applies to geodatabases. If the
     input workspace is a database, this option will be ignored.
     * NO_SYSTEM—Indexes will not be rebuilt on the states and state lineages
       tables. This is the default.
     * SYSTEM—Indexes will be rebuilt on the states and state lineages tables.
 in_datasets {String}:
     Names of the datasets that will have their indexes rebuilt. Dataset names
     use paths relative to the input workspace; full paths are not accepted as
     input.
 delta_only {Boolean}:
     Indicates how the indexes will be rebuilt on the selected datasets. This
     option has no effect if in_datasets is empty. This option only applies to
     geodatabases. If the input workspace is a database, this option will be
     ignored.
     * ALL—Indexes will be rebuilt on all indexes for the selected datasets.
       This includes spatial indexes as well as user-created attribute indexes
       and any geodatabase-maintained indexes for the dataset.
     * ONLY_DELTAS—Indexes will only be rebuilt for the delta tables of the
       selected datasets. This option can be used for cases where the business
       tables for the selected datasets are not updated often and there are a
       high volume of edits in the delta tables. This is the default.
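In short, NO_SYSTEM (the default) leaves the SDE states and state lineages tables alone, while SYSTEM also rebuilds their indexes and only succeeds when you are connected as the geodatabase administrator. A minimal sketch of both calls, assuming hypothetical connection files and dataset names:

import arcpy

## Connected as the geodatabase administrator (sde): system tables can be included.
arcpy.RebuildIndexes_management(r"C:\GISConnections\SDE@GTEST.sde",
                                "SYSTEM",   ## also rebuild states/state_lineages indexes
                                "",         ## no user datasets in this pass
                                "ALL")

## Connected as a data owner: skip the system tables (the default behavior).
arcpy.RebuildIndexes_management(r"C:\GISConnections\OWNER1@GTEST.sde",
                                "NO_SYSTEM",
                                ["OWNER1.PARCELS", "OWNER1.ROADS"],  ## hypothetical datasets
                                "ALL")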
I appreciate the reply, Wes, but I think that's all the same info I've been able to find in the online Help documentation on these tools. I'm really looking for advice from the community on what settings you use with these tools as part of regularly scheduled (nightly) SDE maintenance with a compress operation.
Here's what I've come up with so far for Rebuild Indexes and Analyze Datasets.
import arcpy
from contextlib import contextmanager
import os
import shutil
import tempfile


def main():
    try:
        sdeInstance = "GTEST"

        # Get all schema owners from SDE table registry
        sde_sdeconn = r"C:\GISConnections\SDE@{}.sde".format(sdeInstance)
        in_table = os.path.join(sde_sdeconn, "SDE.TABLE_REGISTRY")
        sql_prefix = "DISTINCT OWNER"
        schemaOwners = [
            row[0] for row in arcpy.da.SearchCursor(
                in_table, "OWNER", sql_clause=(sql_prefix, None)
            )
        ]

        # Rebuild indexes and update statistics for all objects under each schema owner
        with makeTempDir() as temp_dir:
            for owner in sorted(schemaOwners):
                print "\n---\n"
                ## Set passwords, options, and workspace with owner
                if owner.upper() == "SDE":
                    ## Must be geodatabase administrator to gather statistics on
                    ## the states and state lineages tables (include_system option).
                    include_system = "SYSTEM"
                    arcpy.env.workspace = sde_sdeconn
                else:
                    ## Create connection file with owner
                    ## (ownerPass is the owner's password; how it is retrieved isn't shown in this post)
                    arcpy.CreateDatabaseConnection_management(
                        temp_dir,                ## out_folder_path
                        owner + ".sde",          ## out_name
                        "ORACLE",                ## database_platform
                        sdeInstance + ".WORLD",  ## instance
                        "DATABASE_AUTH",         ## account_authentication
                        owner,                   ## username
                        ownerPass,               ## password
                    )
                    print arcpy.GetMessages()
                    include_system = "NO_SYSTEM"
                    arcpy.env.workspace = os.path.join(temp_dir, owner + ".sde")

                # Get a list of all the objects owner has access to
                workspace = arcpy.env.workspace
                tables = arcpy.ListTables(owner + '.*')
                tables += arcpy.ListFeatureClasses(owner + '.*')
                tables += arcpy.ListRasters(owner + '.*')
                for dataset in arcpy.ListDatasets(owner + '.*'):
                    tables += arcpy.ListFeatureClasses(feature_dataset=dataset)

                try:
                    # Rebuild Indexes
                    arcpy.RebuildIndexes_management(
                        workspace,       ## input_database
                        include_system,
                        tables,
                        "ALL"
                    )
                except Exception as err:
                    ## Print exception error only if it's not from Esri
                    if not arcpy.GetMessages(2):
                        print err
                finally:
                    if arcpy.GetMessages():
                        print arcpy.GetMessages()

                try:
                    # Analyze Datasets
                    arcpy.AnalyzeDatasets_management(
                        workspace,       ## input_database
                        include_system,
                        tables,
                        "ANALYZE_BASE",
                        "ANALYZE_DELTA",
                        "ANALYZE_ARCHIVE"
                    )
                except Exception as err:
                    ## Print exception error only if it's not from Esri
                    if not arcpy.GetMessages(2):
                        print err
                finally:
                    ## Print the full details of the last Esri message
                    if arcpy.GetMessages():
                        print arcpy.GetMessages()

                arcpy.ClearWorkspaceCache_management()
    except Exception as err:
        if err.message in arcpy.GetMessages(2):
            print arcpy.GetMessages()
        else:
            print unicode(err).encode("utf-8")
    finally:
        # Cleanup
        arcpy.ClearWorkspaceCache_management()


@contextmanager
def makeTempDir():
    temp_dir = tempfile.mkdtemp()
    try:
        yield temp_dir
    finally:
        shutil.rmtree(temp_dir)


if __name__ == '__main__':
    main()
It seems to be doing its job, but I don't see any entries in the SDE.SDE_LAYER_STATS table; should I? Eventually this will get rolled into our existing SDE maintenance script that disconnects users then reconciles and compresses.
Hi Blake,
Thanks very much for sharing your code. I have a somewhat similar system automated in our environment (nowhere near as clean-cut as yours; mine is very 'backyard' code).
I would like to ask an additional question: I see you run Analyze Datasets on all of your tables; can someone clarify whether this is necessary? I understand this is handy right after performing a compress on the database, as it does something (I am unsure what) to all the system tables, but I am unsure what it does with regard to standard feature classes. So I'm not sure whether it is necessary for me to run it every week on each of the feature classes within my SDE database. Can you comment on why you run it on your entire system?
One other question I just thought of: I notice you run Rebuild Indexes BEFORE you run Analyze Datasets; is there a reason you decided to run them in this order? I can't find it now, but I believe I read in the past that they should be run in the other order, directly after compressing the database. I'm hoping someone can set me straight on that one too...
I see you run Analyze Datasets on all of your tables; can someone clarify whether this is necessary?
I talked to Esri support to try and confirm I was on the right track with my process here. I explained very clearly that I'm doing the rebuild and analyze on every table and feature class. They confirmed that I was doing things correctly but noted that it could take a long time to do everything. However, they didn't say I should not do it.
I notice you run Rebuild Indexes BEFORE you run Analyze Datasets; is there a reason you decided to run them in this order?
I think I did it in that order only because that's the way the original documentation did it. Somewhere in the documentation it also mentioned possibly running one (or both) of these before the compress as well as after, so it's a little confusing which scenarios call for that.
EDIT:
I just stumbled across this post that recommends running analyze datasets before reconcile/post/compress.
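For what it's worth, one way to read that advice as a nightly sequence is: analyze first so the reconcile and compress run against fresh statistics, then reconcile/post and compress, then rebuild indexes and analyze again once the state tree has been trimmed. A sketch only; the connection path is a placeholder and the reconcile settings are generic rather than a recommendation:

import arcpy

admin_sde = r"C:\GISConnections\SDE@GTEST.sde"  ## hypothetical admin connection

## 1. Update statistics before the reconcile/compress.
arcpy.AnalyzeDatasets_management(admin_sde, "SYSTEM", "",
                                 "ANALYZE_BASE", "ANALYZE_DELTA", "ANALYZE_ARCHIVE")

## 2. Reconcile and post all versions to DEFAULT, then compress.
arcpy.ReconcileVersions_management(admin_sde, "ALL_VERSIONS", "sde.DEFAULT",
                                   arcpy.ListVersions(admin_sde),
                                   "LOCK_ACQUIRED", "NO_ABORT", "BY_OBJECT",
                                   "FAVOR_TARGET_VERSION", "POST", "KEEP_VERSION")
arcpy.Compress_management(admin_sde)

## 3. Rebuild indexes and analyze again now that the state tree is trimmed.
arcpy.RebuildIndexes_management(admin_sde, "SYSTEM", "", "ALL")
arcpy.AnalyzeDatasets_management(admin_sde, "SYSTEM", "",
                                 "ANALYZE_BASE", "ANALYZE_DELTA", "ANALYZE_ARCHIVE")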
While this script has been running successfully for a while now, I'm noticing that it takes about 12 minutes longer to complete each night than it did the night before. The increase has been steady, and the script now takes hours to finish where it originally took less than 30 minutes. My plan is to adjust the script so it only rebuilds and analyzes the tables that were modified. Here is a sample query (Oracle) that I will adjust and simplify to find the modified tables.
SELECT TR.OWNER || '.' || TR.TABLE_NAME AS LAYER_NAME,
       MM.REGISTRATION_ID,
       COUNT(DISTINCT(S.STATE_ID)) AS AFFECTED_STATES,
       COUNT((SL.LINEAGE_ID)) AS LINEAGE_LENGTH
FROM SDE.STATES S
INNER JOIN SDE.MVTABLES_MODIFIED MM ON S.STATE_ID = MM.STATE_ID
JOIN SDE.TABLE_REGISTRY TR ON MM.REGISTRATION_ID = TR.REGISTRATION_ID
JOIN SDE.STATE_LINEAGES SL ON S.LINEAGE_NAME = SL.LINEAGE_NAME
GROUP BY TR.OWNER, TR.TABLE_NAME, MM.REGISTRATION_ID
ORDER BY LAYER_NAME;
Here's the query I'm using to identify which tables have edits (and who made the edits).
SELECT DISTINCT
tr.owner as TABLE_OWNER, tr.table_name,
COUNT (mm.registration_id) AS STATE_COUNT,
s.owner AS STATE_OWNER
FROM sde.mvtables_modified mm
LEFT OUTER JOIN sde.states s
ON mm.state_id = s.state_id
LEFT OUTER JOIN sde.table_registry tr
ON mm.registration_id = tr.registration_id
GROUP BY tr.owner, tr.table_name, s.owner
ORDER BY tr.owner, tr.table_name, s.owner
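Once the query is settled, the list it returns could be fed straight back into the two tools so only edited datasets get touched. A rough sketch using arcpy.ArcSDESQLExecute; the simplified query, connection path, and result handling here are my assumptions, and as in the script above the tables would still need to be grouped by owner and run through each owner's connection:

import arcpy

admin_sde = r"C:\GISConnections\SDE@GTEST.sde"  ## hypothetical admin connection

## Pull OWNER.TABLE_NAME for every registered table that has rows in the delta tables.
sql = """SELECT DISTINCT tr.owner || '.' || tr.table_name
         FROM sde.mvtables_modified mm
         JOIN sde.table_registry tr ON mm.registration_id = tr.registration_id"""
db = arcpy.ArcSDESQLExecute(admin_sde)
result = db.execute(sql)

## execute() returns a list of rows for multi-row results (True or a scalar otherwise).
edited_tables = [row[0] for row in result] if isinstance(result, list) else []

if edited_tables:
    ## Group by owner and swap in each owner's connection before running these for real.
    arcpy.RebuildIndexes_management(admin_sde, "NO_SYSTEM", edited_tables, "ALL")
    arcpy.AnalyzeDatasets_management(admin_sde, "NO_SYSTEM", edited_tables,
                                     "ANALYZE_BASE", "ANALYZE_DELTA", "ANALYZE_ARCHIVE")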
Turns out the slowness I was experiencing with Rebuild Indexes and Analyze Datasets was caused by excessive geoprocessing history logged in the geodatabase metadata. Using arcpy.SetLogHistory(False) prevents the geoprocessing history generated by the script from being logged (a short sketch of where that call sits follows the cleanup script below). Here's the script I used to clean the existing geoprocessing history out of the geodatabase metadata. Once the geoprocessing history was cleaned up, our maintenance script ran very quickly again.
import arcpy
from contextlib import contextmanager
import os
import shutil
import tempfile
def main():
    sde_sdeconn = r"C:\GISConnections\SDE@GPRO.sde"
    xslt_path = r"C:\arcgis\Desktop10.2\Metadata\Stylesheets\gpTools\remove geoprocessing history.xslt"
    try:
        arcpy.ClearWorkspaceCache_management()
        if os.path.exists(xslt_path):
            with makeTempDir() as temp_dir:
                ## Get unique temporary XML file name
                with tempfile.NamedTemporaryFile(
                    dir=temp_dir, suffix=".xml", delete=True
                ) as temporary_file:
                    temporary_file_name = temporary_file.name
                ## Export metadata without geoprocessing history
                arcpy.XSLTransform_conversion(
                    sde_sdeconn,         ## source
                    xslt_path,           ## xslt
                    temporary_file_name  ## output
                )
                print arcpy.GetMessages()
                print ""
                ## Import metadata that has geoprocessing history removed
                arcpy.MetadataImporter_conversion(
                    temporary_file_name,  ## source
                    sde_sdeconn           ## target
                )
                print arcpy.GetMessages()
        else:
            raise IOError("File not found.\n{}".format(xslt_path))
    except Exception as err:
        if err.message in arcpy.GetMessages(2):
            ## All of the messages returned by the last ArcPy tool
            displayErr = arcpy.GetMessages()
        else:
            ## Non-ArcPy error message
            displayErr = unicode(err).encode("utf-8")
        print displayErr
    finally:
        # Cleanup
        arcpy.ClearWorkspaceCache_management()


@contextmanager
def makeTempDir():
    """Creates a temporary folder and returns the full path name.
    Use in with statement to delete the folder and all contents on exit.
    Requires contextlib contextmanager, shutil, and tempfile modules.
    """
    temp_dir = tempfile.mkdtemp()
    try:
        yield temp_dir
    finally:
        shutil.rmtree(temp_dir)


if __name__ == '__main__':
    main()
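For reference, the SetLogHistory call mentioned above is a one-liner that goes in before any geoprocessing tools run; a minimal sketch of where it would sit at the top of the nightly maintenance script:

import arcpy

## Keep the nightly tool runs out of the geodatabase metadata so the lineage
## section doesn't keep growing (and slowing the tools down) over time.
arcpy.SetLogHistory(False)

## ...rebuild indexes / analyze datasets / compress calls follow as before...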
Geoprocessing history is also logged in the metadata of feature datasets, feature classes, and tables. You can modify this script to walk the geodatabase and remove the geoprocessing history from each of those items if you like; a rough sketch of that walk follows the links below. Here is some further reading from Esri Support:
Editing metadata for many ArcGIS items
http://resources.arcgis.com/en/help/main/10.1/index.html#//003t00000026000000
XSLT Transformation (Conversion)
http://resources.arcgis.com/en/help/main/10.1/index.html#//001200000017000000
Automate the process of deleting geoprocessing history
http://support.esri.com/en/knowledgebase/techarticles/detail/41026
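A rough sketch of that walk, reusing the connection and stylesheet paths from the cleanup script above; the datatype filter and per-item handling are my assumptions and not tested code:

import arcpy
import os
import tempfile

sde_conn = r"C:\GISConnections\SDE@GPRO.sde"
xslt_path = r"C:\arcgis\Desktop10.2\Metadata\Stylesheets\gpTools\remove geoprocessing history.xslt"
xml_out = os.path.join(tempfile.gettempdir(), "gp_history_strip.xml")

## Walk the geodatabase and strip geoprocessing history from the metadata of
## every feature dataset (dirnames) and feature class/table (filenames).
for dirpath, dirnames, filenames in arcpy.da.Walk(
        sde_conn, datatype=["FeatureDataset", "FeatureClass", "Table"]):
    for name in dirnames + filenames:
        item = os.path.join(dirpath, name)
        if os.path.exists(xml_out):
            os.remove(xml_out)
        arcpy.XSLTransform_conversion(item, xslt_path, xml_out)  ## export without GP history
        arcpy.MetadataImporter_conversion(xml_out, item)         ## import it back
        print "Cleaned geoprocessing history: {}".format(item)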
Hi Blake,
how come you didn't use the script mentioned here?
The Esri support tech I spoke with did not show me this technical article. This looks like a better solution, thank you for posting.
EDIT:
I just tested the isNotSpatialView() function for Oracle and it isn't reliable, at least in our 10.5.1 enterprise geodatabase on Oracle 12c. Almost none of our spatial views are actually registered with the database in the user_views table, so the function doesn't validate correctly.
Hi Blake! Question: which user are you using to run the Analyze tool? I tried to run it as the data owner and had no problems, but I could not run it as the sde user. Thanks!