Of all the programming languages I know and use, Python is at the bottom so I want to post this script to see if what I have is as efficient as it can be. The Backstory: Our organization has a web map showing stream gage information. The gage information comes into and is stored in an Access. The third party stream gage software runs an annual maintenance on the Access database so we cannot just link to the table directly because that would conflict with database maintenance that occurs.
My solution was to develop a simple script which would connect to the Access database, select all records in a given query, and then transfer that information over to a table in SDE which, in turn, is consumed by the map's services. The script runs as a windows task every five minutes and performs an initial test to see if it's "safe" to connect to the Access Database. If it is, the data is read and passed on to the table. The number of rows in the Access database on average is roughly 2200-2400 records. Initially, the script took about 40 seconds to run but lately it has taken longer. (hence my post about it).
So- here is the script. I've left all the comments in it so hopefully there's not any confusion about what's happening. Is there a better way to approach and accomplish this task? Thanks!
Steve
Script:
import sys import os import linecache import logging import arcpy from datetime import datetime from arcpy import env file01 = r"\\pmc-floodwatch\DIADvisorDatabases\DvLive.mok" #This file must exist file02 = r"\\pmc-floodwatch\DIADvisorDatabases\DvLive.mno" #This file must NOT exist expression = '1=1' #SQL shorthand which select all records theTable = "tblGageData" #Establish the error log file logger = logging.getLogger('errorLog') hdlr = logging.FileHandler(r'\\snoco\gis\pw\tes\spwscc\python\errorLog.log') logger.addHandler(hdlr) # The tables within DIADvisor must not be accessed during its daily database maintenance. # OneRain recommends checking for the existence and non-existence of two specific files. # If both conditions are true, it is safe to proceed with connecting to the data within # the dvLive Access database if os.path.exists(file01) and not os.path.exists(file02): print "Processing start time: " + str(datetime.now()) env.workspace = r"C:\gishome\tasks\flood_warning_system\_SPW_GDBMGR@GIS_PW_SWM.sde" try: # Set some local variables tempTableView = "gageTableView" # Execute MakeTableView arcpy.MakeTableView_management(theTable, tempTableView) # Execute SelectLayerByAttribute to select all records arcpy.SelectLayerByAttribute_management(tempTableView, "NEW_SELECTION", expression) # Execute GetCount and if some records have been selected, then execute # DeleteRows to delete the selected records. if int(arcpy.GetCount_management(tempTableView).getOutput(0)) > 0: arcpy.DeleteRows_management(tempTableView) # Now connect to the DIADvisor access database and import the most recent data # This requires the OLD DB connection previously established using ArcCatalog counter = 0 accessRows = arcpy.SearchCursor(r"C:\gishome\tasks\flood_warning_system\jetConnectForDvLive.odc\last3days") curSde = arcpy.InsertCursor(r"C:\gishome\tasks\flood_warning_system\_SPW_GDBMGR@GIS_PW_SWM.sde\tblGageData") # Loop through the results returned via the OLE DB connection for cRow in accessRows: curSensorId = cRow.sensor_id curEpoch = cRow.epoch curData = cRow.data curDataValue2 = cRow.dataValue2 counter += 1 #Insert a new row into the SDE table with the current DIADvisor record's information row = curSde.newRow() row.SENSOR_ID = curSensorId row.EPOCH = curEpoch row.DATA = curData row.dataValue2 = curDataValue2 curSde.insertRow(row) # We're done so perform some variable cleanup del row del accessRows del curSde del cRow print "Number of record(s) in the DIADvisor database: " + str(counter) print "Processing end time: " + str(datetime.now()) except Exception as e: # If an error occurred, print line number and error message exc_type, exc_obj, exc_tb = sys.exc_info() fname = os.path.split(exc_tb.tb_frame.f_code.co_filename)[1] theMessage = "\n" + 80*"#" + "\n" + 80*"#" + "\n" theMessage = theMessage + "DATE/TIME: " + str(datetime.now()) + ":" + "\n" theMessage = theMessage + "EXECPTION: " + str(e) + "\n" + "\n" theMessage = theMessage + "CALLBACK TRACE: " + "\n" theMessage = theMessage + 20*" " + "File: " + str(exc_tb.tb_frame.f_code.co_filename) + "\n" theMessage = theMessage + 20*" " + "Line " + str(exc_tb.tb_lineno) + ": " + str(linecache.getline(exc_tb.tb_frame.f_code.co_filename, exc_tb.tb_lineno)) theMessage = theMessage + 20*" " + "Exception Type: " + str(exc_type) print theMessage logger.error(theMessage) else: sys.exit()
Hi Steve,
One way to speed up your script is to use the data access module (arcpy.da). ArcGIS 10.1 introduced this and the performance is much faster than the previous search/insert/update cursors.
I second this. Use the arcpy.da module for cursors and use them in a with statement so the cursors are automatically deleted even if there was an error. Your current code may not delete the cursors if there was an error.
Blake T I have switched my code base over to using the da module and it runs successfully. Can you point me to a code sample that illustrates using the "with" statement?
There are some basic examples in the Esri help documentation
ArcGIS Help 10.2 - SearchCursor (arcpy.da)
For your particular code, it's as simple as just starting with the cursors in nested with statements and removing the del lines afterwards.
with arcpy.SearchCursor(r"C:\gishome\tasks\flood_warning_system\jetConnectForDvLive.odc\last3days") as accessRows: with arcpy.InsertCursor(r"C:\gishome\tasks\flood_warning_system\_SPW_GDBMGR@GIS_PW_SWM.sde\tblGageData") as curSde: for cRow in accessRows: # Loop through the results returned via the OLE DB connection ## #Insert a new row into the SDE table with the current DIADvisor record's information ## print "Number of record(s) in the DIADvisor database: " + str(counter) print "Processing end time: " + str(datetime.now())
Gotcha. Thanks!
In your script it appears you are deleting ALL records from the SDE table prior to the insert. Is the goal to remove all the records prior to the insert cursor?
If this is such the case, I would not recommend using the Delete Rows command, but rather the Truncate Table command. In SQL, truncate is much more efficient than delete. However, this requires that your table is not registered as versioned.
So, I was making the change from deleteRows to truncate and got the following message:
"ERROR 001400: Only the data owner may execute truncate."
I'm not quite sure what to do about this. I was running the script manually but using the same db connection string that our automated task uses. How do I overcome this?
Hi Steve, we would just need to use the account of the data owner. In ArcCatalog, you may see the feature class displayed as "SDE.tblGageData". In this case, "SDE" would be the data owner. To overcome this, the username from the database connection would need to match.
So, the connection string to SDE is using operating system authentication instead of database authentication. Would I need to use db authentication instead? And then, which account? SDE?