Could this be better (data transfer Access -> SDE)?

02-20-2015 08:13 AM
SteveCole
Frequent Contributor

Of all the programming languages I know and use, Python is at the bottom, so I want to post this script to see if what I have is as efficient as it can be. The backstory: our organization has a web map showing stream gage information. The gage data comes in and is stored in an Access database. The third-party stream gage software runs a daily maintenance on that Access database, so we cannot simply link to the table directly; a direct link would conflict with the maintenance when it runs.

My solution was to develop a simple script that connects to the Access database, selects all records in a given query, and transfers that information to a table in SDE, which, in turn, is consumed by the map's services. The script runs as a Windows task every five minutes and performs an initial test to see if it's "safe" to connect to the Access database. If it is, the data is read and passed on to the SDE table. The Access database holds roughly 2,200-2,400 records on average. Initially the script took about 40 seconds to run, but lately it has been taking longer (hence this post).

So, here is the script. I've left all the comments in it, so hopefully there's no confusion about what's happening. Is there a better way to approach and accomplish this task? Thanks!

Steve

Script:

import sys
import os
import linecache
import logging
import arcpy
from datetime import datetime
from arcpy import env


file01 = r"\\pmc-floodwatch\DIADvisorDatabases\DvLive.mok" #This file must exist
file02 = r"\\pmc-floodwatch\DIADvisorDatabases\DvLive.mno" #This file must NOT exist
expression = '1=1' # SQL shorthand that selects all records
theTable = "tblGageData"


#Establish the error log file 
logger = logging.getLogger('errorLog')
hdlr = logging.FileHandler(r'\\snoco\gis\pw\tes\spwscc\python\errorLog.log')
logger.addHandler(hdlr)


# The tables within DIADvisor must not be accessed during its daily database maintenance.
# OneRain recommends checking for the existence and non-existence of two specific files.
# If both conditions are true, it is safe to proceed with connecting to the data within
# the dvLive Access database
if os.path.exists(file01) and not os.path.exists(file02):
        print "Processing start time: " + str(datetime.now())
        
        env.workspace = r"C:\gishome\tasks\flood_warning_system\_SPW_GDBMGR@GIS_PW_SWM.sde"
        try:
                # Set some local variables
                tempTableView = "gageTableView"


                # Execute MakeTableView
                arcpy.MakeTableView_management(theTable, tempTableView)


                # Execute SelectLayerByAttribute to select all records
                arcpy.SelectLayerByAttribute_management(tempTableView, "NEW_SELECTION", expression)


                # Execute GetCount and if some records have been selected, then execute
                #  DeleteRows to delete the selected records.
                if int(arcpy.GetCount_management(tempTableView).getOutput(0)) > 0:
                        arcpy.DeleteRows_management(tempTableView)
                
                # Now connect to the DIADvisor Access database and import the most recent data.
                # This requires the OLE DB connection previously established using ArcCatalog.
                counter = 0


                accessRows = arcpy.SearchCursor(r"C:\gishome\tasks\flood_warning_system\jetConnectForDvLive.odc\last3days")
                curSde = arcpy.InsertCursor(r"C:\gishome\tasks\flood_warning_system\_SPW_GDBMGR@GIS_PW_SWM.sde\tblGageData")
                # Loop through the results returned via the OLE DB connection
                for cRow in accessRows:
                    curSensorId = cRow.sensor_id
                    curEpoch = cRow.epoch
                    curData = cRow.data
                    curDataValue2 = cRow.dataValue2                                     
                    counter += 1


                    #Insert a new row into the SDE table with the current DIADvisor record's information
                    row = curSde.newRow()
                    row.SENSOR_ID = curSensorId
                    row.EPOCH = curEpoch
                    row.DATA = curData
                    row.dataValue2 = curDataValue2
                    curSde.insertRow(row)
                
                # We're done so perform some variable cleanup
                del row
                del accessRows
                del curSde
                del cRow
                print "Number of record(s) in the DIADvisor database: " + str(counter)
                print "Processing end time: " + str(datetime.now())
        except Exception as e:
                # If an error occurred, print line number and error message
                exc_type, exc_obj, exc_tb = sys.exc_info()
                fname = os.path.split(exc_tb.tb_frame.f_code.co_filename)[1]
                theMessage = "\n" + 80*"#" + "\n" + 80*"#" + "\n"
                theMessage = theMessage + "DATE/TIME: " + str(datetime.now()) + ":" + "\n"
                theMessage = theMessage + "EXECPTION: " + str(e) + "\n" + "\n"
                theMessage = theMessage + "CALLBACK TRACE: " + "\n"
                theMessage = theMessage + 20*" " + "File: " + str(exc_tb.tb_frame.f_code.co_filename) + "\n"
                theMessage = theMessage + 20*" " + "Line " + str(exc_tb.tb_lineno) + ": " + str(linecache.getline(exc_tb.tb_frame.f_code.co_filename, exc_tb.tb_lineno))
                theMessage = theMessage + 20*" " + "Exception Type: " + str(exc_type)
                print theMessage
                logger.error(theMessage)
else:
        sys.exit()
16 Replies
JakeSkinner
Esri Esteemed Contributor

Hi Steve,

One way to speed up your script is to use the data access module (arcpy.da). ArcGIS 10.1 introduced this module, and its cursors are much faster than the classic search/insert/update cursors.
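For the posted script, the read-and-insert loop could look something like this with da cursors. This is a minimal sketch that reuses the paths and field names from the original script and assumes the da.SearchCursor can read the OLE DB query the same way the classic cursor did:

import arcpy

sourceFields = ["sensor_id", "epoch", "data", "dataValue2"]
targetFields = ["SENSOR_ID", "EPOCH", "DATA", "dataValue2"]

# Read the most recent data from the DIADvisor query via the OLE DB connection
accessRows = arcpy.da.SearchCursor(r"C:\gishome\tasks\flood_warning_system\jetConnectForDvLive.odc\last3days", sourceFields)

# da.InsertCursor takes a field list and accepts each row as a plain tuple
curSde = arcpy.da.InsertCursor(r"C:\gishome\tasks\flood_warning_system\_SPW_GDBMGR@GIS_PW_SWM.sde\tblGageData", targetFields)

counter = 0
for row in accessRows:
    curSde.insertRow(row)  # row is already a tuple in the order of sourceFields
    counter += 1

del accessRows, curSde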

BlakeTerhune
MVP Regular Contributor

I second this. Use the arcpy.da module for cursors and use them in a with statement so the cursors are automatically deleted even if an error occurs. Your current code may never reach the del statements if an exception is raised inside the loop.

SteveCole
Frequent Contributor

Blake T, I have switched my code base over to using the da module and it runs successfully. Can you point me to a code sample that illustrates using the "with" statement?

BlakeTerhune
MVP Regular Contributor

There are some basic examples in the Esri help documentation

ArcGIS Help 10.2 - SearchCursor (arcpy.da)

For your particular code, it's as simple as just starting with the cursors in nested with statements and removing the del lines afterwards.

with arcpy.da.SearchCursor(r"C:\gishome\tasks\flood_warning_system\jetConnectForDvLive.odc\last3days",
                           ["sensor_id", "epoch", "data", "dataValue2"]) as accessRows:
    with arcpy.da.InsertCursor(r"C:\gishome\tasks\flood_warning_system\_SPW_GDBMGR@GIS_PW_SWM.sde\tblGageData",
                               ["SENSOR_ID", "EPOCH", "DATA", "dataValue2"]) as curSde:
        for cRow in accessRows:
            # Loop through the results returned via the OLE DB connection
            ##

            #Insert a new row into the SDE table with the current DIADvisor record's information
            ##

print "Number of record(s) in the DIADvisor database: " + str(counter)
print "Processing end time: " + str(datetime.now())
SteveCole
Frequent Contributor

Gotcha. Thanks!

ChristianWells
Esri Regular Contributor

In your script it appears you are deleting ALL records from the SDE table prior to the insert. Is the goal to remove every existing record before the insert cursor runs?

If that is the case, I would not recommend using the Delete Rows tool, but rather the Truncate Table tool. In SQL, truncate is much more efficient than delete. However, this requires that your table is not registered as versioned.
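Applied to the posted script, the MakeTableView / SelectLayerByAttribute / DeleteRows sequence would collapse to a single call. A sketch, assuming the same workspace and table name as above and a connection made as the data owner (see the follow-up below):

import arcpy
from arcpy import env

env.workspace = r"C:\gishome\tasks\flood_warning_system\_SPW_GDBMGR@GIS_PW_SWM.sde"

# Remove every row in one operation; the table cannot be versioned and the
# connection must authenticate as the table's owner.
arcpy.TruncateTable_management("tblGageData")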

SteveCole
Frequent Contributor

Christian Wells

So, I was making the change from deleteRows to truncate and got the following message:

     "ERROR 001400: Only the data owner may execute truncate."

I'm not quite sure what to do about this. I was running the script manually but using the same db connection string that our automated task uses. How do I overcome this?

ChristianWells
Esri Regular Contributor

Hi Steve, we would just need to use the account of the data owner. In ArcCatalog, you may see the table displayed as "SDE.tblGageData". In this case, "SDE" would be the data owner. To get past the error, the username in the database connection would need to match that data owner.
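One way to do that is to create a second connection file that authenticates as the data owner and point the truncate at it. A hypothetical sketch, where the platform, instance, database, owner user name, and password below are placeholders rather than values taken from this thread:

import arcpy

# All values below are placeholders; substitute your own instance, database,
# and the actual data-owner credentials.
arcpy.CreateDatabaseConnection_management(
    r"C:\gishome\tasks\flood_warning_system",  # folder for the new .sde file
    "GIS_PW_SWM_owner.sde",                    # output connection file name
    "SQL_SERVER",                              # database platform (assumption)
    "your_db_instance",                        # instance name (placeholder)
    "DATABASE_AUTH",                           # database authentication
    "sde",                                     # data owner user name (assumption)
    "owner_password",                          # password (placeholder)
    "SAVE_USERNAME",                           # store credentials in the file
    "GIS_PW_SWM")                              # database name (assumption)

# Run the truncate through the owner connection
arcpy.TruncateTable_management(
    r"C:\gishome\tasks\flood_warning_system\GIS_PW_SWM_owner.sde\tblGageData")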

SteveCole
Frequent Contributor

So, the connection string to SDE is using operating system authentication instead of database authentication. Would I need to use db authentication instead? And then, which account? SDE?
