Processing time with Hydrology tools in standalone Python script vs. ArcCatalog

Discussion created by tburley on Apr 11, 2011
Latest reply on Apr 12, 2011 by tburley
My question is: what might cause the ArcGIS 9.3 Hydrology Toolbox Snap Pour Point and Watershed tools to run *significantly* slower in a
Python script than when run manually in ArcCatalog (both runs using exactly the same input data files and parameters)?

I'm running this script from the Windows command line, so neither ArcMap nor ArcCatalog is open. I'm using ArcGIS 9.3.1 on a Windows XP machine with 4 GB of RAM and
plenty of processor power (a Dell Precision machine), and all input data are local.

I have pre-processed Flow Accumulation and Flow Direction rasters and a feature class of water-quality sample sites to be used as pour points, and all
datasets are in the NAD83 UTM 17N projection. The Flow Accumulation and Flow Direction rasters have spatial extents based on eight-digit Hydrologic Unit Code (HUC) boundaries.
The script iterates through each record in the sample site feature class, determines which eight-digit HUC the point falls in,
selects and exports the site record as a new feature class in a scratch geodatabase, and then handles the watershed processing using that exported point feature class.

I've stripped out comments and some other calls to a "processing log timestamp" function for reporting tool run times, just to focus here on the question
at hand. Based on the processing log, the biggest time sinks are the Snap Pour Point tool and the Watershed tool.
We're talking on the order of 4+ minutes per site for the Snap Pour Point tool (using a subset of all the sites for testing), and then another 3-4 minutes per site for the Watershed tool.
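
For context, per-tool run times like these could be collected with a small timing wrapper. This is only a minimal sketch, not the "processing log timestamp" function stripped from the script: `timed_call` and `slow_tool` are hypothetical names, and `slow_tool` merely stands in for a geoprocessing call such as `gp.SnapPourPoint_sa`.

```python
import time


def timed_call(label, func, *args, **kwargs):
    """Run func(*args, **kwargs); return (result, elapsed_seconds, log_line)."""
    start = time.time()
    result = func(*args, **kwargs)
    elapsed = time.time() - start
    # Format the elapsed time as whole minutes and seconds for the log.
    minutes, seconds = divmod(int(round(elapsed)), 60)
    log_line = "%s finished in %d min %d s" % (label, minutes, seconds)
    return result, elapsed, log_line


def slow_tool(value):
    # Hypothetical stand-in for a geoprocessing tool call.
    time.sleep(0.01)
    return value * 2


result, elapsed, log_line = timed_call("SnapPourPoint", slow_tool, 21)
```

Wrapping each tool call separately keeps the time spent in Snap Pour Point apart from the time spent in Watershed, which is the breakdown described above.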

However, I can take the same single point feature class and the same flow accumulation and flow direction rasters that the Python script is using,
and run the Snap Pour Point and Watershed tools manually in ArcCatalog in approximately 1 minute or less for each site (the resulting watersheds are the same as
the watersheds the script produces). Example: for one of the sites, I ran the Snap Pour Point tool manually in ArcCatalog in 58 seconds, while the Python
script took 3 minutes 59 seconds to complete the same tool. I then ran the Watershed tool manually for the same site and it
took 1 minute 1 second in ArcCatalog, while the Python script took 4 minutes 8 seconds to complete the Watershed tool.
I'm using a 50 meter buffer for the Snap Pour Point tool in both the Python script and during my ArcCatalog manual run comparisons.
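
To put those timings side by side, the slowdown factor can be worked out from the figures quoted above. A short sketch; the helper name `to_seconds` is only for illustration.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds pair to total seconds."""
    return minutes * 60 + seconds


# Timings quoted above for one site: (ArcCatalog, Python script).
timings = {
    "Snap Pour Point": (to_seconds(0, 58), to_seconds(3, 59)),
    "Watershed": (to_seconds(1, 1), to_seconds(4, 8)),
}

# Ratio of script time to ArcCatalog time, rounded to one decimal place.
slowdown = {}
for tool, (catalog_s, script_s) in timings.items():
    slowdown[tool] = round(script_s / float(catalog_s), 1)
```

Both tools come out at roughly four times slower in the script (about 4.1x each) for this site.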

As you can see, the tools run CONSIDERABLY slower in the Python script. I've hit a wall with what I can figure out, so I'd much appreciate any suggestions or insight.


try:
    if hucGdb in gdbFiles:
        # Export the current site record to its own feature class in the scratch geodatabase.
        outScratch = scratch + '\\' + delinSiteID + 'temp'
        flowAccum = filePath + '\\' + hucGdb + '\\' + 'FA_' + huc
        flowDir = filePath + '\\' + hucGdb + '\\' + 'FD_' + huc
        tempEnvironment = gp.extent
        gp.extent = flowAccum
        clause = "[DELIN_ID] = '" + delinSiteID + "'"
        gp.Select_analysis(delinSites, outScratch, clause)
        gp.extent = tempEnvironment

        # Snap the exported point to the flow accumulation raster.
        outSnapRaster = snapPoint + '\\' + 'p' + delinSiteID
        gp.SnapPourPoint_sa(outScratch, flowAccum, outSnapRaster, tolerance)
        gp.extent = tempEnvironment

        # Convert the snapped pour point raster to a point feature class and tag it with both IDs.
        outSnapFeature = snapPoint + '\\' + 'p' + delinSiteID + '_snap'
        gp.RasterToPoint_conversion(outSnapRaster, outSnapFeature, "VALUE")
        gp.AddField_management(outSnapFeature, "ORIGINAL_SITE_ID", "TEXT", "#", "#", "20")
        gp.CalculateField_management(outSnapFeature, "ORIGINAL_SITE_ID", '"' + realID + '"')
        gp.AddField_management(outSnapFeature, "DELIN_ID", "TEXT", "#", "#", "20")
        gp.CalculateField_management(outSnapFeature, "DELIN_ID", '"' + delinSiteID + '"')

        # Delineate the watershed with the extent set to the flow direction raster.
        outWsRaster = scratch + '\\' + delinSiteID
        tempEnvironment = gp.extent
        gp.extent = flowDir
        gp.Watershed_sa(flowDir, outSnapRaster, outWsRaster)
        gp.extent = tempEnvironment

        # Convert the watershed raster to a polygon feature class and tag it with both IDs.
        outWsFeature = basinOutput + '\\' + delinSiteID
        gp.RasterToPolygon_conversion(outWsRaster, outWsFeature, "SIMPLIFY")
        gp.AddField_management(outWsFeature, "DELIN_ID", "TEXT", "#", "#", "25")
        gp.CalculateField_management(outWsFeature, "DELIN_ID", '"' + delinSiteID + '"')
        gp.AddField_management(outWsFeature, "ORIGINAL_SITE_ID", "TEXT", "#", "#", "25")
        gp.CalculateField_management(outWsFeature, "ORIGINAL_SITE_ID", '"' + realID + '"')
    else:
        msg = "Site ID " + realID + " does not have a geodatabase of data, moving on to the next site"
        print msg
        logFile.write(msg + '\n')
# The try/else/except lines were missing from the posted excerpt; the exception
# classes below are assumed from the error messages that follow them.
except arcgisscripting.ExecuteError:
    msg = "ARCGISSCRIPTING ERROR: " + gp.GetMessages(2)
    print msg
    logFile.write(msg + '\n')
except Exception, ErrorDesc:
    msg = "ERROR: " + ErrorDesc.message
    print msg
    logFile.write(msg + '\n')

row = rows.next()
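
The script locates each HUC's rasters by a fixed naming convention: `FA_` plus the HUC code for flow accumulation and `FD_` plus the HUC code for flow direction, both inside that HUC's geodatabase. A minimal sketch of that convention as a helper follows. The function name `huc_raster_paths`, the folder `C:\data`, and the geodatabase name are hypothetical, and `ntpath` is used only so that the Windows-style result is the same on any platform.

```python
import ntpath


def huc_raster_paths(file_path, huc_gdb, huc):
    """Return (flow accumulation path, flow direction path) for one HUC."""
    flow_accum = ntpath.join(file_path, huc_gdb, 'FA_' + huc)
    flow_dir = ntpath.join(file_path, huc_gdb, 'FD_' + huc)
    return flow_accum, flow_dir


# Hypothetical folder, geodatabase name, and HUC code, for illustration only.
fa, fd = huc_raster_paths('C:\\data', 'huc03050101.gdb', '03050101')
```

With those example values, `fa` is `C:\data\huc03050101.gdb\FA_03050101` and `fd` is the matching `FD_` path.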