Select to view content in your preferred language

Query Oracle Table -- Need advice

3830
5
02-08-2013 04:15 AM
GlenReid
Deactivated User
Task: query a very large Oracle table, build an XY Event layer, apply symbology with .lyr file, create GP service, and output to web app using web API.

I have this running using ModelBuilder at AGS v10 in a GP service and a JSAPI web app.  I am now trying to upgrade to Python and a Python Toolbox.

ArcGIS Desktop 10.1 SP1
ArcGIS Server Linux 10.1 SP1
ArcSDE 10 SP2
Oracle 11.2.0.2

My biggest question is how best to query the data in Oracle (I'm looking for performance).

I use TableSelect_analysis through an ArcSDE connection file in my older ModelBuilder GP service, but I only want 4 columns from this table and it has 21.

MakeTableView_management through an ArcSDE connection file doesn't honor my where clause or my fieldinfo (ESRI Support is looking into this).

I can execute my SQL query with ArcSDESQLExecute, but I'm not sure how to handle the list to use with MakeXYEventLayer_management.

I have also read that using cx_Oracle is a good way to go.

I was hoping that you gurus out there might give me some advice.

Thanks,
Glen
Tags (2)
5 Replies
JamesCrandall
MVP Alum
Glen,

I have several Geoprocessor implementations that use cx_Oracle to pull in attribute data for processing and it works very well for us.  The datasets that must be processed are large (15 minute sample collections over several years), so since we can issue SQL thru the cx_Oracle we only return the results that are needed over each iteration (date).  That is, instead of having to chug through an entire huge dataset, we simply return what is needed, process it, then move on to the next result that is needed.

The trickiest part to get right was making the cx_Oracle cursor (the resulting rows from the SQL) compatible with ESRI-related data.  Specifically, this means we must build an empty data container (table) on the GIS side of things and then populate it with the results in the cx_Oracle cursor.  This of course means it is a field mapping issue, but I think all-in-all it really isn't that difficult to get right.  Once you get the result into an ESRI table format of your choosing, you can then process your XY Event theme as normal.

Here is an example of doing this field mapping and populating the in_memory table:

         gp.createtable_management("IN_MEMORY", "TempDT", "", "")
           
         cxRows = cursor.fetchall()  #the cursor here was already filled.
         rowcount = cursor.rowcount
         for row in range(1, rowcount - 1):
 
  tables = gp.ListTables()
  for tbl in tables:   
     if tbl=="TempDT":
                               ### add the fields
        for i in range(0, len(cursor.description)):
   val1 = str(cursor.description[0])
   val2 = str(cursor.description[1])
   val3 = str(cursor.description[2])

   if val2=="<type 'cx_Oracle.STRING'>":
      fldType = "Text"
      val3 = cursor.description[2]
      gp.AddField(tbl, str(cursor.description[0]), fldType, val3)


   if val2=="<type 'cx_Oracle.NATIVE_FLOAT'>":
      fldType = "Float"
      gp.AddField(tbl, str(cursor.description[0]), fldType)


   if val2=="<type 'cx_Oracle.DATETIME'>":
      fldType = "Date"
      gp.AddField(tbl, str(cursor.description[0]), fldType)

                         ### now populate the table
     insRows = gp.InsertCursor(tblNew)
     for cxRow in cxRows:
        insRow = insRows.newRow()

        for i in range(0, len(cursor.description)):
     insRow.setvalue(str(cursor.description[0]), cxRow)
        insRows.insertRow(insRow)



         cursor.close() 
         db.close()
GlenReid
Deactivated User
Thanks for the reply James.  I will definitely give cx_Oracle a try.

Here is what I am doing with ArcSDESQLExecute:

import arcpy, os
from arcpy import env
import sys
import time

try:
    t0 = time.clock()
    env.workspace = r"C:\ags\python"
    sdeConn = arcpy.ArcSDESQLExecute(r"C:\ags\gp\gisselect_cdodb_t.sde")

    sql = "SELECT lon, lat, snow FROM gis.snow_sum_partitioned WHERE date_time = to_date('19991231','yyyymmdd') AND snow IS NOT NULL"

    try:
        # execute SQL statement
        sdeReturn = sdeConn.execute(sql)
    except Exception, ErrorDesc:
        print ErrorDesc
        sdeReturn = False

    # if return value is a list (a list of lists), display each list as a row from the table being queried.
    if isinstance(sdeReturn, list):

        # set spatial reference
        prjFile = "geo_wgs84.prj"
        spatialRef = arcpy.SpatialReference(prjFile)

        tmpWorkspace = "in_memory"
        outXYfc = "XY_FeatureClass"

        # create in_memory empty feature class
        arcpy.CreateFeatureclass_management(tmpWorkspace, outXYfc , "POINT", "", "", "", spatialRef)

        # build tmp FC path and add new field for values to interpolate
        tmpFC = os.path.join(tmpWorkspace, outXYfc)
        arcpy.AddField_management(tmpFC, "Snow", "DOUBLE")

        # create insert cursor
        inCur = arcpy.InsertCursor(tmpFC)

        for rec in sdeReturn:
            pnt = arcpy.Point(rec[0], rec[1])
            row = inCur.newRow()
            row.Shape = pnt
            row.Snow = rec[2]
            inCur.insertRow(row)

        del inCur, row

        # Set layer symbology
        symbology_layer = "snow_symbology.lyr"
        arcpy.ApplySymbologyFromLayer_management(outXYfc, symbology_layer)

    else:
        # If the return value was not a list, the statement was most likely a DDL statment. Check its status.
        if sdeReturn == True:
            print "SQL statement: " + sql + " ran sucessfully."
        else:
            print "SQL statement: " + sql + " FAILED."

except Exception, ErrorDesc:
    print Exception, ErrorDesc
except:
    print "Problem executing SQL."

0 Kudos
JamesCrandall
MVP Alum
Not sure if this will help or hurt, but instead of Createfeatureclass_management like you are doing, why not create a table (CreateTable_management) and use your sdeReturn to populate it?  Then use that table in the arcpy.MakeXYEventLayer_management, which a table is the parameter it is asking for.

http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#//00170000006z000000

Edit: forgot --- this would mean you don't need cx_Oracle at all.  How is the performance of ArcSDESQLExecute?
0 Kudos
GlenReid
Deactivated User
James,

As far as performance with ArcSDESQLExecute goes, my entire script consistently takes ~7.5s while the query is returned in ~2.4s (~0.8s for the connection, ~1.6s for the query) -- while changing dates to avoid query caching as well as to use different table partitions.

The time hog is looping through the records to create the featureclass (~4s).  For this reason, I'll look into using CreateTable_management and MakeXYEventLayer_management to see if going that route improves performance.

Thanks for the suggestion.
Glen
0 Kudos
JamesCrandall
MVP Alum
Glen -- something that completely slipped my thought process for this thread:

What if your Database repository is NOT ArcSDE?

This is where and why cx_Oracle is needed for our implementation.  That is, we are integrating non-spatial Oracle attirbute repositories (ancillary to spatial databases), joining this data in the in_memory space to "make it" spatial data, then processing it from there.

Hope that makes sense.  Again, if your attributes are stored in SDE then your ArcSDESQLExecute is probably a good choice --- I really have no benchmark to comment on performace differences between this and cx_Oracle.  I'd gather if I moved my SQL into procedural packages on the database server I could really boost things up (but even that may be negligable gains).

Anyway -- good luck!
0 Kudos