Table compare output to variable or in-memory file

StephanieSnider · ‎08-02-2012

I'd like to have the table compare function output to a in-memory file or variable instead of an actual text file. Is that possible?

I tried using a table variable, but that errored. I used a string variable and the process worked, but the string was empty. Any suggestions?

This works:
arcpy.TableCompare_management(table1, table2, 'OBJECTID', 'ALL', '#', '#', '#', 'NO_CONTINUE_COMPARE', r'C:\Temp\compare_table2.txt')

This works, but the string is empty:
output_compare_string = ''
arcpy.TableCompare_management(table1, table2, 'OBJECTID', 'ALL', '#', '#', '#', 'NO_CONTINUE_COMPARE', output_compare_string)

ChristopherThompson · ‎08-03-2012

I'd like to have the table compare function output to a in-memory file or variable instead of an actual text file. Is that possible?

I tried using a table variable, but that errored. I used a string variable and the process worked, but the string was empty. Any suggestions?

This works:
arcpy.TableCompare_management(table1, table2, 'OBJECTID', 'ALL', '#', '#', '#', 'NO_CONTINUE_COMPARE', r'C:\Temp\compare_table2.txt')

This works, but the string is empty:
output_compare_string = ''
arcpy.TableCompare_management(table1, table2, 'OBJECTID', 'ALL', '#', '#', '#', 'NO_CONTINUE_COMPARE', output_compare_string)

It looks like that tool really wants to write a table to disk - you could always write it to a scratch workplace, then make a table view from that and work with the view as a work around. Eventually you'll want to delete the contents of your scratch workspace to keep things clean - you could probably do it right away, as once the view is created I think you don't need the underlying table on disk (i could be wrong about this so double check). Since the output of this could have many rows in it, treating the output as a string might be troublesome to work with anyway.

StephanieSnider · ‎08-03-2012

I was thinking that having the output in memory or in a variable might make the script run faster than having to read a file from the hard drive. I'm using the table compare and featureclass compare functions to run a backup script for an ArcSDE fail over (mirror backup). If differences are found, the old file is removed and the newer file is added. Well, its more complicated than that, but you get the idea. When I went to the ESRI conference last week, an ESRI presenter spoke briefly about using in-memory data but it was in reference to Oracle trace files. I guess I was hoping I could do the same thing in python. I only need to scan the text for the word "false" to check for a difference in the table. If there is a difference, then the update runs.

StephanieSnider · ‎08-06-2012

Oops. I meant - if the file has the word "True" in it, then there is a difference between tables.

KimOllivier · ‎08-07-2012

If you were a keen scripter you could probably write a much faster change detector in Python using the SET datatype.
You have to load the sets of course, using a SearchCursor and creating a dictionary using the key. This takes a few seconds, but the result is definitely in memory.
The set comparison takes milliseconds.
Then you have to painfully write out the difference records using a layer. Depending on how many changes this can take a few minutes, or in your case, an update script can loop through the list.

How large are your tables, and how many differences do you have?

For an idea, I do a comparison of two featureclasses of polygons containing 2.45 million records.
Last month there were 11616 changes, 1516 deletions and 2270 new records which are written out to new featureclasses.
It takes around 40 minutes to run.

Here is another example.
When creating layer definition views you MUST index by the key for the SQL query to work in human time.

#NAME:  buildDeltaLYRs.py
#AUTHOR:  Kevin Bell
#EMAIL:  kevin.bell@slcgov.com
#DATE:  20071207
# edited by KimO
#PURPOSE:  create adds/deletes layer files by comparing 2 point
#          feature classes shape and attributes. If the shape has
#          not changed, but any of the attributes have, the feature
#          will show as a delete, and an add.

#RECOMMENDED SYMBOL: adds.lyr = green plus, deletes.lyr = red X
#                    (this allows for nice stacking)

#NOTE:  __buildDict method has hard coded primary key and attribute names.

 

#XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

def buildDeltaLayers(inFC1, inFC2):
    '''build an adds and deletes lyr for a given chrono FC '''
    d1 = __buildDict(inFC1)
    d2 = __buildDict(inFC2)
    compareList = __valuesChanged(d1,d2)
    __makeLYR(inFC1, compareList, "deletes")
    print "...deletes.lyr is complete."
   
    d1 = __buildDict(inFC2)
    d2 = __buildDict(inFC1)
    compareList = __valuesChanged(d1,d2)
    __makeLYR(inFC2, compareList, "adds")
    print "...adds.lyr is complete."

def __valuesChanged(dict1, dict2):
    '''get a list of keys from one dict if a cooresponding dict's values are different '''
    outList = [key for key in set(dict1.keys() + dict2.keys()) if dict1.get(key) != dict2.get(key)]
    return outList

def __buildDict(inputFC): #-----BEWARE OF HARDCODED PRIMARY KEY AND ATTRIBUTES BELOW!!!!!
    '''Build a dictionary of the primary key, and it's fields'''
    d = {}
    cur = gp.SearchCursor(inputFC)
    row = cur.Next()
    while row:
        d[row.FP] = [row.Shape.Centroid, row.LUMENS, row.WATTS, row.TYPE, row.OWNER, row.RATE]
        row = cur.Next()
    del cur
    return d

def __makeLYR(fc, inList, outLyrName):# BEWARE OF HARDCODED PRIMARY KEY BELOW
    '''given a list, return a LYR file'''
    wc = str(tuple(inList))
    whereclause = "FP IN " + wc # <----IF DATA ISN'T FILE GDB, YOU MAY NEED QUOTES/BRACKETS
    gp.MakeFeatureLayer_management (fc, outLyrName, whereclause)
    gp.SaveToLayerFile_management  (outLyrName, outLyrName +".lyr")

#XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

print "-----------  building adds/deletes layer files  ----------------"

import arcgisscripting, time
startTime = time.clock()

gp = arcgisscripting.create()
gp.workspace = r"F:\Temp\streetLightCompare.gdb"# <----BEWARE OF HARDCODED WORKSPACE !!!!!
gp.OverwriteOutput = 1

#o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o
buildDeltaLayers("SEP07","OCT07") #<-------------------BEWARE OF HARDCODED FEATURE CLASSES !!!!!
#o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o

print "Your adds/deletes lyr's are in:"
print str(gp.workspace)

del gp

print "--------------------------------------------------------------------------------------------"
stopTime = time.clock()
elapsedTime = stopTime - startTime
print "elapsed time = " + str(round(elapsedTime, 1)) + " seconds"

StephanieSnider · ‎08-07-2012

Wow Kim! Thanks! I have hundreds of feature classes ranging from 10 records to 36000 records. Then feature datasets, rasters and tables. Unfortunately this fancy coding is quite over my head, but I'll save it for the future when I'm a more advanced scripter. I appreciate the information!