
Compare Feature and attribute

10-03-2011 10:33 PM
AshutoshGosavi
Emerging Contributor
I have a small question.

I want to compare two shapefiles on both their attributes and their shapes.
After the comparison is done, it should display the mismatches in attributes and shapes.
Is there a built-in tool in ArcGIS, or does it need to be scripted?

Regards
Ashutosh
0 Kudos
11 Replies
StephanieWendel
Esri Contributor
There is a built-in tool that will do this for you at any license level in version 10. It's called the Feature Compare tool. It sounds like exactly what you described.

For more information, check out the tool's description found here.
0 Kudos
LouJacoby
New Contributor
I tried Feature Compare, but omitting ObjectID from the comparison doesn't seem to work. When comparing features from two feature classes, the tool flags the same feature as different if its ObjectIDs differ. In my situation both feature classes are generated independently (on different dates), so the same features almost never have the same ObjectID. This seems like a major bug.

Lou
0 Kudos
AlanToms
Regular Contributor
I'm looking to do something similar.  The problem with the Feature Compare tool is it uses the ObjectID as the unique identifier; there is not an option to select which field contains the unique identifier.

Alan
0 Kudos
StephanieSnider
Frequent Contributor

Did you ever find a solution to ObjectID being the default unique ID for a feature compare?  I'm using ArcGIS 10.2.2 and it still does this.

0 Kudos
ManuelNaranjo3
Deactivated User

Hi Alan,

I'm working on something similar using the Feature Compare tool. I'm having the same problem, where the tool uses the ObjectID as the unique identifier. Did you ever find a way to select your preferred unique identifier?

Manuel

0 Kudos
KimOllivier
Honored Contributor

The Feature Compare tool has clearly been designed to solve a very narrow problem and is not suitable for many change detection problems. There is no chance that the tool will be changed to widen its scope, because that would break backward compatibility. It is very hard to write a generalised tool. Try it!

The simple solution is to write your own. It will be faster and better because Python has excellent structures for this. I posted my own customised change tool using a user-id as an example below. It shows how to use a dictionary to handle a custom primary user key that does not need to be indexed in the source table, because a dictionary has its own hash key that is very efficient and fast.
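A minimal sketch of the dictionary idea, in plain Python with no ArcGIS dependency. The rows are hypothetical (key, name) tuples standing in for what a search cursor would return, and `build_dict` and `changed_keys` are names made up for illustration, not part of any Esri API.

```python
def build_dict(rows):
    """Map each user key to the tuple of attributes to compare.

    `rows` is any iterable of (key, attr1, attr2, ...) tuples
    (a hypothetical layout for this sketch).
    """
    d = {}
    for row in rows:
        key, attrs = row[0], tuple(row[1:])
        if key in d:
            # a user key that repeats cannot serve as a primary key
            raise ValueError("duplicate key: %r" % (key,))
        d[key] = attrs
    return d


def changed_keys(old, new):
    """Return the sorted keys present in both dicts whose attributes differ."""
    return sorted(k for k in old if k in new and old[k] != new[k])


# The row order differs between the two reads; the dictionary lookup does not care.
old_rows = [(101, "MAIN ST"), (102, "HIGH ST"), (103, "MILL RD")]
new_rows = [(103, "MILL ROAD"), (101, "MAIN ST"), (104, "NEW LANE")]

old = build_dict(old_rows)
new = build_dict(new_rows)
print(changed_keys(old, new))  # [103]
```

Because each lookup goes through the dictionary's hash, neither input needs to be sorted or indexed on the key field.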

0 Kudos
NicholasO_Connor
Emerging Contributor

Hi Kim

You wrote..."I posted my own customised change tool using a user-id as an example below that shows how to use a dictionary to handle a custom primary user key that does not need to be indexed in the source table"

I do not see this example - could you post it again? Much appreciated.

0 Kudos
KimOllivier
Honored Contributor


I thought everyone could see all the replies to this thread. Is this why people have been asking the same question?

0 Kudos
KimOllivier
Honored Contributor
The built-in compare tool seems to be designed to check two featureclasses after a copy and edit.
I cannot see the point of stopping after finding the first difference without saying which one it is.

What we really need is a compare using a static key (which used to be called the user-id).

The fastest way to do this uses Python set operations, which run in milliseconds.

But first you have to build the set of keys for each featureclass with a SearchCursor.
This only takes a few seconds.

Finally you need to create a selection or a new featureclass containing the deletes/adds/changes.
This is easy enough but a bit slower, taking perhaps minutes. Use an SQL query of the form key_field IN (list_of_keys).
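The three steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the posted tool: `classify_keys` and `in_clauses` are hypothetical helper names, the keys are assumed to be integers, and the `chunk_size` default is an arbitrary choice for data sources that limit the length of an IN list.

```python
def classify_keys(old_keys, new_keys):
    """Split keys into deletes, adds and keys common to both datasets."""
    s_old, s_new = set(old_keys), set(new_keys)
    return s_old - s_new, s_new - s_old, s_old & s_new


def in_clauses(field, keys, chunk_size=1000):
    """Yield 'field IN (...)' clauses holding at most chunk_size keys each.

    Keys are assumed to be integers; text keys would need quoting and
    escaping appropriate to the data source.
    """
    keys = sorted(keys)
    for i in range(0, len(keys), chunk_size):
        chunk = keys[i:i + chunk_size]
        yield "%s IN (%s)" % (field, ", ".join(str(int(k)) for k in chunk))


deletes, adds, common = classify_keys([101, 102, 103], [101, 103, 104])
print(sorted(deletes), sorted(adds), sorted(common))

# chunk_size=1 only to show the splitting on a tiny example
for clause in in_clauses("RCL_ID", adds | deletes, chunk_size=1):
    print(clause)
```

Each clause can then be passed as the where clause when making a feature layer, one layer or selection per chunk.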

You need to keep track of the primary key for each feature. Do this using a dictionary keyed on the primary key and holding all the attributes that you want to compare. Comparing shapes is difficult because the shape is only an object reference; to compare it you have to unpack the shape field into a standalone Python object such as a string. Instead, use the area or length as a proxy for the shape and round it to a suitable precision, e.g. int(round(row.shape.area * 100)).
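A small sketch of the shape-proxy idea, independent of ArcGIS. `shape_proxy` is a hypothetical helper, and the default of two decimal places is an assumption that should be matched to the precision of the data.

```python
def shape_proxy(measure, decimals=2):
    """Reduce an area or length to an integer that compares exactly.

    Scaling before rounding keeps `decimals` places of the measure.
    """
    return int(round(measure * 10 ** decimals))


# Two reads of the same polygon that differ only by small numeric noise
print(shape_proxy(1234.5678) == shape_proxy(1234.5680))

# A real edit to the geometry changes the proxy
print(shape_proxy(1234.5678) == shape_proxy(1240.1))
```

The proxy is a heuristic: a shape can move or be reshaped without its area or length changing, and two values that straddle a rounding boundary can still compare as different.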
#NAME:  delta_road.py
#AUTHOR:  Kevin Bell
#EMAIL:  kevin.bell@slcgov.com
#DATE:  20071207
# edited by Kim Ollivier

#PURPOSE:  create adds/deletes layer files by comparing 2 point
#          feature classes shape and attributes. If the shape has
#          not changed, but any of the attributes have, the feature
#          will show as a delete, and an add.

#NOTE:  buildDict method has hard coded primary key and attribute names.
# mods :
# use sets for del/add
# write out featureclass differences
# do not use centroid for lines, they have changed in a year! suggest length
# do not use shape for non-points
# dont forget to index keys for definition layers

def buildDeltaLayers(inFC1, inFC2):
    '''build an adds and deletes lyr for a given chrono FC '''
    print inFC1, inFC2
    d1 = buildDict(inFC1)
    d2 = buildDict(inFC2)

#   find set differences and intersections to reduce dictionary comparisons
    startTime = time.clock()
    s1 = set(d1.keys())
    s2 = set(d2.keys())
    sNew = s2 - s1
    print "New",len(sNew)
    sDel = s1 - s2
    print "Del",len(sDel)
    sInt = s1.intersection(s2)
    print "Int",len(sInt)
    print "sets done"
    stopTime = time.clock()
    elapsedTime = stopTime - startTime
    print "elapsed time = " + str(round(elapsedTime, 1)) + " seconds"

    # could weed the dictionaries but will that be slower than just using the intersection set?
    compareList = valuesChanged(d1,d2,sInt)
    print "Changes",len(compareList)
    stopTime = time.clock()
    elapsedTime = stopTime - startTime
    print "elapsed time = " + str(round(elapsedTime, 1)) + " seconds"
    changes = len(compareList)
    if changes > 0 and changes < 50000:
        print "Changes to be written",changes
        makeLYR(inFC2, compareList, "rdchg")
        
    else :
        print "no changes, or too many to be believed",changes
    # must create delete layer first
    # to allow new features to be filtered for ID renumbering
    if len(sDel) > 0 :
        makeLYR(inFC1, sDel,"rddel")
    if len(sNew) > 0 :
        makeLYR(inFC2, sNew,"rdnew")
    

def valuesChanged(dict1, dict2,sBoth):
    '''get a list of keys from one dict if a corresponding dict's values are different'''
    ##    outList = [key for key in set(dict1.keys() + dict2.keys()) if dict1.get(key) != dict2.get(key)]
    outList = [key for key in sBoth if dict1.get(key) != dict2.get(key)]

    return outList

def buildDict(inputFC): #-----BEWARE OF HARDCODED PRIMARY KEY AND ATTRIBUTES BELOW!!!!!
    '''Build a dictionary of the primary key, and its fields'''
    startTime = time.clock()
    d = {}
    cur = gp.SearchCursor(inputFC)
    row = cur.Next()
    while row:
        # only need to check primary keys and shape
        # oops row.shape.centroid always fails - why?? because its only an object reference
        # you have to get out the coordinates and put it in a real Python object
        d[row.GetValue(pk)] = [row.name.upper().strip() ]#, round(row.shape.length)]
        # d[row.GetValue(pk)] = [round(row.shape.length)]
        row = cur.Next()
    del cur
    print inputFC,
    stopTime = time.clock()
    elapsedTime = stopTime - startTime
    print "dict created, elapsed time = " + str(round(elapsedTime, 1)) + " seconds"
    return d

def makeLYR(fc, inList, outLyrName): # BEWARE OF HARDCODED PRIMARY KEY BELOW
    '''Given a list, return a LYR file'''
    
    startTime = time.clock()
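    # build the IN list by hand: str(tuple(x)) leaves a trailing comma
    # for a single key, e.g. "(5,)", which is not valid SQL
    wc = "(" + ", ".join([repr(k) for k in inList]) + ")"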
    print outLyrName,len(inList)
    whereclause = pk+" IN " + wc # <----IF DATA ISN'T FILE GDB, YOU MAY NEED QUOTES/BRACKETS
    # print whereclause
    gp.MakeFeatureLayer_management (fc, outLyrName, whereclause)
    print outLyrName,"ORIG layer count",gp.GetCount(outLyrName).GetOutput(0)
    # remove changes of just RCL_ID
    # "rddel" is only created when there are deletes, so check it exists first
    if outLyrName == 'rdnew' and gp.Exists("rddel"):
        print "RDDEL layer count",gp.GetCount("rddel").GetOutput(0) 
        gp.SelectLayerByLocation_management(outLyrName, "WITHIN", "rddel","","REMOVE_FROM_SELECTION")
        print "Made NEW layer",outLyrName,round(time.clock() - startTime)," seconds"
        print outLyrName,"NEW layer count",gp.GetCount(outLyrName).GetOutput(0)
    gp.RefreshCatalog(delta_gdb)
    print outLyrName,"layer count",gp.GetCount(outLyrName).GetOutput(0)
    layerfn = os.path.dirname(delta_gdb)+"/"+outLyrName +".lyr"
    if gp.Exists(layerfn) :
        gp.Delete(layerfn)
    
    ##gp.SaveToLayerFile_management(outLyrName, layerfn)
    # print "saved layer",round(time.clock() - startTime)
    deltaFC = delta_gdb+"/"+outLyrName
    # Warning, featureclass MUST be indexed on PK!
    if not gp.Exists(fc) :
        print fc,"not found"
    ## gp.AddIndex_management(fc,pk,fc+"_"+pk+"_idx")
    if gp.Exists(deltaFC):
        gp.Delete(deltaFC)
    gp.CopyFeatures_management(outLyrName,deltaFC)
    stopTime = time.clock()
    elapsedTime = stopTime - startTime
    print outLyrName,"elapsed time = " + str(round(elapsedTime, 1)) + " seconds"
#----------------------------------------------------------------------

print "-----------  delta road  ----------------"

import arcgisscripting, time,os,sys

gp = arcgisscripting.create(9.3)
gp.overwriteoutput = True

try :
    last_gdb = sys.argv[1]
    current_gdb = sys.argv[2]
    delta_gdb = sys.argv[3]
except IndexError:  # fewer than three arguments given, fall back to defaults
    last_gdb    = "E:/road_jan2009.gdb"
    current_gdb = "e:/crs/mobile/mobile.gdb"
    delta_gdb   = "E:/crs/road_roadname.gdb"

if not os.path.exists(delta_gdb) :
    gp.CreateFileGDB_management(os.path.dirname(delta_gdb), os.path.basename(delta_gdb))
gp.Workspace = delta_gdb
gp.OverwriteOutput = 1
print

for laydef in [["road","RCL_ID"]] :
    layer = laydef[0]
    pk = laydef[1]
    startTime = time.clock()
    print "Start"
    old = last_gdb+"/"+layer
    new = current_gdb+"/"+layer
    ready = True
    if not gp.Exists(old) :
        print old,'not found'
        ready = False
    if not gp.Exists(new) :
        print new,'not found'
        ready = False
    if ready :   
        buildDeltaLayers(old,new) 
        msg = "Your del/change/add fc's are in:"
        gp.AddMessage(msg)
        print str(gp.workspace)
        gp.AddMessage(str(gp.workspace))
        stopTime = time.clock()
        elapsedTime = stopTime - startTime
        print layer,"total elapsed time = " + str(round(elapsedTime, 1)) + " seconds"
    else :
        print "not ready"
# del gp

print "-------------------------------------------------------------"
gp.AddMessage("finished")
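The script above records only which keys changed. Since the original question asked for the mismatches themselves, here is a hedged sketch of a field-level report in plain Python. It assumes each dictionary value is itself a dict of field name to value, which differs from the list used in buildDict; `diff_fields` and `mismatch_report` are hypothetical names.

```python
def diff_fields(old_rec, new_rec):
    """Return {field: (old_value, new_value)} for the fields that differ."""
    fields = sorted(set(old_rec) | set(new_rec))
    return dict(
        (f, (old_rec.get(f), new_rec.get(f)))
        for f in fields
        if old_rec.get(f) != new_rec.get(f)
    )


def mismatch_report(old, new):
    """Return {key: field differences} for keys present in both datasets."""
    report = {}
    for key in sorted(set(old) & set(new)):
        delta = diff_fields(old[key], new[key])
        if delta:
            report[key] = delta
    return report


old = {101: {"name": "MAIN ST", "length": 250},
       103: {"name": "MILL RD", "length": 90}}
new = {101: {"name": "MAIN ST", "length": 250},
       103: {"name": "MILL ROAD", "length": 95}}

for key, delta in sorted(mismatch_report(old, new).items()):
    print(key, delta)
```

Here only key 103 is reported, with the old and new values of both `name` and `length`.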