Mark duplicate features in attribute table with Field Calculator, update to Arc10

09-05-2011 06:39 PM
ChrisWoodward1
New Contributor III
I've got a handy piece of VB script that I routinely ran on Arc9.3 to mark duplicate features in an attribute table based on unique IDs in the table. I typically use it on repeat points that appear in a dataset more than once (based on ID value, date/time and position). The problem is that the script won't work in Arc10, and I wasn't the original author, so I'm not sure how to update it.

I'm running Arc10 SP2 on Windows 7 and/or XP machines. My typical workflow is to concatenate a unique ID, 'Unique_ID' (text, 255), from several attributes (say ID, date/time, lat/long). I then create an attribute called 'dup' of short integer type. I run the Field Calculator on the dup field and load the following script, which marks duplicate values of Unique_ID with a 1. Only the second and later records with a given Unique_ID are marked, not the first.
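
The concatenation step described above could be sketched in Python roughly as follows. The helper name make_unique_id, its parameters and the "|" separator are all hypothetical and not part of the original workflow; the separator is only there so that adjacent values cannot run together ambiguously.

```python
def make_unique_id(point_id, timestamp, lat, lon, sep="|"):
    """Join several attribute values into one text key (hypothetical helper)."""
    parts = [point_id, timestamp, lat, lon]
    # str() turns numbers into text so everything can be joined
    return sep.join(str(p) for p in parts)

# Made-up sample values, not taken from any real dataset
print(make_unique_id("A17", "2011-09-05 18:39", 44.5, -63.25))
```

In the Field Calculator the same function would go in the code block, and the expression would pass the relevant fields in between exclamation marks.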

Within the field calculator, I'd have the "Show Codeblock" checked and the following in the "Pre-Logic Script Code" area:

Static d As Object
Static i As Long
Dim iDup As Integer
Dim sField
'========================
'Replace the field name below
sField = [Chris_ID]
'========================
If (i = 0) Then
  Set d = CreateObject("Scripting.Dictionary")
End If
If (d.Exists(CStr(sField))) Then
  iDup = 1
Else
  d.Add CStr(sField), 1
  iDup = 0
End If
i = i + 1

The following would also be in the bottom part: Dup=iDup

If someone could help me with updating this VB script for Arc10, it would be greatly appreciated!

Thanks,
Chris
13 Replies
StacyRendall1
Occasional Contributor III
Here is a Python solution. At the moment it returns None for the first value (although you can change this to 0 if you want) and it counts up the duplicates, so the first duplicate is 1, the second is 2 and so on (but you can change that too - just make the line read return 1 instead of return d[inVal])
Pre-Logic Script Code:
d = {} # creates an empty dictionary the first time

def findDuplicates(inVal):
    try:
        d[inVal] += 1 # works only if this value has been encountered before (is a valid key)
        return d[inVal] # returns the number, you could just make this return 1
    except KeyError:
        d[inVal] = 0 # if this is the first time, highlight it
        return None # or 0...

Expression: dup =
findDuplicates(!Unique_ID!)
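
For a plain 0/1 flag like the one the original VB script produced, a minimal sketch might look like this. The names seen and flag_duplicate are made up for illustration, and the sample list at the bottom is only a stand-alone check outside ArcGIS.

```python
seen = set()  # values encountered so far, shared across all rows

def flag_duplicate(value):
    """Return 0 for the first occurrence of value, 1 for every repeat."""
    if value in seen:
        return 1
    seen.add(value)
    return 0

# Stand-alone check with made-up IDs (leave this part out of the code block)
sample = ["a", "b", "a", "c", "a"]
flags = [flag_duplicate(v) for v in sample]
print(flags)
```

In the Field Calculator, only the set and the function would go in the Pre-Logic Script Code, with flag_duplicate(!Unique_ID!) as the expression.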
ChrisWoodward1
New Contributor III
Stacy,

Thanks so much for your help. I've tried running the script in the Field Calculator but get the error message "A field name was not found or there were unbalanced quotation marks". Am I doing something wrong? I copied your script exactly as you have it here, both the Pre-Logic Script Code and the Expression (dup =) portion. Thanks.
MarcNakleh
New Contributor III
Stacy: Just thinking about it, I think a more Pythonic way would be to use a dictionary's built-in setdefault method, which returns the stored value for a key, or stores and returns a default value if the key isn't found (which works well with cwoodward's issue, I think).

Pre-Logic Script Code:
d = {} # creates an empty dictionary the first time

def find_duplicates(in_val):
    d[val] = d.setdefault(val, -1) + 1
    return d[val]

Expression: dup =
find_duplicates(!Unique_ID!)


Though I find yours more legible!

cwoodward: I think you need to replace the expression:
findDuplicates(!Unique_ID!)


with the expression:
findDuplicates(!Chris_ID!)


if the column with the values you are checking for duplicates is called 'Chris_ID'. In the Python Field Calculator, values surrounded by exclamation marks refer to column names (just like brackets [] in VB code.)

Hope this helps!
StacyRendall1
Occasional Contributor III
Marc, that's impressive! I haven't come across setdefault before, but I can see where it might be handy.

Just noticed that on the function definition line you had in_val, but the rest of the function refers to val. cwoodward, this should work (see attached picture for where to put things; resultsField is the field the results should go in, numerator is the field where the duplicates are):
Pre-Logic Script Code:
d = {} # creates an empty dictionary the first time

def find_duplicates(val):
    d[val] = d.setdefault(val, -1) + 1
    return d[val]

Expression: dup =
find_duplicates(!Unique_ID!)
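
To see what each row would receive, the same function can be run outside ArcGIS over a short list. This is only a sketch; the sample values are made up.

```python
d = {}  # maps each value to the number of repeats seen so far

def find_duplicates(val):
    # setdefault stores -1 the first time a value appears, so the first
    # occurrence ends up as 0, the first repeat as 1, and so on
    d[val] = d.setdefault(val, -1) + 1
    return d[val]

sample = ["x", "y", "x", "x", "z"]
counts = [find_duplicates(v) for v in sample]
print(counts)
```

Any row with a result greater than 0 is a second or later occurrence of its value.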
MarcNakleh
New Contributor III
Hi Stacy,

Thanks for the correction!
GeorgeNewbury
Occasional Contributor
Thanks for posting that. I modified it to find duplicate points based on same exact X,Y:

Pre-Logic Script Code:
d = {} # creates an empty dictionary the first time

def find_duplicates(valX, valY):
    d[(valX, valY)] = d.setdefault((valX, valY), -1) + 1
    return d[(valX, valY)]

Expression: dup =
find_duplicates( !SHAPE.firstpoint.x!, !SHAPE.firstpoint.y! )


I used the 'Feature Vertices to Points' tool, but it produced duplicate points at the common vertices. I couldn't find an out of the box tool to filter them.
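
Exact X,Y matching can miss points whose coordinates differ only by tiny floating-point noise. One possible variation, sketched here under the assumption that rounding to a fixed number of decimals is acceptable for the data, is to round before using the pair as the key. The names find_near_duplicates and seen_xy and the default of 3 decimals are hypothetical.

```python
seen_xy = {}  # maps a rounded (x, y) pair to the number of repeats so far

def find_near_duplicates(x, y, decimals=3):
    # Round both coordinates so values that differ only by tiny noise
    # map to the same dictionary key
    key = (round(x, decimals), round(y, decimals))
    seen_xy[key] = seen_xy.setdefault(key, -1) + 1
    return seen_xy[key]

# Made-up coordinates for a stand-alone check
pts = [(10.0, 20.0), (10.0000004, 20.0), (10.5, 20.0), (10.0, 20.0)]
result = [find_near_duplicates(x, y) for x, y in pts]
print(result)
```

Rounding snaps points to a grid, so two points that are very close but fall on either side of a rounding boundary would still be treated as different.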
RobBlash
Regular Contributor
Is there a way to modify this script to simply select the features that are duplicated? It would be helpful to locate dups without the need to add a new field.

Thanks,

Rob
MarcinGasior
Regular Contributor
Maybe the Find Identical tool will help?
RobBlash
Regular Contributor
Thanks, I'll take a look. I was hoping to come up with a solution available at all license levels.

Pre-10 we had a button to bring up a form, choose the field, and select duplicates. I'd like to get that functionality back in 10.1 but of course without VBA.
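
One way this might be approached without adding a field is to collect the object IDs of the repeated values and turn them into a selection query. The sketch below is plain Python only: the function names, the OBJECTID field name and the sample rows are assumptions, and reading the rows from a layer and applying the query (for example with a cursor and a select-by-attribute step) is not shown.

```python
def duplicate_oids(rows):
    """rows: iterable of (oid, value) pairs.
    Return the OIDs of the second and later occurrences of each value."""
    seen = set()
    dups = []
    for oid, value in rows:
        if value in seen:
            dups.append(oid)
        else:
            seen.add(value)
    return dups

def build_where_clause(oids, oid_field="OBJECTID"):
    """Build an SQL IN clause for the given OIDs, or None if there are none."""
    if not oids:
        return None  # nothing to select
    return "{0} IN ({1})".format(oid_field, ", ".join(str(o) for o in oids))

# Made-up rows standing in for (OBJECTID, Unique_ID) pairs
rows = [(1, "a"), (2, "b"), (3, "a"), (4, "b"), (5, "c")]
where = build_where_clause(duplicate_oids(rows))
print(where)
```

The resulting text could then be pasted into or passed to whatever attribute selection step is available at the license level in use.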