Deleting "similar" records in attribute table

06-01-2016 07:14 PM
BrianLin2
New Contributor

I have a feature class on which I used the Near Table tool to calculate the distance from each feature to all other features (e.g. Feature 563 to Features 1174 and 1140 in the attached picture). In the process, however, it created "similar" records, where the distance is calculated again for the same two features but in "reverse" (e.g. the highlighted records, such as Feature 192 to 204 and its "reverse", Feature 204 to 192). Is it possible to delete such records automatically, using ArcGIS's geoprocessing tools as far as possible, rather than manually? The attribute table has over 200k records.

DanPatterson_Retired
MVP Emeritus

Let's keep it all in ArcMap, using one field calculation and one tool (there are other methods, which I won't discuss unless you want me to).

The trick is that two points may indeed be closest to one another, with no other point closest to either of them. But the two records obviously don't sit in the same row, so you can't do a simple query. You can, however, find the duplicates using the following method.

Add a text/string field.

Pick your two fields; their values are going to be concatenated together, smaller value first, as in the following example.

Let's pick 2 records, 196 and 235, which are closest to one another:

>>> closest = 235  # nearest neighbour of record 196
>>> ObjectID = 196  # the record being processed
>>> both = "{} {}".format(min([ObjectID,closest]), max([ObjectID,closest]))
>>> both
'196 235'
>>>
>>> # now moving on
>>> closest = 196
>>> ObjectID = 235
>>> both = "{} {}".format(min([ObjectID,closest]), max([ObjectID,closest]))
>>> both
'196 235'
>>>

This logic now enables you to key those two fields together and use the Find Identical tool (see "Find Identical—Help | ArcGIS for Desktop"), praying that you have an Advanced license. If not, you are going to have to move the result of the concatenation out to NumPy using FeatureClassToNumPyArray and remove the duplicates there... I will leave that discussion for later.
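In case it helps in the meantime, here is a rough sketch of what the NumPy route could look like. It is only an illustration under some assumptions: the field names IN_FID, NEAR_FID and NEAR_DIST and the sample values are made up for the example, the small array stands in for whatever arcpy.da.FeatureClassToNumPyArray (or arcpy.da.TableToNumPyArray for a standalone table) would return, and drop_reversed_pairs is a hypothetical helper name, not an arcpy function.

```python
import numpy as np

# Stand-in for the structured array arcpy would return.
# Field names and values are assumptions for this sketch.
near = np.array(
    [(192, 204, 12.5), (204, 192, 12.5), (563, 1174, 40.0),
     (563, 1140, 55.2), (1174, 563, 40.0)],
    dtype=[("IN_FID", "i4"), ("NEAR_FID", "i4"), ("NEAR_DIST", "f8")],
)

def drop_reversed_pairs(arr, a="IN_FID", b="NEAR_FID"):
    """Keep only the first row of every unordered (a, b) pair."""
    # Order each pair as (smaller id, larger id) so A-B and B-A match.
    lo = np.minimum(arr[a], arr[b])
    hi = np.maximum(arr[a], arr[b])
    pairs = np.column_stack((lo, hi))
    # return_index gives the position of the first occurrence of each pair.
    _, first = np.unique(pairs, axis=0, return_index=True)
    # Sort the positions so the surviving rows keep their original order.
    return arr[np.sort(first)]

unique_rows = drop_reversed_pairs(near)
```

The idea is the same as the min/max concatenation: once each pair is put in a fixed order, the "reverse" record becomes an exact duplicate and can be dropped.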

BrianLin2
New Contributor

I have an Advanced license for ArcMap, so I should be able to use the Find Identical tool. However, I have a few questions about the field calculation part. I have never used a scripting language before, Python in particular, so I am somewhat confused by the Python script you have written above. I tried entering the script in the Field Calculator with the Python option selected (in the Parser section), with a few modifications, as seen below:

>>> [Feature_B_Near_FID] = 192  # closest is record 204
>>> [Feature_A_Input_FID] = 204  # closest is record 192
>>> both = "{} {}".format(min([[Feature_A_Input_FID] ,closest]), max([[Feature_A_Input_FID] ,closest]))
>>> both
'192 204'
>>>
>>> # now moving on
>>> closest = 192
>>> Feature_A_Input_FID = 204
>>> both = "{} {}".format(min([[Feature_A_Input_FID] ,closest]), max([[Feature_A_Input_FID] ,closest]))
>>> both
'192 204'

However, this produces ERROR 000539: SyntaxError: invalid syntax (<expression>, line 1), so I am a bit confused.

In addition, the script appears to allow only 2 values to be chosen and concatenated at a time. I would like to create a tool using ModelBuilder that lets the user obtain the near distance results easily, so I am a bit concerned that the script would have to be "repeated" countless times to concatenate all the selected values. Is there a more "automated" process that allows the selection and concatenation of all the values?

DanPatterson_Retired
MVP Emeritus

That was a conceptual example; I guess you didn't realize that it is not a script. Do the following:

With the Python parser selected, enter this in the Field Calculator:

"{} {}".format(min([!A!,!B!]), max([!A!,!B!]) )

where A and B are your two field names, each surrounded by ! marks.
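On the worry about having to repeat it: the Field Calculator evaluates the expression once for every row, so it only has to be entered once. A plain-Python sketch of the same idea, run outside ArcMap, might look like this. The rows are made-up values and pair_key is just an illustrative name, not an arcpy function.

```python
def pair_key(a, b):
    """Same expression as above, with !A! and !B! passed in as arguments."""
    return "{} {}".format(min([a, b]), max([a, b]))

# Hypothetical (input FID, near FID) rows from a near table
rows = [(192, 204), (204, 192), (563, 1174), (1174, 563), (563, 1140)]

# One key per row, as the Field Calculator would write it to the text field
keys = [pair_key(a, b) for a, b in rows]

# Any row whose key has already been seen is a "reverse" duplicate
seen = set()
duplicates = []
for row, key in zip(rows, keys):
    if key in seen:
        duplicates.append(row)
    seen.add(key)
```

Here both (192, 204) and (204, 192) get the key '192 204', which is what lets a duplicate search on the new field flag the second one.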