Please Help With Find/Delete Identical

11-05-2014 01:46 PM
JeffSlauenwhite
New Contributor II

Hi all,

Can someone please help?

I have an SDE feature class of utility poles.  Many of these poles (on the order of 10,000) share duplicate geometry (they sit directly on top of one another) but don't share identical attributes.  I want to remove the duplicates, but running Delete Identical won't necessarily preserve the attributes of the record I want to keep.

For example, I have two poles on top of one another: one pole has unique attributes such as ID, Height or class, and the other does not.  How do I tell Arc to delete the pole WITHOUT these unique attributes?  Ideally I'd want to establish a hierarchy that says: if Height is null then delete, or if pole class is null then delete, or vice versa.  I don't want to simply delete the duplicate record without preserving the pole that has the most relevant information across a series of attributes.  I tried looking at it manually, but there would be thousands of records to sort through.

It sounds like I will need a somewhat more complex script to identify and delete the non-unique records.  Time is a factor ...  any resources, code or otherwise would be greatly appreciated.

Thanks

JS

6 Replies
StevenGraf1
Occasional Contributor III

Here's what I would do. First run Add XY Coordinates, then run Find Identical on the XY locations.  Next, join the Find Identical results to the utility pole layer and select all the utility poles that are duplicates.  From that selection, select the records where the field you want to check is null, and delete those features.
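If you'd rather script the selection step, here is a minimal sketch of the same logic in plain Python, not a tested ArcGIS tool. It assumes the pole records have already been read into dictionaries (for example with arcpy.da.SearchCursor); the keys "oid", "x", "y" and "height" are hypothetical names. One addition to the steps above: if every record at a location has a null in the checked field, the sketch leaves that location alone rather than deleting all of its poles.

```python
from collections import defaultdict


def find_null_duplicates(rows, check_field, precision=3):
    """Return the IDs of records that share a location with another record
    and have a null (None) in check_field.

    Locations where no record has a value in check_field are skipped, so a
    pole is never removed entirely.
    """
    # Group records by rounded coordinates so tiny float differences
    # do not hide a duplicate. The rounding precision is an assumption.
    groups = defaultdict(list)
    for row in rows:
        key = (round(row["x"], precision), round(row["y"], precision))
        groups[key].append(row)

    to_delete = []
    for group in groups.values():
        if len(group) < 2:
            continue  # not a duplicate location
        populated = [r for r in group if r[check_field] is not None]
        if not populated:
            continue  # nothing worth keeping stands out; review manually
        to_delete.extend(r["oid"] for r in group if r[check_field] is None)
    return to_delete


# Hypothetical sample data
poles = [
    {"oid": 1, "x": 100.0, "y": 200.0, "height": 40},
    {"oid": 2, "x": 100.0, "y": 200.0, "height": None},
    {"oid": 3, "x": 300.0, "y": 400.0, "height": None},
    {"oid": 4, "x": 500.0, "y": 600.0, "height": None},
    {"oid": 5, "x": 500.0, "y": 600.0, "height": None},
]
print(find_null_duplicates(poles, "height"))
```

The returned IDs could then drive a selection or a delete cursor in ArcGIS; check the list against a copy of the data before deleting anything.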

Hope this helps you.

Steven

JeffSlauenwhite
New Contributor II

Hi Steven, thank you, that does help.  I had originally started with this method. The only issue is that selecting the utility poles will take ages, because there are thousands of duplicates.  That may be the only foreseeable method, unless there's some sort of script with parameters that will work as well.

Thanks again

JoshuaBixby
MVP Esteemed Contributor

Assuming you mean there are thousands of duplicates across the entire dataset and not thousands of duplicates per pole, even a million-record dataset shouldn't take that long if you are identifying duplicates the way described above.  Something else may be hindering performance.  You may want to check out Resetting your ArcGIS application profile.

AsrujitSengupta
Regular Contributor III

You may try sorting the records based on some fields and then use the "Delete Identical" tool.

About sorting records in tables: ArcGIS Help 10.1

Delete Identical (Data Management): ArcGIS Help 10.1

MatthewLeonard
New Contributor III

This is essentially the solution I came up with. Say I want to delete duplicates with identical values in two fields, and keep the record with the lowest value in a third field. First, I run the Sort tool, specifying all three fields as sort fields, to produce a copy of the data whose OBJECTID values follow the sorted order. Then I run Delete Identical on the new sorted data, specifying the first two fields, so that when duplicates are found, the first one encountered is kept, which should be the one with the lowest value in the third field.

This assumes that OBJECTID is what the tool uses to decide which duplicate record to keep. I am having trouble verifying that for sure, but it seems to be what is happening when I run the tool.
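To make the idea concrete, here is a minimal plain-Python sketch of "sort, then keep the first record per group". It only illustrates the logic and does not call the Sort or Delete Identical tools; the field names are hypothetical. I've assumed the two matching fields contain no nulls, and I've made nulls in the ranking field sort last so that a populated record wins.

```python
def sort_then_keep_first(rows, match_fields, rank_field):
    """Sort rows by match_fields, then by rank_field (nulls last), and keep
    only the first row seen for each combination of match_fields."""

    def sort_key(row):
        rank = row[rank_field]
        # (is_null, value): False sorts before True, so nulls go last.
        rank_key = (rank is None, 0 if rank is None else rank)
        return tuple(row[f] for f in match_fields) + (rank_key,)

    kept = []
    seen = set()
    for row in sorted(rows, key=sort_key):
        key = tuple(row[f] for f in match_fields)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


# Hypothetical sample data
records = [
    {"oid": 1, "x": 1, "y": 1, "height": None},
    {"oid": 2, "x": 1, "y": 1, "height": 35},
    {"oid": 3, "x": 1, "y": 1, "height": 30},
    {"oid": 4, "x": 2, "y": 2, "height": None},
]
kept = sort_then_keep_first(records, ["x", "y"], "height")
print([r["oid"] for r in kept])
```

Unlike the tool-based workflow, the order here is explicit in the code, so it doesn't depend on how Delete Identical picks the record to keep.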

MatthiasAbele1
New Contributor

Hi Jeff,

Do I understand your problem correctly?

You want to eliminate duplicates of these poles, but only if attributes and position are duplicates?

My workflow makes use of the software "GISconnector for Excel".

My suggestion would be the following:

1. Create two columns, x and y, if they do not already exist, and calculate the x and y coordinates for your points.

2. Export your SDE feature class to Excel with the GISconnector.

3. Click into your table, go to the "Data" tab, and in the "Data Tools" group start the "Remove Duplicates" function.

4. A menu appears displaying all your columns. Activate the columns that should be considered when removing duplicates and press OK; your duplicates are deleted. To consider identical positions, check the x and y columns.

5. Go to the general settings of the GISconnector and disable the security setting "Do not delete features in ArcGIS which do not exist in Excel". Then go to the "Edit connection" settings, tab "Options", and allow "Delete Features in ArcGIS which do not exist in Excel" for exactly this connection.

6. Press the button "Transfer all data". The GISconnector will now delete all features in ArcGIS which no longer exist in Excel.

That's it.

Best regards,

Matthias

FYI: I belong to the GISconnector team. Version 1.0, available as a trial on our homepage, does not support SDE features. We will release version 1.1 with SDE support very soon. If you would like to try a beta of 1.1 with SDE support, contact me using our website.
