I'm working with a few roads and trails datasets. There are some duplicate geometries that are not identical. See photos below.
I tried Find Identical with an XY Tolerance, but it didn't find most of these. I kept increasing the XY tolerance, but then it started flagging parallel roads, like the roads below.
I even tried planarizing the lines and just got a bunch of segments. I thought about trying to pull them out based on number of intersections, but I'm not even sure how to make that work. Any suggestions would be greatly appreciated!
Use Select By Location with the same file as both the selecting layer and the target layer, if the duplicates are in the same file.
More importantly, what do you want to do with them?
Delete one?
Average the paths?
How many do you have?
How did they end up in the same file?
I tried the select by location process you mention, but it didn't seem to work. Can you clarify that please?
Yes, I want to remove one. I'm uncertain how many there are. I have four datasets, each with at least 100,000 features, and I've found at least a few by hand in each one, but I definitely don't have the time to inspect all of the feature classes by hand. I'm unsure how they all ended up in the same file. They're all from government agencies, so I didn't create the data.
If the data are in a database that has the function available, you can try filtering based on the Hausdorff Distance. It'll depend a lot on where the data are stored, but it's usually ST_HausdorffDistance or similar.
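-- Pair each feature with any other feature whose shape is within the threshold.
-- a.objectid < b.objectid skips self-matches and reports each pair only once.
SELECT a.objectid AS objectid_a,
a.shape AS shape_a,
b.objectid AS objectid_b,
b.shape AS shape_b
FROM your_table a
JOIN your_table b ON a.objectid < b.objectid
AND ST_HausdorffDistance(a.shape, b.shape) < 5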
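Adjust the value at the end (5 here) to control how similar two shapes must be to count as duplicates. It is generally measured in the units of the data's coordinate system, so a smaller value means a stricter match.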
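If it helps to see what that distance is measuring, here is a rough pure-Python sketch of a vertex-based (discrete) Hausdorff distance. It is only an illustration, not what any particular database does internally. The function names and the sample coordinates are made up for the example, and comparing every pair like this would be far too slow for 100,000 features without a spatial index to narrow down the candidates first.

```python
import math
from itertools import combinations


def directed_hausdorff(a, b):
    """Largest distance from any vertex of a to its nearest vertex of b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric discrete Hausdorff distance between two vertex lists."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def near_duplicate_pairs(lines, threshold):
    """Return (id_a, id_b) pairs, id_a < id_b, closer than the threshold.

    lines maps a feature id to a list of (x, y) vertices.
    """
    return [
        (i, j)
        for i, j in combinations(sorted(lines), 2)
        if hausdorff(lines[i], lines[j]) < threshold
    ]


# Hypothetical sample data: two digitizations of one road plus a parallel road.
roads = {
    1: [(0, 0), (50, 0), (100, 0)],
    2: [(0, 1), (50, 2), (100, 1)],     # same road, slightly different vertices
    3: [(0, 20), (50, 20), (100, 20)],  # parallel road about 20 units away
}

pairs = near_duplicate_pairs(roads, threshold=5)
print(pairs)
```

With a threshold of 5, only features 1 and 2 are paired. The parallel road sits about 20 units away along its whole length, so it stays out unless the threshold is raised past that offset. Because this version only looks at vertices, two lines with very different vertex spacing can score farther apart than they really are.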
Related to the Find Identical GP tool, there is a Delete Identical GP tool that deletes records in a feature class when the geometry field is selected as one of the fields to compare. Like Find Identical, this tool uses an XY tolerance, so I suspect it won't catch everything, but it should catch quite a bit.