I have a feature class in which I need to compare features against each other: if two features have the same values in the two fields I am checking, I want to delete the duplicate. I have some code that works for a table, but I know that with spatial data I may get inconsistent results. I wrote the following, but I am not sure I am doing it correctly, even though I don't get any errors.
import arcpy

fc = r'C:\Temp\test.shp'
fields = ['OID@', 'PN', 'ACT', 'ACRES']
# if need just two field use ['OID@','PN', 'ACT']
sort_field, sort_order = "OBJECTID", "ASC"
# Sort on the attribute fields only; the 'OID@' token is not a valid name inside SQL.
# Note: ORDER BY is only supported for database sources (e.g. geodatabases), not shapefiles.
sql_orderby = "ORDER BY {}, {} {}".format(", ".join(fields[1:]), sort_field, sort_order)
table_rows = []
with arcpy.da.UpdateCursor(fc, fields, sql_clause=(None, sql_orderby)) as cursor:
    for row in cursor:
        # Compare everything except the OID, which is always unique
        if row[1:] in table_rows:
            cursor.deleteRow()
        else:
            table_rows.append(row[1:])
del table_rows
I would also like to know whether I can use the same code to write the text "Dup" into a field named "Duplicate" for the duplicates, so that I can check on a copy of the feature class that the correct features would be deleted.
Have you ruled out Find Identical (Data Management) to see if it works, followed by Delete Identical (Data Management) if it does? Both are documented in the ArcGIS Pro tool reference. You can use geometry and/or attribute fields to define what constitutes a duplicate.
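As a rough sketch of that suggestion, the two tools can be chained from Python. The helper name `find_then_delete_identical` and the `report_table` argument are made up for illustration; `arcpy` is imported inside the function so the snippet loads on a machine without ArcGIS. Check the tool documentation for licensing and parameter details before relying on it.

```python
def find_then_delete_identical(fc, fields, report_table, delete=False):
    """Report rows that share the same values in `fields`; optionally delete them.

    fc           -- path to the feature class (work on a copy first)
    fields       -- attribute fields that define a duplicate, e.g. ['PN', 'ACT']
    report_table -- hypothetical output path for the Find Identical report
    """
    import arcpy  # imported lazily; requires an ArcGIS installation

    # Write a table that lists only the records taking part in a duplicate group
    arcpy.management.FindIdentical(
        fc, report_table, fields, output_record_option="ONLY_DUPLICATES"
    )
    if delete:
        # Removes all but one record from each group of identical records
        arcpy.management.DeleteIdentical(fc, fields)
    return report_table
```

Adding the shape field to `fields` would make geometry part of the comparison as well.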
I have looked into it, but this snippet will be part of a larger script, and it would be nice not to have to create new files and then join them back.
Here are some options that may help. You may need to compare each row's values against the full list of values to see whether there are truly any duplicates.
import arcpy
from collections import Counter

fc = r'C:\Temp\test.shp'

# Option A: count each combination of values, then delete the surplus rows
fields = ['PN', 'ACT', 'ACRES']
# if need just two field use ['PN', 'ACT']
with arcpy.da.SearchCursor(fc, fields) as search:
    counts = Counter(tuple(row) for row in search)
with arcpy.da.UpdateCursor(fc, fields) as cursor:
    for row in cursor:
        key = tuple(row)
        # Delete until one row per combination is left (the last one encountered is kept)
        if counts[key] > 1:
            cursor.deleteRow()
            counts[key] -= 1

# Option B: work out which OIDs are duplicates first, then delete by OID
fields = ['OID@', 'PN', 'ACT', 'ACRES']
# if need just two field use ['OID@','PN', 'ACT']
with arcpy.da.SearchCursor(fc, fields) as search:
    values = {row[0]: tuple(row[1:]) for row in search}
seen = set()
duplicate_oids = set()
for oid in sorted(values):
    # The lowest OID of each combination is kept; later ones are marked for deletion
    if values[oid] in seen:
        duplicate_oids.add(oid)
    else:
        seen.add(values[oid])
with arcpy.da.UpdateCursor(fc, ['OID@']) as cursor:
    for row in cursor:
        if row[0] in duplicate_oids:
            cursor.deleteRow()
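For the "Duplicate" flag asked about earlier, the same idea can write a value instead of deleting. This is only a sketch: the helper names `duplicate_oids` and `flag_duplicates` are invented here, it assumes a text field named "Duplicate" already exists on the copy of the feature class, and it assumes `fields` starts with 'OID@'. `arcpy` is imported inside the function so the detection logic can be tried on plain tuples without ArcGIS.

```python
def duplicate_oids(rows):
    """Return the OIDs of rows whose attribute values repeat an earlier row.

    `rows` is an iterable of (oid, value1, value2, ...) sequences.
    """
    seen = set()
    dups = set()
    for row in rows:
        key = tuple(row[1:])  # everything except the OID
        if key in seen:
            dups.add(row[0])
        else:
            seen.add(key)
    return dups


def flag_duplicates(fc, fields, flag_field="Duplicate", flag_value="Dup"):
    """Write flag_value into flag_field for every repeated row (needs ArcGIS)."""
    import arcpy  # imported lazily; requires an ArcGIS installation

    with arcpy.da.SearchCursor(fc, fields) as search:
        dups = duplicate_oids(search)
    with arcpy.da.UpdateCursor(fc, ["OID@", flag_field]) as cursor:
        for oid, _ in cursor:
            if oid in dups:
                cursor.updateRow([oid, flag_value])
    return len(dups)


# Plain tuples standing in for cursor rows: (OID, PN, ACT, ACRES)
sample = [(0, "A1", "X", 1.5), (1, "B2", "Y", 2.0), (2, "A1", "X", 1.5)]
print(sorted(duplicate_oids(sample)))  # [2]
```

Once the flagged rows look right, a call such as `flag_duplicates(fc, ['OID@', 'PN', 'ACT', 'ACRES'])` could be swapped for the delete version.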
I didn't see anything wrong with your code. It could also be that the values are truly unique, which is why it is difficult to see if there are any true duplicates.
Another thing you can do is build a list of all the values and then convert that list to a set. If the set has the same length as the list, every value is unique; if the set is shorter, there are duplicates. Just another suggestion.
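A minimal sketch of that check, using plain tuples in place of cursor rows. The helper names `has_duplicates` and `repeated_values` are made up for illustration; in practice the rows would come from an `arcpy.da.SearchCursor` over the attribute fields.

```python
from collections import Counter


def has_duplicates(rows):
    """True if any combination of values occurs more than once."""
    rows = [tuple(r) for r in rows]
    return len(set(rows)) != len(rows)


def repeated_values(rows):
    """Map each repeated combination of values to how often it occurs."""
    counts = Counter(tuple(r) for r in rows)
    return {key: n for key, n in counts.items() if n > 1}


# (PN, ACT, ACRES) tuples standing in for cursor rows
rows = [("A1", "X", 1.5), ("B2", "Y", 2.0), ("A1", "X", 1.5)]
print(has_duplicates(rows))   # True
print(repeated_values(rows))  # {('A1', 'X', 1.5): 2}
```

The second helper also shows which values repeat, which helps when deciding whether the deletes are hitting the right features.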