I have a feature class where there are multiple values duplicated in a row, i.e. SRNumber and RESOLUTION_CODE. I would like to delete all rows where my SRNumber and RESOLUTION_CODE are found in multiple rows. I have got a good start with this script, everything works aside from actually deleting the rows. How can I fix this so that my duplicate rows are deleted?
So if SRNumber = 1-23456 and Resolution_Code = 'A' and appears twice, I would like to delete one.
I have taken this example for GIS Stackexchange, but the del rows does not work correctly.
import arcpy
...
... in_table = r"ServiceRequest.DBO.SO_Dead_Animal"
... fields = ["NumberCYLA","RESOLUTION_CODE"] #replace with own
... curStrings = arcpy.da.SearchCursor(in_table,fields,"NumberCYLA is not null")
...
... stringsKeysList = []
... numbersValuesList = []
... tableDict = {}
...
... for row in curStrings:
... print row[0],row[1]
... stringsKeysList.append(row[0])
... del curStrings
... uniqueStringsList = list(set(stringsKeysList)) #only unique values from Strings field
...
... for uniqueString in uniqueStringsList:
... emptyTempValues = []
... curValues = arcpy.da.SearchCursor(in_table,fields,"""NumberCYLA = '{0}'""".format(str(uniqueString)))
... for row in curValues:
... print row[1]
... emptyTempValues.append(row[1])
... uniqueEmptyTempValues = list(set(emptyTempValues))
... tableDict[uniqueString] = uniqueEmptyTempValues
...
... print tableDict
...
... rowsToDelete = [k for k, v in tableDict.iteritems() if v == [-1]] #find which rows to delete
... print "list: ", rowsToDelete
... stringRows = ","
... updRows = stringRows.join(rowsToDelete)
... updRows = updRows.replace(",","','")
... print updRows #update the string to be used with proper syntax in where clause
...
... where_clause = """NumberCYLA in ('{0}')""".format(updRows)
... upd_cur = arcpy.da.UpdateCursor(in_table,fields,where_clause)
... for row in upd_cur:
... print row[0],row[1]
... del rowsToDelete
I think you want to call deleteRow() rather than del rowsToDelete, which just deletes the Python variable called "rowsToDelete".
... upd_cur = arcpy.da.UpdateCursor(in_table,fields,where_clause) ... for row in upd_cur: ... upd_cur.deleteRow() # Beware: deletes the current row
You can also just run the Delete Identical tool under Data Management, General. A whole lot easier.
Just FYI, Delete Identical requires advanced licensing, while this script requires none.
Just to clarify the problem you are trying to solve...if this is your attribute table:
| OBJECTID* | NumberCYLA | SRNumber | Resolution_Code |
|---|---|---|---|
| 1 | 11 | 1-23456 | A |
| 2 | 22 | 1-9987 | B |
| 3 | 33 | 1-23456 | A |
| 4 | 44 | 1-5555 | C |
...you would want the 3rd row deleted because it is a duplicate of row 1 (SRNumber and Resolution_Code match), correct? The code sample you grabbed is not particularly readable (to me) so I just wanted to make sure.
Also, is the field NumberCYLA a unique identifier for your feature class?
@Adam Marinelli, that is correct, it should be mentioned that NUMBERCYLA and SRNumber are identical. But your example is correct, if I have one Service Request number repeated at different objectIDs, I would like to remove one.
If you have the proper licensing then definitely use Delete Identical as Justin Hollenbach suggested....there is a tool for everything! Otherwise, here is a much simpler python script which should do what you need. I haven't tested it with a very large feature class so performance could be an issue depending on the # of records in your input datasets.
The performance of the script can be improved by cutting out the search cursor loop. All of the functional code works within a single update cursor:
import arcpy
fc = #full path to feature class
fields = ['OBJECTID', 'SRNumber', 'Resolution_Code']
table_rows = []
with arcpy.da.UpdateCursor(fc, fields) as cursor:
for row in cursor:
if row[1:] in table_rows:
print "Deleting record: OBJECTID = {}".format(row[0])
cursor.deleteRow()
else:
table_rows.append(row[1:])
del table_rows