I have a feature class where there are multiple values duplicated in a row, i.e. SRNumber and RESOLUTION_CODE. I would like to delete all rows where my SRNumber and RESOLUTION_CODE are found in multiple rows. I have got a good start with this script, everything works aside from actually deleting the rows. How can I fix this so that my duplicate rows are deleted?
So if SRNumber = 1-23456
and Resolution_Code = 'A'
and appears twice, I would like to delete one.
I have taken this example for GIS Stackexchange, but the del rows does not work correctly.
import arcpy ... ... in_table = r"ServiceRequest.DBO.SO_Dead_Animal" ... fields = ["NumberCYLA","RESOLUTION_CODE"] #replace with own ... curStrings = arcpy.da.SearchCursor(in_table,fields,"NumberCYLA is not null") ... ... stringsKeysList = [] ... numbersValuesList = [] ... tableDict = {} ... ... for row in curStrings: ... print row[0],row[1] ... stringsKeysList.append(row[0]) ... del curStrings ... uniqueStringsList = list(set(stringsKeysList)) #only unique values from Strings field ... ... for uniqueString in uniqueStringsList: ... emptyTempValues = [] ... curValues = arcpy.da.SearchCursor(in_table,fields,"""NumberCYLA = '{0}'""".format(str(uniqueString))) ... for row in curValues: ... print row[1] ... emptyTempValues.append(row[1]) ... uniqueEmptyTempValues = list(set(emptyTempValues)) ... tableDict[uniqueString] = uniqueEmptyTempValues ... ... print tableDict ... ... rowsToDelete = [k for k, v in tableDict.iteritems() if v == [-1]] #find which rows to delete ... print "list: ", rowsToDelete ... stringRows = "," ... updRows = stringRows.join(rowsToDelete) ... updRows = updRows.replace(",","','") ... print updRows #update the string to be used with proper syntax in where clause ... ... where_clause = """NumberCYLA in ('{0}')""".format(updRows) ... upd_cur = arcpy.da.UpdateCursor(in_table,fields,where_clause) ... for row in upd_cur: ... print row[0],row[1] ... del rowsToDelete
I think you want to call deleteRow() rather than del rowsToDelete, which just deletes the Python variable called "rowsToDelete".
... upd_cur = arcpy.da.UpdateCursor(in_table,fields,where_clause) ... for row in upd_cur: ... upd_cur.deleteRow() # Beware: deletes the current row
You can also just run the Delete Identical tool under Data Management, General. A whole lot easier.
Just FYI, Delete Identical requires advanced licensing, while this script requires none.
Just to clarify the problem you are trying to solve...if this is your attribute table:
OBJECTID* | NumberCYLA | SRNumber | Resolution_Code |
---|---|---|---|
1 | 11 | 1-23456 | A |
2 | 22 | 1-9987 | B |
3 | 33 | 1-23456 | A |
4 | 44 | 1-5555 | C |
...you would want the 3rd row deleted because it is a duplicate of row 1 (SRNumber and Resolution_Code match), correct? The code sample you grabbed is not particularly readable (to me) so I just wanted to make sure.
Also, is the field NumberCYLA a unique identifier for your feature class?
@Adam Marinelli, that is correct, it should be mentioned that NUMBERCYLA and SRNumber are identical. But your example is correct, if I have one Service Request number repeated at different objectIDs, I would like to remove one.
If you have the proper licensing then definitely use Delete Identical as Justin Hollenbach suggested....there is a tool for everything! Otherwise, here is a much simpler python script which should do what you need. I haven't tested it with a very large feature class so performance could be an issue depending on the # of records in your input datasets.
The performance of the script can be improved by cutting out the search cursor loop. All of the functional code works within a single update cursor:
import arcpy fc = #full path to feature class fields = ['OBJECTID', 'SRNumber', 'Resolution_Code'] table_rows = [] with arcpy.da.UpdateCursor(fc, fields) as cursor: for row in cursor: if row[1:] in table_rows: print "Deleting record: OBJECTID = {}".format(row[0]) cursor.deleteRow() else: table_rows.append(row[1:]) del table_rows