Select to view content in your preferred language

Delete duplicate issue

884
3
Jump to solution
02-10-2020 09:24 AM
2Quiker
Frequent Contributor

I am trying to delete duplicates form a table but the issues is the part where I try to delete the duplicates.

The script merges table together then delete the duplicates. After the script runs i noticed that there were still duplicates so i ran just the delete duplicates part of the script in a stand alone script and it removes totally different duplicates so i am confused as to why the delete duplicates part removes duplicates differently. Any ideas?

If i run the following

import arcpy
from datetime import datetime as d
import logging, os
startTime = d.now()
start_time = time.time()

arcpy.env.overwriteOutput = True
   
arcpy.env.workspace = r"C:\\Users\\***\\AppData\\Roaming\\Esri\\ArcGISPro\\Favorites\\***.sde"

# Get a list of all the tables.
tableListA = arcpy.ListTables('DSD.DBO.SubdivisionList')

# Loop through the list and run truncate
for tableA in tableListA:
    arcpy.TruncateTable_management(tableA)

#SDE tables and feature classes
tableA = r'C:\\Users\\SubdivisionList' #table that gets updated from ORN and Centerline
tableB = r'C:\Users\\Subdivision_boundaries'
tableC = r'C:\Users\PreliminarySubs'
tableD = r"C:\Users\Official_Subdivision_Names"

####Compare and inject missing street names from ORN table
tableAList = []
with arcpy.da.SearchCursor(tableA, ['PLAT_NAME']) as cursor:
    for row in cursor:
        tableAList.append(row[0])
del cursor

#list of fields whose will need to be imported into tableA
fields = ['PLAT_NAME']

#list of values from tableA to inject into tableC
ids = []

with arcpy.da.SearchCursor(tableC, ['SUBNAME']) as cursor:
    for row in cursor:
            if row[0] not in tableAList:
                ids.append(row[0:])
del cursor

#create insert cursor variable
insertCursor = arcpy.da.InsertCursor(tableA,fields)

#loop through items in ids list and insert into table
for i in ids:
    insertCursor.insertRow(i)
del insertCursor
del ids
###Compare and inject missing street names and Status field from centerline

tableAList = []
with arcpy.da.SearchCursor(tableA, ['PLAT_NAME', 'Year']) as cursor:
    for row in cursor:
        tableAList.append(row)
del cursor


#list of fields whose will need to be imported into tableA
fields = ['PLAT_NAME', 'Year']

#list of values from tableB to inject into tableA
ids = []

with arcpy.da.SearchCursor(tableB, ['PLAT_NAME', 'Year']) as cursor:
    for row in cursor:
            if row[0] not in tableAList:
                ids.append(row)
del cursor

#create insert cursor variable
insertCursor = arcpy.da.InsertCursor(tableA,fields)

#loop through items in ids list and insert into table
for i in ids:
    insertCursor.insertRow(i)
del insertCursor
del ids
print ("Finihsed Comparing and inject missing street names from Preliminary Plats")
print ('(Elapsed time: ' + str(d.now() - startTime)[:-3] + ')')

###### Compare replicated Subnames From SDE
######

table1List = []
with arcpy.da.SearchCursor(tableA, ['PLAT_NAME', 'Year']) as cursor:
    for row in cursor:
        table1List.append(row[0])
del cursor

#list of fields whose will need to be imported into table1
fields = ['PLAT_NAME', 'Year']

#list of values from table1 to inject into table3
ids = []

with arcpy.da.SearchCursor(tableD, ['PLAT_NAME', 'Year']) as cursor:
    for row in cursor:
            if row[0] not in table1List:
                ids.append(row[0:])
del cursor


#create insert cursor variable
insertCursor = arcpy.da.InsertCursor(tableA,fields)

#loop through items in ids list and insert into table
for i in ids:
    insertCursor.insertRow(i)
del insertCursor
del ids
print ("Finihsed Comparing and inject missing Subs names from replicated Subnames From SDE")
print ('(Elapsed time: ' + str(d.now() - startTime)[:-3] + ')')

#Deletes duplicate records after merging centerline and ONR
fc = r'C:\\Users\\SubdivisionList'  
fields1 = ['OBJECTID', 'PLAT_NAME']  
  
table_rows = []  
with arcpy.da.UpdateCursor(fc, fields1) as cursor1:  
    for row in cursor1:      
        if row[1:] in table_rows:  
            print ("Deleting record: OBJECTID = {}".format(row[0])) 
            cursor1.deleteRow()  
        else:  
            table_rows.append(row[1:])  
del table_rows
del cursor1
‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

It removes only these

Deleting record: OBJECTID = 1817
Deleting record: OBJECTID = 1883
Deleting record: OBJECTID = 1938
Deleting record: OBJECTID = 1939
Deleting record: OBJECTID = 2150
Deleting record: OBJECTID = 2151
Deleting record: OBJECTID = 2152
Deleting record: OBJECTID = 2153
Deleting record: OBJECTID = 2154
Deleting record: OBJECTID = 2155
Deleting record: OBJECTID = 2156
Deleting record: OBJECTID = 2157
Deleting record: OBJECTID = 2158
Deleting record: OBJECTID = 2159
Deleting record: OBJECTID = 2160
Deleting record: OBJECTID = 2161
Deleting record: OBJECTID = 2162
Deleting record: OBJECTID = 2163
Deleting record: OBJECTID = 2164
Deleting record: OBJECTID = 2165
Deleting record: OBJECTID = 2166
Deleting record: OBJECTID = 2167

If I just run the deleterows after the above script it deletes more. The delete row in the above script should delete all the duplicates..?

import arcpy  
fc = r'C:\\Users\\SubdivisionList'  
fields = ['OBJECTID', 'PLAT_NAME']  
  
table_rows = []  
with arcpy.da.UpdateCursor(fc, fields) as cursor:  
    for row in cursor:      
        if row[1:] in table_rows:  
            print ("Deleting record: OBJECTID = {}".format(row[0])) 
            cursor.deleteRow()  
        else:  
            table_rows.append(row[1:])  
del table_rows  ‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

It deletes

Deleting record: OBJECTID = 2
Deleting record: OBJECTID = 11
Deleting record: OBJECTID = 311
Deleting record: OBJECTID = 419
Deleting record: OBJECTID = 545
Deleting record: OBJECTID = 640
Deleting record: OBJECTID = 945
Deleting record: OBJECTID = 976
Deleting record: OBJECTID = 1003
Deleting record: OBJECTID = 1076
Deleting record: OBJECTID = 1103
Deleting record: OBJECTID = 1286
Deleting record: OBJECTID = 1518
Deleting record: OBJECTID = 1705
Deleting record: OBJECTID = 1884
Deleting record: OBJECTID = 1951
Deleting record: OBJECTID = 2069
Deleting record: OBJECTID = 2071

0 Kudos
1 Solution

Accepted Solutions
DanPatterson_Retired
MVP Emeritus
3 Replies
DanPatterson_Retired
MVP Emeritus

Why don't you use the Delete Identical interface it handles things much more efficiently

Find Identical—Data Management toolbox | Documentation 

Delete Identical—Data Management toolbox | Documentation 

2Quiker
Frequent Contributor

After trying the delete Identical like you suggested i was still getting duplicates. I found that it was due to fact that i had the delete duplicates before the arcpy.CalculateField_management to making the field all upper case, once i put the delete duplicate block after arcpy.CalculateField_management it was fine. Some of the tables had lower case while others had lower case. I didn't think that would make a difference if they were lower vs upper case.

0 Kudos
DanPatterson_Retired
MVP Emeritus

cAsE MaTtErs

0 Kudos