How to find duplicate records with ArcPy

1424
2
11-02-2017 10:00 AM
JoseSanchez
Occasional Contributor III

Hello everyone,

The goal of this process is to dissolve polygons with the same ID (ContentID)

To do that  I created a SQL query and run it  pointing to the database, not using SDE. This solution  works but I would like to replace it with a code that uses only SDE GIS access. I think it needs  "Select By Attributes"

Let me know how can I replace this code

       Dim mySQLQuery As String = " SELECT CONTENTID, COUNT(CONTENTID) AS Counter " _
                                  & "FROM (TICKET) " _
                                  & "GROUP BY CONTENTID " _
                                  & "HAVING(COUNT(CONTENTID) > 1) " _
                                  & "ORDER BY Counter DESC "

        Dim myReaderAs SqlDataReader

        ' Initialize the Geoprocessor
        '
        Dim GP As ESRI.ArcGIS.Geoprocessor.Geoprocessor = New ESRI.ArcGIS.Geoprocessor.Geoprocessor()

        ' Create the GPUtilites object
        Dim gputilities As IGPUtilities = New GPUtilitiesClass()

        '
        Dim Err As CreateLogFiles = New CreateLogFiles()


        '
        Try
        
            Dim pFeatureClass As String = pathOutput
            Dim pFeatureDissolve As String = pathDissolve

            Dim pDeleteTool As Delete
            Dim pMakeFeatureLayer As MakeFeatureLayer
            Dim pDissolve As Dissolve

            Dim pDeleteFeatures As DeleteFeatures
            Dim dt As Object = ""
            Dim bExists As Boolean

            Dim myCONTENTID As String

            Dim command As New SqlCommand(mySQLQuery, mySQLConn)
            mySQLConn.Open()

            myReader= command.ExecuteReader()


            ' Always call Read before accessing data.

            ' Set the OverwriteOutput setting to True
            GP.OverwriteOutput = True

            If myReader.HasRows Then    ------------------------------ LOOP

                While myReader.Read()

                    '
                    ' searchs for records with ContentId duplicated in Ticket
                    ' deletes if exists existing feature class TicketDissolve
                    ' dissolves it in TicketDissolve
                    ' removes features from source Ticket
                    ' copies dissolved record into Ticket
                    '

                    '/////////////////////////////////////////////////////////////////////////////////////////////////////////
                    ' STEP 1: Delete Existing feature class TicketDissolve
                    '/////////////////////////////////////////////////////////////////////////////////////////////////////////

                    bExists = GP.Exists(pFeatureDissolve, dt)

                    If bExists Then
                        pDeleteTool = New Delete()

                        pDeleteTool.in_data = pFeatureDissolve
                        pDeleteTool.data_type = "FeatureClass"


                        RunTool(GP, pDeleteTool, Nothing)
                        Console.WriteLine("Delete TicketDissolve ")

                    End If

                    '/////////////////////////////////////////////////////////////////////////////////////////////////////////
                    ' STEP 2: Make feature layer using the MakeFeatureLayer tool for the current ContentID .
                    '/////////////////////////////////////////////////////////////////////////////////////////////////////////


                    pMakeFeatureLayer = New MakeFeatureLayer()

                    ' Process: Make Feature Layer...

                    pMakeFeatureLayer.in_features = pFeatureClass
                    pMakeFeatureLayer.out_layer = "Ticket_Layer"
                    myCONTENTID = myReader.GetString(0)
                    pMakeFeatureLayer.where_clause = "CONTENTID ='" & myCONTENTID & "'"
                    pMakeFeatureLayer.field_info = "TICKETNO TICKETNO ...;SHAPE_Length SHAPE_Length VISIBLE NONE;SHAPE_Area SHAPE_Area VISIBLE NONE"

                    RunTool(GP, pMakeFeatureLayer, Nothing)
                    Console.WriteLine("Content ID: " & myReader.GetString(0))
                    Debug.WriteLine(myReader.GetString(0))

                    Console.WriteLine("Records Count: " & myReader.GetInt32(1))


                    Dim pGetCount As GetCount = New GetCount()

                    pGetCount.in_rows = "Ticket_Layer"
                    RunTool(GP, pGetCount, Nothing)

                    Debug.WriteLine(pGetCount.row_count)


                    '/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
                    ' STEP 3: Dissolve features .
                    '/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////


                    pDissolve = New Dissolve()

                    pDissolve.in_features = "Ticket_Layer"
                    pDissolve.out_feature_class = pFeatureDissolve
                    ' pDissolve.dissolve_field = ""
                    pDissolve.dissolve_field = "TICKETNO;VERNO;.....;CONTENTID"
                    pDissolve.multi_part = "MULTI_PART"

                    pDissolve.unsplit_lines = "DISSOLVE_LINES"


                    ' Process: Dissolve...

                    RunTool(GP, pDissolve, Nothing)
                    Console.WriteLine("Dissolve ")


                 

                    pAppend.inputs = pFeatureDissolve
                    pAppend.target = pFeatureClass
                    pAppend.schema_type = "TEST"


                    .......................


                End While

            End If

0 Kudos
2 Replies
MitchHolley1
MVP Regular Contributor

You should be able to find unique values with the arcpy.da module.  

import arcpy

inFeatures = r'path_to_feature_class'

outPath = r'path_to_output_database'

unique = set(row[0] for row in arcpy.da.SearchCursor(feautre, ['ID'])) #all unique values

arcpy.MakeFeautreLayer_management(inFeatures, 'lyr') #make layer to query

#loop over unique values, and select then dissolve
for u in unique:
    arcpy.SelectLayerByAttributes_management('lyr', 'NEW_SELECTION', "ID = {0}".format(u))
    outFC = os.path.join(outPath, u)
    arcpy.Dissolve_management('lyr', outFC)‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

 

EDIT:  I'm sorry, I didn't originally understand the post.  The updated code above may get you started. 

JoseSanchez
Occasional Contributor III

I dissolving all the features with the same ContentID, but first I need to find the ones that repeated to dissolve them.

0 Kudos