Find duplicate values in a feature class

06-30-2013 10:44 AM
RichardMoussopo
Occasional Contributor III
Hi,
I want to check whether a specified field contains duplicate values. Is there a way to achieve this? I am using VB.NET.
Any help would be appreciated.
6 Replies
RichardFairhurst
MVP Honored Contributor
Run Summary Statistics on the field, then look for values with a count of 2 or higher. Relate or join the summary table back to the feature class to select the original features, then do whatever you need to do to deal with them. This method is so fast and easy that I have not found a reason to program an alternative.
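
To make the "count of 2 or higher" test concrete, here is a minimal sketch in plain VB.NET that involves no ArcObjects at all. The module name, the `FindDuplicates` function, and the sample values are hypothetical and only stand in for the values of the field being summarized.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module DuplicateCountSketch
    ' Groups the values and keeps only those that occur at least twice,
    ' returning each duplicated value together with its occurrence count.
    Function FindDuplicates(values As IEnumerable(Of String)) As Dictionary(Of String, Integer)
        Return values.GroupBy(Function(v) v) _
                     .Where(Function(g) g.Count() >= 2) _
                     .ToDictionary(Function(g) g.Key, Function(g) g.Count())
    End Function

    Sub Main()
        ' Hypothetical field values; in practice these would come from the table.
        Dim sample As String() = {"A12", "B07", "A12", "C33", "B07", "A12"}
        For Each pair In FindDuplicates(sample)
            Console.WriteLine("{0} occurs {1} times", pair.Key, pair.Value)
        Next
    End Sub
End Module
```

Values that occur only once (here "C33") are dropped, which is the same filtering you would apply to the summary table before joining it back to the features.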
CarstenSchumann
Occasional Contributor
Another alternative that should also work is to run a Dissolve on your feature class. This process aggregates features based on a given field and can add a field to the output feature class that holds the number of aggregated features.
RichardMoussopo
Occasional Contributor III
Thank you for your reply. But how do you achieve that?
Here is my code for it, but unfortunately it returns the total number of records in the field rather than the count for each repeated value.

Try
    'Define the layer selected by the end user
    Dim layer As String = ComboBoxLayerName.SelectedItem
    'Convert the layer into a feature class
    Dim fclass As IFeatureClass = flayer(layer).FeatureClass

    'Define the field to be checked
    Dim fieldToCheck As IField = Nothing
    Dim fields As IFields = fclass.Fields
    Dim t As Integer = 0
    Do While (t < fields.FieldCount)
        If fields.Field(t).Name = ComboBoxFieldName.SelectedItem Then
            fieldToCheck = fields.Field(t)
            Exit Do
        End If
        t = t + 1
    Loop

    'Define the cursor to store data
    Dim pCursor As ICursor
    'Define the data statistics object used to query unique values
    Dim pDataStats As New ESRI.ArcGIS.Geodatabase.DataStatistics
    'Define the data collection
    Dim pEnum As System.Collections.IEnumerator

    Dim duplicateNumber As Integer
    Dim duplicateValue As Integer = pDataStats.UniqueValueCount  '(duplicateValue here returns the total records in the field)

    ListViewResults.Items.Clear()
    pCursor = fclass.Search(Nothing, False)
    pDataStats.Field = fieldToCheck.Name
    pDataStats.Cursor = pCursor
    pEnum = pDataStats.UniqueValues   '(pEnum gets the unique values in the field)

    pEnum.Reset()

    Dim pListItem As New ListViewItem
    Do While pEnum.MoveNext
        duplicateNumber = pDataStats.UniqueValueCount
        ListViewResults.Items.Add(pEnum.Current)
        ListViewResults.Items(0).SubItems.Add(pDataStats.UniqueValueCount.ToString) '(returns the total number of records, not the count for each repeated value)
    Loop

Catch ex As Exception
    MessageBox.Show(ex.Message)
End Try

DuncanHornby
MVP Notable Contributor
Richard,

I would consider abandoning your approach of using the IDataStatistics interface. ESRI introduced a bug with the 10.1 SP1 release, apparently some sort of memory leak, that will cause your application to fail. It is discussed in this thread. I had terrible problems with my application failing until I stopped using it. Hopefully they will fix it in 10.2.

Richard's idea of calling the summary tool is probably the easiest way of doing things.

Duncan
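
If you would rather stay in code without IDataStatistics, one option (a swapped-in technique, not the summary tool approach) is to walk a search cursor yourself and tally the values in a Dictionary. The following is only a sketch: the module and function names are hypothetical, it assumes the ArcObjects assemblies are referenced and a license is already initialized, and it has not been tested against the 10.1 SP1 issue.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Runtime.InteropServices
Imports ESRI.ArcGIS.Geodatabase

Module CursorDuplicateSketch
    ' Returns every value of fieldName that occurs 2 or more times, with its count.
    Function CountDuplicates(fclass As IFeatureClass, fieldName As String) As Dictionary(Of String, Integer)
        Dim fieldIndex As Integer = fclass.FindField(fieldName)
        If fieldIndex < 0 Then Throw New ArgumentException("Field not found: " & fieldName)

        ' Tally every value by reading the field once per feature.
        Dim counts As New Dictionary(Of String, Integer)
        Dim cursor As IFeatureCursor = fclass.Search(Nothing, True)
        Try
            Dim feature As IFeature = cursor.NextFeature()
            Do While feature IsNot Nothing
                Dim raw As Object = feature.Value(fieldIndex)
                ' Nulls are grouped under a placeholder key (an arbitrary choice).
                Dim key As String = If(raw Is Nothing OrElse raw Is DBNull.Value, "<Null>", raw.ToString())
                Dim n As Integer = 0
                counts.TryGetValue(key, n)
                counts(key) = n + 1
                feature = cursor.NextFeature()
            Loop
        Finally
            ' Release the COM cursor so the underlying resources are freed.
            Marshal.ReleaseComObject(cursor)
        End Try

        ' Keep only the values that were seen at least twice.
        Dim duplicates As New Dictionary(Of String, Integer)
        For Each pair In counts
            If pair.Value >= 2 Then duplicates.Add(pair.Key, pair.Value)
        Next
        Return duplicates
    End Function
End Module
```

Each entry of the returned dictionary could then be added to the ListView as one item with the count as its sub-item.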
RichardMoussopo
Occasional Contributor III
Thank you Duncan, you are right about IDataStatistics; it keeps failing after several runs. I can't seem to find the Summary command in order to call it. Do you have any code for it, please?
DuncanHornby
MVP Notable Contributor
Richard,

Richard was talking about calling the existing Summary Statistics tool. You would call it using the IGeoProcessor interface. This page shows an example of the Calculate geoprocessing tool being called. You can even call entire models.

Duncan
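
As a rough illustration of that call, here is a minimal sketch that runs Summary Statistics (tool name `Statistics_analysis`) through IGeoProcessor2. The module name, the `RunSummaryStatistics` wrapper, and its parameters are hypothetical, and the sketch assumes the ArcObjects assemblies are referenced and a license is already initialized.

```vbnet
Imports ESRI.ArcGIS.esriSystem
Imports ESRI.ArcGIS.Geoprocessing

Module SummaryStatisticsSketch
    ' Summarizes inputTable by caseField and writes the result to outputTable.
    ' All three arguments are placeholders supplied by the caller.
    Sub RunSummaryStatistics(inputTable As String, outputTable As String, caseField As String)
        Dim gp As IGeoProcessor2 = New GeoProcessor()
        gp.OverwriteOutput = True

        ' Parameters are passed in the tool's documented order:
        ' input table, output table, statistics fields, case field.
        Dim parameters As IVariantArray = New VarArray()
        parameters.Add(inputTable)
        parameters.Add(outputTable)
        parameters.Add(caseField & " COUNT")
        parameters.Add(caseField)

        gp.Execute("Statistics_analysis", parameters, Nothing)
    End Sub
End Module
```

The output table should contain one row per distinct value of the case field along with a FREQUENCY field, so rows with FREQUENCY of 2 or more are the duplicated values.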