Using count and group by in makeQueryTable_Management

DaveTaylor · ‎10-29-2013

I know you can't use the count and group by in the where clause of the MakeQueryTable_Management function but is there a way using python to count the rows that are the same once done?

I have the function working so that it exports to a .dbf file but instead of having them grouped by ID's it just has them all listed separately. For instance:

ID
1234
2345
2346
1234
2345

What I want is this:

ID       COUNT
1234       2
2345       2
2346       1

Is there something I can use in python to either group all the same id's together and add a new column to my dbf or is there something with ESRI I can use?

My end result is I want to join this to a shapefile I already have and color code them by counts and have it update on the fly.

Thanks in advance!

DouglasSands · ‎10-29-2013

Something along the lines of the following code should work:

from collections import Counter

def count_duplicates(path_to_data, field_to_count):
    #Add the count field
    data_layer = arcpy.MakeFeatureLayer_management(path_to_data, 'data')
    arcpy.AddField_management(data_layer, 'COUNT', 'LONG')
    arcpy.Delete_management(data_layer)
    del data_layer

    #Do the counting:
    value_list = []
    with arcpy.da.SearchCursor(path_to_data, [field_to_count]) as search_rows:
        for row in search_rows:
            value_list.append(row[0])

    counts = Counter(value_list)
    del value_list
    
    with arcpy.da.UpdateCursor(path_to_data, [field_to_count, 'COUNT']) as update_rows:
        for row in udpate_rows:
            row[1] = counts[row[0]]
            update_rows.updateRow(row)

DaveTaylor · ‎10-29-2013

Thanks. This is what I put and I am getting a syntax error

def count_duplicates("I/live_data.dbf",'id'):
    #Add the count field
    data_layer = arcpy.MakeFeatureLayer_management("I:/live_data.dbf", 'data')
    arcpy.AddField_management(data_layer, 'COUNT', 'LONG')
    arcpy.Delete_management(data_layer)
    del data_layer

#Do the counting:
    value_list = []
    with arcpy.da.SearchCursor("I/live_data.dbf", ['id']) as search_rows:
        for row in search_rows:
            value_list.append(row[0])

    counts = Counter(value_list)
    del value_list

    with arcpy.da.UpdateCursor("I/live_data.dbf", ['id', 'COUNT']) as update_rows:
        for row in udpate_rows:
            row[1] = counts[row[0]]
            update_rows.updateRow(row)

DouglasSands · ‎10-29-2013

Using your example, to execute the function keep what I originally posted and add the following at the end:

count_duplicates("I/live_data.dbf",'id')

Since the code is defined as a function, you need to call it at the end rather than replacing the parameters. Also to be sure you have the correct file path you can navigate to it in Windows Explorer and hold shift, then right click on the file. This enables the option "Copy as text".

DouglasSands · ‎10-29-2013

You could also accomplish the same thing by using Summary Statistics, and then joining the resulting DBF table back to your source dataset. This could be easily done in model builder also.

DuncanHornby · ‎10-29-2013

Unless I'm missing the point, calling the Summary Statistics tool will do this in one line of code?

arcpy.Statistics_analysis("Export_Output","in_memory/x",[["ID","COUNT"]],"ID")

This creates a table called X in the in_memory workspace and returns a count on a field called ID.

DouglasSands · ‎10-29-2013

It would. I overthought the issue initially, although cursors are usually faster with large data sets. The join back is necessary so that the features can be symbolized by the count.

DaveTaylor · ‎10-29-2013

Unless I'm missing the point, calling the Summary Statistics tool will do this in one line of code?
arcpy.Statistics_analysis("Export_Output","in_memory/x",[["ID","COUNT"]],"ID")
This creates a table called X in the in_memory workspace and returns a count on a field called ID.

How do I get this out of memory and saved as an actual .dbf file?

DaveTaylor · ‎10-29-2013

nevermind I think I got it!

Thanks that appears to work perfectly!