<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Efficient counting of features occurrences  per attribute value in Python Questions</title>
    <link>https://community.esri.com/t5/python-questions/efficient-counting-of-features-occurrences-per/m-p/1284671#M67564</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I have over 2700 FileGDBs in which to count feature occurrences based on an attribute value.&lt;/P&gt;&lt;P&gt;The attribute field (FeaType) has over 200 possible values. (Typically: Tree, Road, Building, etc.)&lt;/P&gt;&lt;P&gt;I need to count the occurrences of each feature type per FileGDB and put them in a table with 1 row per FileGDB and 1 column per FeaType value.&lt;/P&gt;&lt;P&gt;Putting them into a table is no issue - I am using xlsxwriter.&lt;/P&gt;&lt;P&gt;What would be the most efficient way to count these features?&lt;/P&gt;&lt;P&gt;I have tried iterating through each FileGDB, each FeatureClass, row by row, incrementing a table entry based on the FeaType value - very slow &lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;/P&gt;&lt;P&gt;I could try iterating through each FeaType value, then using that value to 'select' and 'getcount'.&lt;/P&gt;&lt;P&gt;But surely there is a more efficient way?&lt;/P&gt;&lt;P&gt;Some pointers would be great.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks in advance,&lt;/P&gt;&lt;P&gt;Zoltan&lt;/P&gt;</description>
    <pubDate>Tue, 02 May 2023 12:43:22 GMT</pubDate>
    <dc:creator>ZoltanSzecsei</dc:creator>
    <dc:date>2023-05-02T12:43:22Z</dc:date>
    <item>
      <title>Efficient counting of features occurrences  per attribute value</title>
      <link>https://community.esri.com/t5/python-questions/efficient-counting-of-features-occurrences-per/m-p/1284671#M67564</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I have over 2700 FileGDBs in which to count feature occurrences based on an attribute value.&lt;/P&gt;&lt;P&gt;The attribute field (FeaType) has over 200 possible values. (Typically: Tree, Road, Building, etc.)&lt;/P&gt;&lt;P&gt;I need to count the occurrences of each feature type per FileGDB and put them in a table with 1 row per FileGDB and 1 column per FeaType value.&lt;/P&gt;&lt;P&gt;Putting them into a table is no issue - I am using xlsxwriter.&lt;/P&gt;&lt;P&gt;What would be the most efficient way to count these features?&lt;/P&gt;&lt;P&gt;I have tried iterating through each FileGDB, each FeatureClass, row by row, incrementing a table entry based on the FeaType value - very slow &lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;/P&gt;&lt;P&gt;I could try iterating through each FeaType value, then using that value to 'select' and 'getcount'.&lt;/P&gt;&lt;P&gt;But surely there is a more efficient way?&lt;/P&gt;&lt;P&gt;Some pointers would be great.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks in advance,&lt;/P&gt;&lt;P&gt;Zoltan&lt;/P&gt;</description>
      <pubDate>Tue, 02 May 2023 12:43:22 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/efficient-counting-of-features-occurrences-per/m-p/1284671#M67564</guid>
      <dc:creator>ZoltanSzecsei</dc:creator>
      <dc:date>2023-05-02T12:43:22Z</dc:date>
    </item>
    <item>
      <title>Re: Efficient counting of features occurrences  per attribute value</title>
      <link>https://community.esri.com/t5/python-questions/efficient-counting-of-features-occurrences-per/m-p/1284679#M67565</link>
      <description>&lt;P&gt;Multiprocess it so that each worker process handles one fgdb at a time and returns a dictionary of feature counts.&amp;nbsp; When the workers are done, combine all the dictionaries for the total sum. What code do you have so far?&lt;/P&gt;</description>
      <pubDate>Tue, 02 May 2023 13:05:19 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/efficient-counting-of-features-occurrences-per/m-p/1284679#M67565</guid>
      <dc:creator>Anonymous User</dc:creator>
      <dc:date>2023-05-02T13:05:19Z</dc:date>
    </item>
    <item>
      <title>Re: Efficient counting of features occurrences  per attribute value</title>
      <link>https://community.esri.com/t5/python-questions/efficient-counting-of-features-occurrences-per/m-p/1284690#M67566</link>
      <description>&lt;P&gt;Here's a little sample of what I am thinking; you can modify the fields to be more dynamic per fc, but this should get you started.&lt;/P&gt;&lt;P&gt;Edited to go by the field FeaType and key off attribute value.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;import multiprocessing as mp
import os

import arcpy
from arcpy import env


def get_count(fgdb):
    env.workspace = fgdb
    fcDict = {'fgdb': f'{fgdb}', 'status': True, 'featVals': {'fc': None, 'feaTypeCnt': {}}}

    for fc in arcpy.ListFeatureClasses():
        fields = [f.name for f in arcpy.ListFields(fc) if f.name == 'FeaType']
        fcDict['featVals']['fc'] = os.path.basename(fc)
        if fields:
            with arcpy.da.SearchCursor(fc, fields) as sCur:
                for row in sCur:
                    # Increment the running count for this FeaType value.
                    counts = fcDict['featVals']['feaTypeCnt']
                    counts[row[0]] = counts.get(row[0], 0) + 1
        else:
            fcDict['status'] = 'Did not contain FeaType'

    return fcDict
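

# A possible follow-up step, sketched under assumptions: reshape the dictionaries
# returned by get_count() into the layout the question asks for, one row per
# FileGDB and one column per FeaType value. counts_to_rows is a hypothetical
# helper name (not part of arcpy); values missing from a FileGDB are filled with 0.
def counts_to_rows(results):
    # Collect every FeaType value seen in any FileGDB; sort by str so that a
    # null (None) attribute value does not break the ordering.
    columns = sorted(
        {k for r in results for k in r['featVals']['feaTypeCnt']}, key=str
    )
    header = ['fgdb'] + columns
    rows = [
        [r['fgdb']] + [r['featVals']['feaTypeCnt'].get(c, 0) for c in columns]
        for r in results
    ]
    return header, rows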


if __name__ == '__main__':
    workspace = r"C:\Path\to\explore"
    gdbs = []
    for dirpath, dirnames, filenames in arcpy.da.Walk(workspace, datatype="Container"):
        for dirname in dirnames:
            # Match only folders whose name ends in .gdb, not any name that contains it.
            if dirname.lower().endswith(".gdb"):
                gdbs.append(os.path.join(dirpath, dirname))

    cores = mp.cpu_count()
    with mp.Pool(processes=cores) as pool:
        jobs = [pool.apply_async(get_count, (gdb,)) for gdb in gdbs]

        res = [r.get() for r in jobs]

    for r in res:
        # status is either True or a message string (which is also truthy), so compare explicitly.
        if r['status'] is True:
            vals = r['featVals']
            print(f'{r["fgdb"]} : {vals["fc"]}')
            for k, v in vals['feaTypeCnt'].items():
                print(f'\t{k}: {v}')
        else:
            print(f'{r["fgdb"]} {r["featVals"]["fc"]} {r["status"]}')&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 02 May 2023 15:24:35 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/efficient-counting-of-features-occurrences-per/m-p/1284690#M67566</guid>
      <dc:creator>Anonymous User</dc:creator>
      <dc:date>2023-05-02T15:24:35Z</dc:date>
    </item>
    <item>
      <title>Re: Efficient counting of features occurrences  per attribute value</title>
      <link>https://community.esri.com/t5/python-questions/efficient-counting-of-features-occurrences-per/m-p/1286574#M67592</link>
      <description>&lt;P&gt;Assuming I understand your intent, I would use Pandas and the Spatially Enabled Dataframe for this task. I haven't tested this code, but assuming you have a list of FGDB paths and are always using the same feature class name, this might work as is.&lt;/P&gt;&lt;P&gt;I've found that if&amp;nbsp;you give Pandas a list of dictionaries, it will use the keys as columns, and the keys don't have to be identical in each dictionary.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;import pandas as pd
import os
from arcgis.features import GeoAccessor, GeoSeriesAccessor

data = []
for this_gdb_path in my_list_of_gdb_paths:
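    this_fc_path = os.path.join(this_gdb_path, "ConstantFeatureClassName")
    # Load the feature class as a Spatially Enabled DataFrame.
    sdf = pd.DataFrame.spatial.from_featureclass(this_fc_path)
    # Count rows per FeaType value and tag the result with its source geodatabase.
    this_dict = sdf.FeaType.value_counts().to_dict()
    this_dict['gdb_path'] = this_gdb_path
    data.append(this_dict)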

df = pd.DataFrame(data)
df.to_excel('FeaTypesPerGDB.xlsx')&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 08 May 2023 00:27:22 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/efficient-counting-of-features-occurrences-per/m-p/1286574#M67592</guid>
      <dc:creator>GISErik</dc:creator>
      <dc:date>2023-05-08T00:27:22Z</dc:date>
    </item>
  </channel>
</rss>

