Get a count of field values

JoeBorgione · 01-08-2020

I'm working with a ridiculously wide table (184 fields) and I would like to assess how many records are non-null for each field. In other words, if a field's values are null more often than not, I'll drop the field. For now I've written a script that performs an iterative selection for each field, and as you can imagine with 184 fields and a few thousand records it's pretty slow. I tried a couple of other approaches that failed, but I have to think there is a better way to get a count of records for which a given field is populated. Here is what I've done to date:

import arcpy

table = r'J:\some\path\to\file.gdb\tableName'
fields = []

for f in arcpy.ListFields(table):
    fields.append(f.name)
 
arcpy.MakeTableView_management(table,'tv')

for f in fields:
    select = f'{f} is not null'
    arcpy.SelectLayerByAttribute_management('tv','NEW_SELECTION',select)
    c = arcpy.GetCount_management('tv')
    
    print('Field {} has {} non-null records'.format(f,c[0]))
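For comparison, here is a minimal sketch of a single-pass alternative: read every row once and tally the populated values per field, instead of running one selection per field. The helper name count_populated is hypothetical, and it assumes nulls arrive as None, which is how arcpy.da.SearchCursor is documented to return them. I have not timed this against the selection loop.

```python
def count_populated(field_names, rows):
    """Return {field name: number of rows where that field is not None}.

    `rows` is any iterable of tuples ordered like `field_names`.
    """
    counts = {name: 0 for name in field_names}
    for row in rows:
        for name, value in zip(field_names, row):
            if value is not None:
                counts[name] += 1
    return counts


# Small stand-in data so the sketch runs without arcpy.
demo_fields = ["OID", "Name", "Depth"]
demo_rows = [(1, "a", None), (2, None, None), (3, "c", 4.2)]
demo_counts = count_populated(demo_fields, demo_rows)

for name, n in demo_counts.items():
    print('Field {} has {} non-null records'.format(name, n))

# With arcpy (assumption, not run here), the cursor itself is the row iterable:
#   with arcpy.da.SearchCursor(table, fields) as cursor:
#       counts = count_populated(fields, cursor)
```

Because the table is read only once, the cost should grow with the number of rows rather than with rows times fields' worth of selections.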

That should just about do it....

DanPatterson_Retired · 01-08-2020

numpy comes with Pro... in fact ArcGIS Pro requires numpy, as do the arcgis module, pandas, scipy, etc.

This can be done with or without a clone of the default Python environment.

Of course, as with all modules, you have to import numpy, which is usually done as

import numpy as np

Most Python IDEs let you set default imports if you feel lazy or are forgetful.
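As a rough sketch of what the numpy route could look like (one possible reading, not necessarily the approach meant above): once the table is in a structured array, the non-null count for each field is a single vectorized operation per column. The helper name count_non_null and the text sentinels are assumptions for illustration. How nulls show up in the array depends on the field type and on how the array was built, so check that against your own data.

```python
import numpy as np


def count_non_null(arr, null_sentinels=("None", "")):
    """Return {field name: count of non-null values} for a structured array.

    Assumes float nulls are stored as NaN and text nulls as one of
    `null_sentinels`. Other dtypes (e.g. integers) cannot hold a null,
    so every value in them is counted as populated.
    """
    counts = {}
    for name in arr.dtype.names:
        col = arr[name]
        if col.dtype.kind == "f":
            populated = ~np.isnan(col)
        elif col.dtype.kind == "U":
            populated = ~np.isin(col, list(null_sentinels))
        else:
            populated = np.ones(col.shape, dtype=bool)
        counts[name] = int(np.count_nonzero(populated))
    return counts


# Stand-in array so the sketch runs without arcpy.
demo = np.array(
    [(1, 2.5, "a"), (2, np.nan, "None"), (3, np.nan, "b")],
    dtype=[("OID", "i4"), ("Depth", "f8"), ("Label", "U10")],
)
demo_counts = count_non_null(demo)
print(demo_counts)  # {'OID': 3, 'Depth': 1, 'Label': 2}

# With arcpy (assumption, not run here), the array could come from:
#   arr = arcpy.da.TableToNumPyArray(table, field_names)
# Integer fields containing nulls need a null_value there, so pick a
# sentinel for those and compare against it as well.
```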

wwnde · 01-13-2020

@hziegler-esristaff's way is what I have always used. ArcPro 2.15 now allows toggling between desktop and spatially enabled dataframes. Additionally, from the Python API you can now launch and code within spatially referenced dataframes, then save and share the result as an item in ArcOnline. Apart from numpy, you can also use pandas.

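import pandas as pd
df = pd.read_csv(r'file directory')  # placeholder path to an exported copy of the table
df.isna().sum()      # null count per column, for the entire dataframe
df.fieldname.isna()  # Boolean mask of nulls for one field; append .sum() to count them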

The advantage of spatially enabled dataframes is that they leverage numpy and pandas, giving you statistical capabilities that are otherwise cumbersome to reach in arcpy. They also work well with the matplotlib and seaborn Python libraries for visualization.
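Since the original question was about non-null counts and about dropping fields that are mostly empty, here is a minimal pandas sketch of that step. The function name non_null_summary and the 0.5 threshold are assumptions for illustration, and the dataframe is made-up stand-in data.

```python
import pandas as pd


def non_null_summary(df, min_fraction=0.5):
    """Return (non-null count per column, columns populated in at least
    `min_fraction` of the rows)."""
    counts = df.notna().sum()
    fraction = counts / len(df)
    keep = fraction[fraction >= min_fraction].index.tolist()
    return counts, keep


demo = pd.DataFrame(
    {
        "A": [1, None, 3, 4],
        "B": [None, None, None, "x"],
        "C": ["a", "b", "c", "d"],
    }
)
counts, keep = non_null_summary(demo)
print(counts.to_dict())  # {'A': 3, 'B': 1, 'C': 4}
print(keep)              # ['A', 'C']

# Keep only the columns that passed the threshold.
trimmed = demo[keep]
```

df.count() gives the same per-column non-null counts as df.notna().sum(); the longer form is used here only because it mirrors the isna() calls above.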