Hello,
For every field within every feature class in my dataset, I would like to produce a list with the following data:
1. Field Name
2. Field Data Type
3. Field Length
4. Field Domain
5. Count of NULL values in field
I have a script that produces 1-4 but can't seem to figure out how to produce number 5.
My current script:
arcpy.env.workspace = r"C:\gdblocation"
fcList = arcpy.ListFeatureClasses("*")
for fc in fcList:
print(fc)
fields = arcpy.ListFields(fc)
for field in fields:
print("{0},{1},{2}"
.format(field.name, field.type, field.length, field.domain))
Example of current output for each field in each feature class:
What I'm trying to produce:
Any advice?
Thanks!
Solved! Go to Solution.
arcpy.env.workspace = r"C:\gdblocation"
fcList = arcpy.ListFeatureClasses("*")
for fc in fcList:
print(fc)
fields = arcpy.ListFields(fc)
field_names = [f.name for f in fields]
null_counts = {fn: 0 for fn in field_names}
with arcpy.da.SearchCursor(fc, field_names) as cursor:
for row in cursor:
for i, v in enumerate(row):
if v is None:
null_counts[field_names[i]] += 1
for field in fields:
print("{0},{1},{2},{3}".format(field.name, field.type, field.length, field.domain, null_counts[field.name]))
Your current code produces the record values for each row. If a null value is encountered, None is returned so you can simply print it, using python.
str(!FieldName!) # returns a string representation of None
When you get around to it you can query !FieldName! using count, to return the count of "None"
arcpy.env.workspace = r"C:\gdblocation"
fcList = arcpy.ListFeatureClasses("*")
for fc in fcList:
print(fc)
fields = arcpy.ListFields(fc)
field_names = [f.name for f in fields]
null_counts = {fn: 0 for fn in field_names}
with arcpy.da.SearchCursor(fc, field_names) as cursor:
for row in cursor:
for i, v in enumerate(row):
if v is None:
null_counts[field_names[i]] += 1
for field in fields:
print("{0},{1},{2},{3}".format(field.name, field.type, field.length, field.domain, null_counts[field.name]))
Thank you Johannes! The only thing I had to add was an additional {4} under the print statement to include the null count.
Thanks again!
Hayden
I like using the Python API. Mostly because I learned pandas long before I ever touched arcpy, but I find it to be simple and effective. Using a geodatabase, you'll need to be using the Python API in an environment where arcpy is available to you.
Granted, this snipped is pretty much only for the null counts. The Python API can pull out the other details of published services, but for file-based geodatabases, arcpy's the way to get the rest of the details. But I figured I'd throw another idea out there, in case it's helpful.
from arcgis.features import GeoAccessor
import pandas as pd
df = GeoAccessor.from_featureclass('path/to/database.gdb/layername')
pd.DataFrame(df.isna().sum())