Hello,
I'm stumped by a problem that I think should have a fairly straightforward answer...
I have a few hundred features with a number of attributes, all numerical. Some features have all attributes filled in, but some have a handful of null values. Is there any way of calculating the mean value for each feature that takes these null values into account? For instance, one that counts how many attributes are not null for each feature and uses that count to generate the mean?
Would really appreciate any help on this!
Thanks
Check out this case study. It had to deal with the same issue: Modeling literacy
You can do this with Calculate Field. This is an excerpt from the case study. I hope this is helpful!
Lauren
Once the new fields have been created, you will use the Calculate Field tool to compute the mean values for each dataset. To compute the mean value for the adolescent birth rate dataset, for example, use the parameters below:
def getMean(*allYears):
    notNull = list(filter(None,allYears))
    theSum = sum(notNull)
    theCnt = len(notNull)
    return theSum/theCnt
The Code Block defines the getMean function. This is what each Python statement does:
Python Statement | What it does |
---|---|
def getMean(*allYears): | Indicates you want to define a new function called getMean that will use the sequence of values passed to it from the Expression parameter. |
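notNull = list(filter(None,allYears)) | This line, indented four spaces, indicates Python should put all of the non-Null values into a list called notNull. Note that filter(None, ...) removes every falsy value, so attributes equal to 0 are dropped along with the Nulls. |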
theSum = sum(notNull) | This line, indented four spaces, instructs Python to sum the non-Null values. |
theCnt = len(notNull) | This line, indented four spaces, instructs Python to count the non-Null values. |
return theSum/theCnt | This line, indented four spaces, sets the value of the Field Name provided to be the mean (the sum divided by the count). |
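
If some of your attributes can legitimately be 0, or a feature might have every attribute null, a slightly more defensive variant may be worth considering. This is only a sketch, not part of the case study: the name `get_mean` is hypothetical, and it assumes the field values arrive as Python numbers or `None`. It compares against `None` explicitly so zeros are kept, and returns `None` instead of dividing by zero when nothing is left to average.

```python
def get_mean(*values):
    """Return the mean of the non-null arguments, or None if all are null."""
    # Test against None explicitly so that legitimate zeros are kept.
    not_null = [v for v in values if v is not None]
    # With no usable values there is nothing to average; avoid dividing by zero.
    if not not_null:
        return None
    return sum(not_null) / len(not_null)
```

In Calculate Field, the Expression would then pass the fields in using the `!field!` syntax, for example `get_mean(!attr_a!, !attr_b!, !attr_c!)`, where the three field names are placeholders for your own. Whether a returned `None` is written as a null depends on the output field being nullable, so it is worth testing on a copy of the data first.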