How to field calculate the mean of a field array?

3550
20
07-07-2017 05:14 AM
LukeHaskett
New Contributor

Hello,

I'm stumped by a problem that I think should have a fairly straightforward answer...

I have a few hundred features with a number of attributes, all numerical. Some features have all attributes filled in, but some have a handful of null values. Is there any way of calculating the mean value for each feature that takes into account these null values? For instance, it calculates how many attributes are not null for each feature and uses that to generate the mean?

Would really appreciate any help on this!

Thanks

0 Kudos
20 Replies
DanPatterson_Retired
MVP Emeritus

what is the source type of your data? Featureclass tables? dbase? excel? 

And I assume you want to automate the whole process rather than doing the process table by table and field by field. 

You could use numpy, and convert the tables to numpy arrays and do the columns all at once.

#..... stuff before includes FeatureclassToNumPyArray or TabletoNumPyArray
#      to get the table into an array... Then use nanmean to the mean
b
Out[6]: 
array([[ nan,   5.,  nan,  nan,   5.],
       [ nan,  nan,   8.,   6.,  nan],
       [  5.,  nan,   7.,   4.,   7.],
       [ nan,   9.,  nan,   9.,  nan],
       [ nan,  nan,   4.,   9.,   6.],
       [ nan,   4.,  nan,   6.,   5.]])

np.nanmean(b, axis=1)
Out[7]: array([ 5.        ,  7.        ,  5.75      ,  9.        ,  6.33333333,  5.        ])

Then you could cycle through the tables and get the mean

PS  'nan' is the numpy equivalent of None in python and nodata in tables.

LukeHaskett
New Contributor

Thanks Dan, I'm not sure I fully follow you. The data source is a featureclass attribute table and ideally I'd be looking to use the Field Calculator or a python script to find the means. Unfortunately, I really don't follow the code block you've copied in, can you suggest how I might make use of it? I had thought numpy might be the answer.

0 Kudos
JoeBorgione
MVP Emeritus

Fairly simple database manipulation; you want the mean that includes the number records where field-of-choice is null, right?  Try this: right click on the field name and select Statistics.  That will give you mean.  Select where field-of-choice is null.  Calculate the field-of-choice value = mean from the Statistics results.

With all due respect to Dan and his elegant solution, sometimes going with the basics can be useful too....

Dan_Patterson

That should just about do it....
0 Kudos
LukeHaskett
New Contributor

Hi Joe, I'm looking for statistics to be done on a field array, not only one field. So each feature has ten fields that may or may not be null... I want a script that figures out how many are null to then sum and correctly produce the mean (e.g. if only 8 of the 10 fields are scored, the sum of those will be divided by 8). If you're answer addresses this, please could you rephrase it as I don't follow it.

0 Kudos
DanPatterson_Retired
MVP Emeritus

I had thought of that of course.. but I think Luke has multiple fields in a table, and there is a need to do this all at once, rather than one field at a time, hence, the all fields all at once solution, which can then be wrapped in an all files, for all fields all at once solution.

Of course, the problem could be simpler than assumed

0 Kudos
JoeBorgione
MVP Emeritus

Yep...  I get it..  But there are so many "it's gotta be automated, but I can't describe it" types of questions here....

That should just about do it....
0 Kudos
LukeHaskett
New Contributor

I am trying the following but still not having any success:

def mean(Welcoming, Access, Community, Safe, Provision, Quality, Security, Dog, Litter, Grounds):
    fieldList = [Welcoming, Access, Community, Safe, Provision, Quality, Security, Dog, Litter, Grounds]
    validList = []
    for i in fieldList:
        if i != None:
            validList.append(i)
    meanVal = sum(validList)/len(validList)
    return meanVal‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

I get an error that 9 required positional arguments are missing. Any suggestions please?

[edited to use Python syntax highlighting, with proper indent - VA]

0 Kudos
JoshuaBixby
MVP Esteemed Contributor

How are you calling your function from the field calculator?  Obviously the function definition is going into the code block, but what are you putting into the expression box.  The error message makes me think you are calling your function incorrectly.

LukeHaskett
New Contributor

Hi Joshua, I'm putting this in:

mean([ !Welcoming!, !Access!, !Community!, !Safe!, !Provision!, !Quality!, !Security!, !Dog!, !Litter!, !Grounds! ])

0 Kudos