... Those pesky null things ...

DanPatterson_Retired · ‎11-03-2016

Key concepts: nulls, booleans, list comprehensions, ternary operators, condition checking, mini-format language

Null values are permissable when creating tables in certain data structures. I have never had occasion to use them since I personally feel that all entries should be coded with some value which is either:

a real observation,
one that was missed or forgotten,
there is truly no value because it couldn't be obtained
other

Null, None etc don't fit into that scheme, but it is possible to produce them, particularly if people import data from spreadsheets and allow blank entries in cells within columns. Nulls cause no end of problems with people trying to query tabular data or contenate data or perform statistical or other numeric operations on fields that contain these pesky little things. I should note, that setting a record to some arbitrary value is just as problematic as the null. For example, values of 0 or "" in a record for a shapefile should be treated as suspect if you didn't create the data yourself.

NOTE: This post will focus on field calculations using python and not on SQL queries..

List Comprehensions to capture nulls

As an example, consider the task of concatenating data to a new field from several fields which may contain nulls (eg. see this thread... Re: Concatenating Strings with Field Calculator and Python - dealing with NULLS). There are numerous...

List comprehensions, truth testing and string concatenation can be accomplished in one foul swoop...IF you are careful.

This table was created in a geodatabase which allows nulls to be created in a field. Fortunately, <null> stands out from the other entries serving as an indicator that they need to be dealt with. It is a fairly simple table, just containing a few numeric and text columns.

The concat field was created using the python parser and the following field calculator syntax.

Conventional list comprehension in the field calculator

# read very carefully ...convert these fields if the fields don't contain a <Null>
" ".join(  [  str(i) for i in [ !prefix_txt!, !number_int!, !main_txt!, !numb_dble!  ] if i ]  )
'12345 some text more'
‍‍‍

and who said the expression has to be on one line?

" ".join(
[str(i) for i in 
[ !prefix_txt!, !number_int!, !main_txt!, !numb_dble!]
 if i ] )
‍‍‍‍

I have whipped in a few extra unnecessary spaces in the first expression just to show the separation between the elements. The second one was just for fun and to show that there is no need for one of those murderous one-liners that are difficult to edit.

So what does it consist of?

a join function is used to perform the final concatenation
a list comprehension, LC, is used to determine which fields contain appropriate values which are then converted to a string
- each element in a list of field names is cycled through ( for i in [...] section )
- each element is check to see if it meets the truth test (that is ... if i ... returns True if the field entry is not null, False otherwise])
- if the above conditions are met, the value is converted to a string representation for subsequent joining.

You can create your appropriate string without the join but you need a code block.

Simplifying the example

Lets simplify the above field calculator expression to make it easier to read by using variables as substitutes for the text, number and null elements.

List comprehension

>>> a = 
12345;
 

b = None
 ; 

c = "some text"; 

d = "" ; 
e = "more"


>>> " ".join([str(i) for i in [a,b,c,d,e] if i])


‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

One complaint that is often voiced is that list comprehensions can be hard to read if they contain conditional operations. This issue can be circumvented by stacking the pieces during their construction. Python allows for this syntactical construction in other objects such as lists, tuples, arrays and text amongst many objects. To demonstrate, the above expression can be written as:

Stacked list comprehension

>>> " ".join( [ str(i)               # do this
...           for i in [a,b,c,d,e]   # using these
...           if i ] )               # if this is True
'12345 some text more'
>>>
‍‍‍‍‍

You may have noted that you can include comments on the same line as each constructor. This is useful since you can in essence construct a sentence describing what you are doing.... do this, using these, if this is True... A False condition can also be used but it is usually easier to rearrange you "sentence" to make it easier to say-do.

For those that prefer a more conventional approach you can make a function out of it.

Function: no_nulls_allowed

def no_nulls_allowed(fld_list):
    """provide a list of fields"""
    good_stuff = []
    for i in fld_list:
        if i:
            good_stuff.append(str(i))
        out_str = " ".join(good_stuff)
    return out_str
... 
>>> no_nulls_allowed([a,b,c,d,e])
'12345 some text more'
>>>
‍‍‍‍‍‍‍‍‍‍‍‍

Python's mini-formatting language...

Just for fun, let's assume that the values assigned to a-e in the example below, are field names.

Questions you could ask yourself:

What if you don't know which field or fields may contain a null value?
What if you want to flag the user that is something wrong instead?

You can generate the required number of curly bracket parameters, { }, needed in the mini-language formatting. Let's have a gander using variables in place of the field names in the table example above. I will just snug the variable definitions up to save space.

Function: no_nulls_mini

def no_nulls_mini(fld_list):
    ok_flds = [ str(i) for i in fld_list  if i  ]
    return ("{} "*len(ok_flds)).format(*ok_flds)

>>> no_nulls_mini([a,b,c,d,e])
'12345 some text more '
‍‍‍‍‍‍

Ok, now for the breakdown:

I am too lazy to check which fields may contain null values, so I don't know how many { } to make...
we have a mix of numbers and strings, but we cleverly know that the mini-formatting language makes string representations of inputs by defaults so you don't need to do the string-thing ( aka str( ) )
we want a space between the elements since we are concatenating values together and it is easier to read with spaces

Now for code-speak:

"{} " - curly brackets followed by a space is the container to put out stuff plus the extra space
*len(ok_flds) - this will multiply the "{} " entry by the number of fields that contained values that met the truth test (ie no nulls)
*ok_flds - in the format section will dole out the required number of arguments from the ok_flds list (like *args, **kwargs use in def statements)

Strung together, it means "take all the good values from the different fields and concatenate them together with a space in between"

Head hurt??? Ok, to summarize, we can use simple list comprehensions, stacked list comprehensions and the mini-formatting options

Assume a = 12345; b = None ; c = "some text"; d = "" ; e = "more"

# simple list comprehension, only check for True
" ".join( [ str(i) for i in [a, b, c, d, e]  if i  ]  )
12345 some text more

# if-else with slicing, do something if False
z = " ".join([[str(i),"..."][i in ["",'',None,False]] 
              for i in [a,b,c,d,e]])
12345 ... some text ... more

‍‍‍‍‍‍‍‍‍

a-e represent fields, typical construction

advanced construction for an if-else statement, which uses a False,True option and slices on the condition

def no_nulls_mini(fld_list):
    ok_flds = [ str(i) for i in fld_list  if i  ]
    return ("{} "*len(ok_flds)).format(*ok_flds)
‍‍‍

provide a field list to a function, and construct the string from the values that meet the condition

def no_nulls_allowed(fld_list):
    good_stuff = []
    for i in fld_list:
    if i:
        good_stuff.append(str(i))
    out_str = " ".join(good_stuff)
    return out_str

‍‍‍‍‍‍‍‍

a conventional function, requires the empty list construction first, then acceptable values are added to it...finally the values are concatenated together and returned.

And they all yield.. '12345 some text more'

Closing Tip

If you can say it, you can do it...

list comp = [ do this if this else this using these]

list comp = [ do this # the Truth result

if this # the Truth condition

else this # the False condition

for these # using these

]

list comp = [ [do if False, do if True][condition slice] # pick one

for these # using these

]

A parting example...

# A stacked list comprehension
outer = [1,2]
inner = [2,0,4]
c = [[a, b, a*b, a*b/1.0]  # multiply,avoid division by 0, for (outer/inner)
     if b                # if != 0 (0 is a boolean False)
     else [a,b,a*b,"N/A"]    # if equal to zero, do this
     for a in outer      # for each value in the outer list
     for b in inner      # for each value in the inner list
     ]
for val in c:
    print("a({}), b({}), a*b({}) a/b({})".format(*val )) # val[0],val[1],val[2]))

# Now ... a False-True list from which you slice the appropriate operation
d = [[[a,b,a*b,"N/A"],           # do if False
      [a,b,a*b,a*b/1.0]][b!=0]   # do if True ... then slice
     for a in outer
     for b in inner
     ]
for val in d:
    print("a({}), b({}), a*b({}) a/b({})".format(*val )) 
"""
a(1), b(2), a*b(2) a/b(2.0)
a(1), b(0), a*b(0) a/b(N/A)
a(1), b(4), a*b(4) a/b(4.0)
a(2), b(2), a*b(4) a/b(4.0)
a(2), b(0), a*b(0) a/b(N/A)
a(2), b(4), a*b(8) a/b(8.0)
"""
‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

Pick what works for you... learn something new... and write it down Before You Forget ...