Key concepts: nulls, booleans, list comprehensions, ternary operators, condition checking, mini-format language
Null values are permissable when creating tables in certain data structures. I have never had occasion to use them since I personally feel that all entries should be coded with some value which is either:
Null, None etc don't fit into that scheme, but it is possible to produce them, particularly if people import data from spreadsheets and allow blank entries in cells within columns. Nulls cause no end of problems with people trying to query tabular data or contenate data or perform statistical or other numeric operations on fields that contain these pesky little things. I should note, that setting a record to some arbitrary value is just as problematic as the null. For example, values of 0 or "" in a record for a shapefile should be treated as suspect if you didn't create the data yourself.
NOTE: This post will focus on field calculations using python and not on SQL queries..
List Comprehensions to capture nulls
As an example, consider the task of concatenating data to a new field from several fields which may contain nulls (eg. see this thread... Re: Concatenating Strings with Field Calculator and Python - dealing with NULLS). There are numerous...
List comprehensions, truth testing and string concatenation can be accomplished in one foul swoop...IF you are careful.
This table was created in a geodatabase which allows nulls to be created in a field. Fortunately, <null> stands out from the other entries serving as an indicator that they need to be dealt with. It is a fairly simple table, just containing a few numeric and text columns.
The concat field was created using the python parser and the following field calculator syntax.
Conventional list comprehension in the field calculator
# read very carefully ...convert these fields if the fields don't contain a <Null>
" ".join( [ str(i) for i in [ !prefix_txt!, !number_int!, !main_txt!, !numb_dble! ] if i ] )
'12345 some text more'
and who said the expression has to be on one line?
" ".join(
[str(i) for i in
[ !prefix_txt!, !number_int!, !main_txt!, !numb_dble!]
if i ] )
I have whipped in a few extra unnecessary spaces in the first expression just to show the separation between the elements. The second one was just for fun and to show that there is no need for one of those murderous one-liners that are difficult to edit.
So what does it consist of?
You can create your appropriate string without the join but you need a code block.
Simplifying the example
Lets simplify the above field calculator expression to make it easier to read by using variables as substitutes for the text, number and null elements.
List comprehension
>>> a =
12345;
b = None
;
c = "some text";
d = "" ;
e = "more"
>>> " ".join([str(i) for i in [a,b,c,d,e] if i])
One complaint that is often voiced is that list comprehensions can be hard to read if they contain conditional operations. This issue can be circumvented by stacking the pieces during their construction. Python allows for this syntactical construction in other objects such as lists, tuples, arrays and text amongst many objects. To demonstrate, the above expression can be written as:
Stacked list comprehension
>>> " ".join( [ str(i) # do this
... for i in [a,b,c,d,e] # using these
... if i ] ) # if this is True
'12345 some text more'
>>>
You may have noted that you can include comments on the same line as each constructor. This is useful since you can in essence construct a sentence describing what you are doing.... do this, using these, if this is True... A False condition can also be used but it is usually easier to rearrange you "sentence" to make it easier to say-do.
For those that prefer a more conventional approach you can make a function out of it.
Function: no_nulls_allowed
def no_nulls_allowed(fld_list):
"""provide a list of fields"""
good_stuff = []
for i in fld_list:
if i:
good_stuff.append(str(i))
out_str = " ".join(good_stuff)
return out_str
...
>>> no_nulls_allowed([a,b,c,d,e])
'12345 some text more'
>>>
Python's mini-formatting language...
Just for fun, let's assume that the values assigned to a-e in the example below, are field names.
Questions you could ask yourself:
You can generate the required number of curly bracket parameters, { }, needed in the mini-language formatting. Let's have a gander using variables in place of the field names in the table example above. I will just snug the variable definitions up to save space.
Function: no_nulls_mini
def no_nulls_mini(fld_list):
ok_flds = [ str(i) for i in fld_list if i ]
return ("{} "*len(ok_flds)).format(*ok_flds)
>>> no_nulls_mini([a,b,c,d,e])
'12345 some text more '
Ok, now for the breakdown:
Now for code-speak:
Strung together, it means "take all the good values from the different fields and concatenate them together with a space in between"
Head hurt??? Ok, to summarize, we can use simple list comprehensions, stacked list comprehensions and the mini-formatting options
Assume a = 12345; b = None ; c = "some text"; d = "" ; e = "more" | |
---|---|
| a-e represent fields, typical construction advanced construction for an if-else statement, which uses a False,True option and slices on the condition |
| provide a field list to a function, and construct the string from the values that meet the condition |
| a conventional function, requires the empty list construction first, then acceptable values are added to it...finally the values are concatenated together and returned. |
And they all yield.. '12345 some text more'
Closing Tip
If you can say it, you can do it...
list comp = [ do this if this else this using these]
list comp = [ do this # the Truth result
if this # the Truth condition
else this # the False condition
for these # using these
]
list comp = [ [do if False, do if True][condition slice] # pick one
for these # using these
]
A parting example...
# A stacked list comprehension
outer = [1,2]
inner = [2,0,4]
c = [[a, b, a*b, a*b/1.0] # multiply,avoid division by 0, for (outer/inner)
if b # if != 0 (0 is a boolean False)
else [a,b,a*b,"N/A"] # if equal to zero, do this
for a in outer # for each value in the outer list
for b in inner # for each value in the inner list
]
for val in c:
print("a({}), b({}), a*b({}) a/b({})".format(*val )) # val[0],val[1],val[2]))
# Now ... a False-True list from which you slice the appropriate operation
d = [[[a,b,a*b,"N/A"], # do if False
[a,b,a*b,a*b/1.0]][b!=0] # do if True ... then slice
for a in outer
for b in inner
]
for val in d:
print("a({}), b({}), a*b({}) a/b({})".format(*val ))
"""
a(1), b(2), a*b(2) a/b(2.0)
a(1), b(0), a*b(0) a/b(N/A)
a(1), b(4), a*b(4) a/b(4.0)
a(2), b(2), a*b(4) a/b(4.0)
a(2), b(0), a*b(0) a/b(N/A)
a(2), b(4), a*b(8) a/b(8.0)
"""
Pick what works for you... learn something new... and write it down Before You Forget ...
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.