Much Ado About Nothing

5448
9
05-21-2016 09:12 PM

Much Ado About Nothing

Original, excellent question: (Kevin Milton​)

I am trying to calculate a field where i want to exclude any null values. For example. Field calulate field X with any value that is not null from field Y. I have been messing with python, but just can not seem to get this to run properly.

[What I want to do is:]

If field X <> Null

then field calculate field Y with field X.

Dan Patterson

Nulls of any sort are treated as boolean False

>>> a = ""
>>> if a:
...  print(a)
... else:
...  print("null string")
... 
null string
>>> a = "hello"
>>> if a:
...  print(a)
... else:
...  print("null string")
...
hello
>>>

See my nullness summary in the NumPy Repository​ - from there:

Object.... Empty... Type......
[]        True    list     
[1]        False    list     
()        True    tuple   
1          False    int     
{}        True    dict     
{1: 'one'} False    dict     
''        True    str     
'1'        False    str     
None      True    NoneType 
True      False    bool     
1          False    int     
False      True    bool     
0          True    int     
Object................... Empty... Type...... 
Counter()                True    Counter  
Counter({0: 1})          False    Counter  
OrderedDict()            True    OrderedDict
OrderedDict([(1, '1')])  False    OrderedDict
array([], dtype=float64)  True    ndarray  
array(1)                  False    ndarray  

(There are other examples dealing with geometry)

Joshua Bixby

Python doesn't have a data type called NULL/Null/null.  The Python Built-in Types documentation states for the null object, "There is exactly one null object, named None (a built-in name)."  If you are used to working with NULL in databases and SQL; be careful, None in Python doesn't behave exactly the same as NULL even though None is referred to as "the null object" in some documentation.

In Python there are several falsy values, i.e., values that are not false but evaluate to false in a Boolean context.  Some common examples are None (the NoneType), "" (empty string), [] empty list, () (empty tuple), and 0 (the number zero).  When using the Python parser with the Field Calculator or cursors with ArcPy, NULLs need to be converted, and Esri has chosen to convert them to None, which is understandable given the built-in data types available with Python.

NOTE

Although Python Conditional Expressions have their limits, they can be used in this and similar cases to avoid using a code block:

Calculate Field

Field: X

Expression:

None if !Y! is None else !Y!         # Null Y values are assigned back to Null (Python None will get converted back to Null)

or

-9999 if !Y! is None else !Y!        # Null Y values are assigned -9999
!Y! if !Y! is not None else -9999    # Null Y values are assigned -9999.  Same as above just changes order of conditions.

Ted Kowal

I have been working the last year converting to Python from VBA.  I still have issues with Python's handling of NULLS!  From my perspective results are almost always inconsistent; so my level of trust using Python to handle null is very low.  I almost always use vbscript isnull() function to check.

I learned something new thanks to Dan, I did not realize that nulls are treated as "false".  Coming from a VB background nulls are neither True or False.... this concept may help me with my own inconsistencies with Pythons and Nulls.

VBScript:

If  IsNull(FieldX) Then
  FieldY = 'Calculation on FieldX being Null'
Else
  FieldY = 'Calculation when FieldX is not Null'
End If

Another thing to check or to make sure that your tests includes Blanks,embedded tabs or other control character (see a lot of this when cut and pasting from excel)... these are not nulls although they act and appear to be nulls.  These include all characters with ASCII values between 0 and 31.

I made some tests using Python's of treating Null values as false ....   works great if the data is clean... however on my "dirty tests"  (real life cut/paste from external programs) I would still have to strip all embedded control characters before the Python script would work whereas the vbscript had no issues.

This exercise has shown me that I have been lazy relying on VB to test for nulls... my data is not as clean as I thought it was....

In ArcGIS, NULL in a data table is converted to None in Python.  The equivalent of IsNull in VBA is is None in Python.

if FieldX is None:
  FieldY = # Calculation on FieldX being Null
else:
  FieldY = # Calculation when FieldX is not Null

In pure GIS tables I do not doubt your correctness about ESRI Nulls equating to None.  However, in My real world, as a state authority, we are beggars of data and in many cases have little or no say in the formatting or the QA process the data we receive undergoes.

Using my data, None is useless for testing for nullness ( hit or miss).  The best pythonic approach I have used with my data is using the old fashion trim and concatenation approach to strip out all the Ascii codes 31 or less and final test for a single space or as many here indicated, assign a comparison number or string to represent nullness for the down and dirty.  My clean approach, when I have time is to validate and correct the data before it gets into GIS (Then None and Dan's boolean approach works 100%).

I still state that Python has issues with Nulls.   It quacks like a duck, looks like a duck and feels like a duck -- But it is not a duck!

Curtis Price

Here's my solution: often for this problem you want to put something in there to indicate a null value was there in Y.

Calculate Field

Field:  X

Expression:

CopyNonNull(!Y!)  # Null Y values are assigned None (Null, sort of...)

or

CopyNonNull(!Y!, -9999)  # Null Y values are assigned -9999

Parser:  PYTHON_9.3

Code block:

def CopyNonNull(val, missing_val=None):
  if val is None:
      return missing_val
  else:
      return val

Ted Kowal

In pure GIS tables I do not doubt your correctness about ESRI Nulls equating to None.  However, in My real world, as a state authority, we are beggars of data and in many cases have little or no say in the formatting or the QA process the data we receive undergoes.

Using my data, None is useless for testing for nullness ( hit or miss).  The best pythonic approach I have used with my data is using the old fashion trim and concatenation approach to strip out all the Ascii codes 31 or less and final test for a single space or as many here indicated, assign a comparison number or string to represent nullness for the down and dirty.  My clean approach, when I have time is to validate and correct the data before it gets into GIS (Then None and Dan's boolean approach works 100%).

I still state that Python has issues with Nulls.   It quacks like a duck, looks like a duck and feels like a duck -- But it is not a duck!

Dan Patterson

Ted... still not ready to move on to 'not a number ' until you can accept nulls

>>> a = np.NaN
>>> b = np.NaN
>>> a == b
False

Joshua Bixby

I guess I see it different, i.e., it is a data issue and not a Python issue.

I have worked for government in a variety of positions ranging from planning to information management and now information technology.  I have been on both the producing and consuming side of data, including data problems.  I agree that messy data doesn't fit cleaning into code, regardless of the language.

Regardless of the language (Python, SQL , etc...), most of what you describe is neither Null or None.  An empty string might be falsy, but it certainly isn't Null.  The same holds true for strings with ASCII codes 31 or less.  I am interested in how you would handle these types of situations in other programming languages because I don't really see how this is specifically a Python issue, regardless of whether one wants to argue None is Null or not Null exactly.

David Wasserman

Good to know about np.NaNs for that. I have been running into issues with that at work recently. NP methods for scrubbing NaNs seem best, but I have found masked arrays to be annoying in practice. Pandas seems to handle it a little better on the surface.

Ted Kowal

Dan you just made my day!

I am a LRS shop and NaN will be an issue when Python incorporates dynamic seg and measures.... nice that I have that to look forward to.

Thankfully I wrote all my LRS code in VB.Net/ESRI SDK  which will be around for awhile.  I have read a few interesting papers about the possible redesign of VB, on which will incorporate and built around the HTML5 engine.  If this comes to be, it will be a game changer and breath new life into vb script again!

Richard Fairhurst

I completely agree with Joshua that only real Null values should be found by Python, since only real Null values will be found by SQL if I write an expression that uses IS NULL for any of the databases I use.  Whitespace and control characters are not Null and will not be selected using IS NULL with standard implementations of SQL.  Python follows the SQL standard.  Access queries might return results that are more similar to the VB implementation, but Access is in no way a reliable reference for established database "standards" recognized by professionals.

Dan Patterson

There is only one object in python that is None, it is

>>> type(None)
<type 'NoneType'>

There are, however many null objects, in the mathematical sense (ie lists, sets, dictionaries, collections any container)

Then there are objects that are not, but their are not equal in their none-ness (ie np.NaN)

Their truth equality in testing is another issue...  the idea that None of this matters is False and this is True

And I again, will leave geometry out of this... of which I have posted several times before

David Wasserman

Only thing I might add is this leverages Falsy behavior in python. 2.3.1 Truth Value Testing

Something to keep in mind is that 0 can also be Falsy along with a few other values (empty strings etc).
Using

  • if value: (falsy)
  • is None
  • is not None

or even

  • isinstance() built in

is what I see over and over again. 😃

Dan Patterson

Yes confusing isn't it... you have to know what an object or what it  belongs to... a common mistake in numpy

>>> a = np.NaN
>>> isinstance(a,np.NaN)
Traceback (most recent call last):
  File "<string>", line 1, in <module>
TypeError: isinstance() arg 2 must be a class, type, or tuple of classes and types
>>> type(np.NaN)
<type 'float'>

because you can't use zero... and the list goes on

Ted Kowal

There are issues with nulls in all languages even my beloved vb and also with both the Oracle and MS SQL server -- so Python following the SQL standard is not 100% true either because each deals with nullness differently.  So it is not really any specific language problem.  My issue with Python's and Null is mostly personal.  Python reminds me of the Perl scripting language (Which I really hated).  I spend more time with python trying to figure out why an apparent record that both MSSQL ,ACCESS and EXCEL says its null but fails when testing in Python.  In the end I use VB or other language scripts for null testing because I find the results both consistent and repeatable across time.  VB, MS access, MS SQL treat Nulls slightly differently but in those cases I found I could handle the differences with little to no effort.  VB distinguishes between space and nothing.  Whereas MS Access does not except when you use the built in function as well as MS SQL.  Oracle has more of a java/javascript approach to Null.  But Python is very indirect and obscure when dealing with null within the language context (Probably makes Python much more flexible in the long run and can easily test all variants of nullness).  I am probably not yet knowledgeable in Python to adequately determine which of the many variants I should test for.

I think the problem is with the definition of what we consider is NULL.  In my definition an empty string is not null, however nothing and null are same.  Some other engineers I know disagree with my definition stating that nothing is also not null ... to them NULL is the absence of Nothing (that I cannot put my head around).  Other folks will say that the < 31 ASCII Codes are nothing because they are artifacts from buffered cut and paste operation.....

My ideal language tells me if something is null adhering to my expectation of nullness consistently. Python does not do this yet for me.  My custom Python isNull function is constantly growing/being modified ... trying to capture and test for my definition of a null value.

Did a little isNull test using different Languages and DB's for a test record from sqlserver that had a test field containing an ascii <NUL> and <TAB> char

LanguageTestBoolean result
VB6IsNull()False
VB.NETIsNull()False (but: True if DB Context isNull -- go figure)
C#== nullFalse (but: True if DB Context isNull -- go figure)
PythonIs NoneFalse
MS AccessIS NULLTrue
SQL ServerIS NULLTrue
OracleIS NULLFalse

Joshua Bixby

I fear your ideal language is one you will likely have to create yourself.  Everyone is entitled to have his/her own views of the world, but not all views are commonly held.  In this case your idea of null-ness is quite uncommon, which is fine in and of itself, but you should expect to run into plenty of frustrations with most (if not all) computer languages and even possibly some areas of mathematics.

Before getting back to Null, I think saying that ASCII codes < 31 are "nothing because they are artifacts from buffered cut and paste operation" doesn't give enough credit to those characters.  Sure, control characters aren't printable/viewable, but I can't imagine trying to work with computers without control characters.  Typical business users that only enter text into e-mail, word processing, and other business productivity applications might not understand control characters, or even know that such things exist, but it doesn't mean they are nothing.

Regarding nulls, an ASCII null character (NUL or \0) is null within the domain or context of the ASCII character set, but it is still a character.  Type systems can define their own nullable type, but the ASCII null character is not a universal null object/type.  It is possible there is a programming language that has adopted the ASCII null character as its own null representation, but that isn't the case for any of the languages that I interact with.

Regarding SQL Server and MS Access, I get the opposite results you do, i.e., neither returns a field with only an ASCII null when the WHERE condition includes IS NULL.  Looking at a SQL Server example:

--  SELECT RECORDS FROM CTE WHERE f0 is null
WITH cte AS (
    SELECT * FROM (
        VALUES
            (1, NULL),     -- SQL SERVER NULL
            (2, CHAR(0)),  -- ASCII NULL
            (3, CHAR(9)),  -- ASCII TAB
            (4, CHAR(13)), -- ASCII CR
            (5, CHAR(33))  -- ASCII "!"
        ) AS t (id, f0)
    )
SELECT * from cte where f0 is null;



id          f0

----------- ----

1           NULL

(1 row(s) affected)

Ted Kowal

I do get the same results running the isnull script in SQL server as you, however when I import a record from excel with embedded codes, the isnull function returned true which differs from you singular controlled value list.

In any event, I whole-heartedly agree with your statements, the other definitions I stated were not necessarily my own but view points of folks that supply me with data.  I do not infer that the ASCII null is a universal standard of nullness, but gave it as one example of where the simple is none python test fails (for me!!!!! ).  I believe my issue is not one of testing for null, but in data standards in which I am struggling to employ in-house as well as data received from the outside (lots of resistance in my case).  I won't even go into uni-text,ut8 and binary strings/null issues...


This document was generated from the following discussion:

I am trying to calculate a field where i want to exclude any null values. For example. Field calulat...

Comments

I thought this was an awesome thread. I referred to this thread recently as what happens when a geonet question transitions to open bar. Great name for it Curtis Price

Curtis Price​, yes, thanks for collating the various comments from the original question.  I was thinking it would be good to do, and I am glad someone stepped up and did it.

Just a quick attribution comment.  What is currently the second from the bottom attribution to me, starts with "Dan you just made my day!," is really a comment from Ted.

So after reading this entire post, maybe i missed the answer but is there a solution for testing for the various forms of nothing?

It is buried in the thread, but some main methods are:

1. Falsy behavior: (More General)
if *variable*:
    return "Variable is not None and thus True"
else:
    return "Variable is None or some other falsy object"
2. Is operator: Evaluates to true if the variables on either side of the operator point to the same object and false otherwise.

if *variable* is None: (More specific)

   return "Variable is None and thus True"

else:

   return "Variable is Not None"
3. isinstance() function: (Also more specific)
if isinstance(*variable*,None):

  return "Variable is None and thus True"

else:

   return "Variable is Not None"

There are other types of "None" types in other languages, being called everything from Nil, Null, None, NaN, and in numpy np.NaN. Numpy NaNs usually require specific numpy operators to find np.NaNs. There are other methods mentioned in the thread, this is just a small sample.

I am open to anyone writing a short summary at the top. Do others have edit access? Is there a way to grant it if they don't?

Joshua Bixby​ could you add '' to your test script in SQL Server, and see what happens?

         (6, '')  -- empty string 

to be philosophical.... None is not nothing... it is something, just one of a kind.

as for np.NaN be careful...

np.NaN == np.NaN

False

None == None

True

>>> type(None)

<type 'NoneType'>

Curtis edit the document then at the bottom change the editor options

Creating Collaborative Documents

I want to add one more FieldCalulator option for when you want things to be a true null (edit...in the eyes of ArcGIS 10.x.x). In my case, I was not testing for Null, but wanted to use FieldCalculator to replace empty-ish fields.  I knew about this doc and read thru is several times before I got the python version to work in FieldCalulator (the VB was easier).

Just fyi: My delema was caused by me using using TableToExcel, editing sorting uneven columns, then ExcelToTable which caused my emtpy Excel fields to be empty (maybe ""), but not NULL.  When ussing these columns in my script with "TableToDomain_management", when it hit one of these empty fields, I would get the error "The value being added to the CodedValueDomain already exists. [domain name: gmuStRag1submajoy, code name: ]".  Many hours later, trying to figure out whay I screwed up in the program lead me to the issue with not having null in the fields, which the TableToDomain_management would ignore.

FYI - if using interactive FieldCalculator for VB or PYTHON, quick trick, edit the table in ArcMap, select the records you need to assign null, calc the field =

    Null     if using VB

    None     if using PYTHON_93

....no quotes of anything else...Figured out the None from the document above.  so...thanks!

# And for a python script.. expression_type="PYTHON_93"

# Replace a layer/table view name with a path to a dataset (which can be a layer file) or create the layer/table view within the script

# The following inputs are layers or table views: "testingNull"

arcpy.CalculateField_management(in_table="testingNull",field="submajor2",expression="None",expression_type="PYTHON_9.3",code_block="#")

# Or... expression_type="VB"

# Replace a layer/table view name with a path to a dataset (which can be a layer file) or create the layer/table view within the script

# The following inputs are layers or table views: "testingNull"

arcpy.CalculateField_management(in_table="testingNull",field="submajor1",expression="NULL",expression_type="VB",code_block="#")

#  sorry....could find the advanced key to format it for python.

This document was very helpful, but want to add this as a comment, for others that may run into this and need to calculate the value to Null.  Curtis, feel free to snip anything out and add to the doc if you feel it helps....otherwise, hopefully this can help others at some point.  I'm tagging my blog so maybe I can find it again.  myBlog

Version history
Last update:
‎12-12-2021 03:43 AM
Updated by: