All People > Dan_Patterson > Py... blog > 2016 > November

So ... new interface, time to try out some formatting and stuff.  What better topic than how to order, structure and view 3D data, such as images or raster data of mixed data types for the same location, or data of a uniform type where the 3rd dimension represents time.

 

I will make it simple.  Begin with 24 integer numbers and arrange them into all the possible configurations in 3D.  Then it is time to mess with your mind and show you how to convert from one arrangement to another.  Sort of like Rubik's Cube, but simpler.

 

So here is the generating script (note the new cool Python syntax highlighting... nice! ... but you still can't change the brownish background color, stifling any personal code preferences).  The def just happens to be number 37... the number has no meaning, it is just the 37th in a collection of functions.

import numpy as np
from textwrap import dedent   # used to strip the indentation from frmt

def num_37():
    """(num_37) playing with 3D arrangements...
    :Requires:
    :--------
    :  Arrays are generated within... nothing required
    :Returns:
    :-------
    :  An array of 24 sequential integers with shape = (2, 3, 4)
    :Notes:
    :-----
    :  References to numpy, transpose, rollaxis, swapaxes and einsum.
    :  The arrays below are all the possible orderings of the axis sizes
    :  2, 3 and 4 that can be used to shape a sequence of 24 values in 3D.
    :  Higher dimensionalities will be considered at a later time.
    :
    :  After this, there is some fancy formatting as covered in my previous blogs.
    """

    nums = np.arange(24)      #  whatever, just shape appropriately
    a = nums.reshape(2,3,4)   #  the base 3D array shaped as (z, y, x)
    a0 = nums.reshape(2,4,3)  #  y, x axes, swapped
    a1 = nums.reshape(3,2,4)  #  add to z, reshape y, x accordingly to main size
    a2 = nums.reshape(3,4,2)  #  swap y, x
    a3 = nums.reshape(4,2,3)  #  add to z again, resize as before
    a4 = nums.reshape(4,3,2)  #  swap y, x
    frmt = """
    Array ... {} :..shape  {}
    {}
    """

    args = [['nums', nums.shape, nums],
            ['a', a.shape, a], ['a0', a0.shape, a0],
            ['a1', a1.shape, a1], ['a2', a2.shape, a2],
            ['a3', a3.shape, a3], ['a4', a4.shape, a4],
            ]
    for i in args:
        print(dedent(frmt).format(*i))
    return a

 

And here are the results

|-----------------------------------------------------

3D array .... a

Array ... a :..shape  (2, 3, 4)
[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]

# This is the base array...

3D array .... a0

Array ... a0 :..shape  (2, 4, 3)
[[[ 0  1  2]
  [ 3  4  5]
  [ 6  7  8]
  [ 9 10 11]]

 [[12 13 14]
  [15 16 17]
  [18 19 20]
  [21 22 23]]]

|-----------------------------------------------------

In any event, I prefer to think of a 3D array as consisting of ( Z, Y, X ) if they do indeed represent the spatial component.  In this context, however, Z is not simply taken as elevation, as might be the case for a 2D raster.  The mere fact that the first axis has a size of 2 or more indicates to me that it is a change array, for example the same location observed at different times.  Do note that the arrays need not represent anything spatial at all, but this being a place for GIS commentary, there is often an implicit assumption that at least two of the dimensions will be spatial.

 

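To go from the shape of array a to the shape of a0, and conversely, we need to rearrange the axes.  This can be accomplished using a variety of numpy functions, including rollaxis, swapaxes, transpose and einsum, to name a few.  Be aware that these functions move whole axes around, so the result has the (2, 4, 3) shape of a0 but its values are ordered differently from nums.reshape(2, 4, 3), which only re-chunks the flat sequence.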

 

The options can be summarized as follows:

R   rollaxis    - roll the chosen axis backwards until it sits in the specified position

E   einsum      - for now, just see the swapping of letters in the ijk sequence

S   swapaxes    - exchange the positions of two specified axes

T   transpose   - similar to swapaxes, but it can reorder all of the axes at once

 

 

# each line yields the same (2, 4, 3) array, the last two axes of 'a' exchanged
# (the values are reordered, so it is not equal to nums.reshape(2, 4, 3))
a0 = np.rollaxis(a, 2, 1)           #  a = np.rollaxis(a0, 2, 1)
a0 = np.swapaxes(a, 2, 1)           #  a = np.swapaxes(a0, 1, 2)
a0 = a.swapaxes(2, 1)               #  a = a0.swapaxes(1, 2)
a0 = np.transpose(a, (0, 2, 1))     #  a = np.transpose(a0, (0, 2, 1))
a0 = a.transpose(0, 2, 1)           #  a = a0.transpose(0, 2, 1)
a0 = np.einsum('ijk -> ikj', a)     #  a = np.einsum('ijk -> ikj', a0)
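
A quick check of the calls listed above, as a sketch that assumes nothing beyond numpy (the names moved and reshaped are mine, purely for illustration).  All of the axis-moving calls agree with one another and are undone by swapping the same two axes again, but none of them reproduces nums.reshape(2, 4, 3).

```python
import numpy as np

nums = np.arange(24)
a = nums.reshape(2, 3, 4)
reshaped = nums.reshape(2, 4, 3)       # a0 as built in num_37

moved = [np.rollaxis(a, 2, 1),
         np.swapaxes(a, 2, 1),
         a.swapaxes(2, 1),
         np.transpose(a, (0, 2, 1)),
         a.transpose(0, 2, 1),
         np.einsum('ijk->ikj', a)]

# all six calls give the same (2, 4, 3) array
same = all(np.array_equal(moved[0], m) for m in moved)

# swapping the last two axes a second time returns the original
round_trip = np.array_equal(moved[0].swapaxes(1, 2), a)

# same shape as the reshaped array, but a different element order
matches_reshape = np.array_equal(moved[0], reshaped)

print(same, round_trip, matches_reshape)   # True True False
print(moved[0][0])                         # first band, rows and columns exchanged
```

So the axis functions are the tool when the data has to keep its meaning (a row stays a row), and reshape is the tool when you only want the flat sequence cut into different sized pieces.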

 

When you move on to higher values for the first dimension, you have to be careful about which of these you can use, and it is generally better to use reshape or stride tricks to perform the reshaping.

|-----------------------------------------------------

3D array .... a1 and 3D array .... a2

Array ... a1 :..shape  (3, 2, 4)
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7]],

       [[ 8,  9, 10, 11],
        [12, 13, 14, 15]],

       [[16, 17, 18, 19],
        [20, 21, 22, 23]]])
Array ... a2 :..shape  (3, 4, 2)
array([[[ 0,  1],
        [ 2,  3],
        [ 4,  5],
        [ 6,  7]],

       [[ 8,  9],
        [10, 11],
        [12, 13],
        [14, 15]],

       [[16, 17],
        [18, 19],
        [20, 21],
        [22, 23]]])
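
The shapes used so far, plus the two that are not printed here (a3 and a4), are simply the orderings of the axis sizes 2, 3 and 4.  A minimal sketch, assuming only numpy and itertools (the name shapes is mine, not part of num_37), generates them instead of typing each one:

```python
from itertools import permutations

import numpy as np

nums = np.arange(24)

# every ordering of the axis sizes 2, 3 and 4
shapes = sorted(permutations((2, 3, 4)))

for shp in shapes:
    arr = nums.reshape(shp)
    # reshape never reorders the values, it only re-chunks the flat sequence
    assert np.array_equal(arr.ravel(), nums)
    print(shp, arr[0, 0])   # the first row of the first block
```

Other 3D shapes such as (2, 2, 6) or (1, 4, 6) also hold 24 values; they are left out here because the post sticks to the sizes 2, 3 and 4.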

|-----------------------------------------------------

3D array .... a2 to a conversion
>>> from numpy.lib import stride_tricks as ast
>>> back_to_a = a2.reshape(2, 3, 4)
>>> again_to_a = ast.as_strided(a2, a.shape, a.strides)
>>> back_to_a
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])
>>> again_to_a
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])
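
As a hedged aside, both conversions above return views on the original 24 values rather than copies, which np.shares_memory can confirm.  The sketch below rebuilds the arrays so it runs on its own.  Note that as_strided does no bounds checking on the shape and strides it is given, so reshape is the safer choice whenever it can do the job.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

nums = np.arange(24)
a = nums.reshape(2, 3, 4)
a2 = nums.reshape(3, 4, 2)

back_to_a = a2.reshape(2, 3, 4)
again_to_a = as_strided(a2, shape=a.shape, strides=a.strides)

# both routes reproduce 'a'
print(np.array_equal(back_to_a, a), np.array_equal(again_to_a, a))

# both results are views on the same 24 values, no data was copied
print(np.shares_memory(back_to_a, nums), np.shares_memory(again_to_a, nums))
```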

 

|-----------------------------------------------------

Now for something a little bit different

 

Array 'a' which has been used before.  It has a shape of (2, 3, 4).  Consider it as 2 layers or bands occupying the same space.

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

 

A second array, 'b', can be constructed using the same data, but shaped differently, (3, 4, 2).  The dimension consisting of two parts is effectively swapped between the two arrays.  It can be constructed from:

 

>>> x = np.arange(12)
>>> y = np.arange(12, 24)
>>>
>>> b = np.array(list(zip(x,y))).reshape(3,4,2)
>>> b
array([[[ 0, 12],
        [ 1, 13],
        [ 2, 14],
        [ 3, 15]],

       [[ 4, 16],
        [ 5, 17],
        [ 6, 18],
        [ 7, 19]],

       [[ 8, 20],
        [ 9, 21],
        [10, 22],
        [11, 23]]])

 

If you look closely, you can see that the numeric values from 0 to 11 are ordered in a 3x4 block in array 'a', but in array 'b' they appear as 12 entries in a single column, split between 3 subarrays.  The same data can be sliced from their respective array dimensions to yield

 

... sub-array 'a[0]' or ... sub-array 'b[...,0]'

yields

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
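
Since the only difference between 'a' and 'b' is where the two-part axis sits, one can be produced from the other by moving that axis.  A small sketch, assuming numpy 1.11 or later for np.moveaxis (the names b_from_a and a_from_b are mine):

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

x = np.arange(12)
y = np.arange(12, 24)
b = np.array(list(zip(x, y))).reshape(3, 4, 2)

# move the two-part axis of 'a' from the front to the back
b_from_a = np.moveaxis(a, 0, -1)        # same as a.transpose(1, 2, 0)
a_from_b = np.moveaxis(b, -1, 0)        # and back again

print(np.array_equal(b_from_a, b))      # True
print(np.array_equal(a_from_b, a))      # True
print(np.array_equal(a[0], b[..., 0]))  # True, the slices shown above
```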

 

The arrays can be raveled to reveal their internal structure.

>>> b.strides  # (64, 16, 8)
>>> a.strides  # (96, 32, 8)
a.ravel()...[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
b.ravel()...[ 0 12  1 13  2 14  3 15  4 16  5 17  6 18  7 19  8 20  9 21 10 22 11 23]
a0_r = a[0].reshape(3,4,-1)  # a0_r.shape = (3, 4, 1)
array([[[ 0],
        [ 1],
        [ 2],
        [ 3]],

       [[ 4],
        [ 5],
        [ 6],
        [ 7]],

       [[ 8],
        [ 9],
        [10],
        [11]]])
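
The stride values quoted above follow directly from the shape and the size of one item in bytes.  The sketch below assumes 8-byte integers (hence the explicit dtype) and uses a hypothetical helper, c_strides, that is not part of numpy:

```python
import numpy as np

a = np.arange(24, dtype=np.int64).reshape(2, 3, 4)
b = np.moveaxis(a, 0, -1)            # a view shaped (3, 4, 2), nothing is copied

def c_strides(shape, itemsize):
    """Byte strides of a freshly made C-ordered array (hypothetical helper)."""
    strides = []
    step = itemsize
    for n in reversed(shape):        # the last axis moves one item at a time
        strides.append(step)
        step *= n                    # each earlier axis skips a whole block
    return tuple(reversed(strides))

print(a.strides, c_strides(a.shape, a.itemsize))   # (96, 32, 8) twice
print(b.strides)                                   # (32, 8, 96): same bytes, axes relabelled
print(np.ascontiguousarray(b).strides)             # (64, 16, 8) once copied into C order
```

The middle line is the interesting one: moving an axis only shuffles the strides, which is why those operations are cheap even on large rasters.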

Enough for now.  Learning how to reshape and work with array structures can certainly make dealing with raster data much easier.

Key concepts: nulls, booleans, list comprehensions, ternary operators, condition checking, mini-format language

 

Null values are permissible when creating tables in certain data structures.  I have never had occasion to use them since I personally feel that all entries should be coded with some value which is either:

  • a real observation,
  • one that was missed or forgotten,
  • one for which there is truly no value because it couldn't be obtained, or
  • other

 

Null, None etc. don't fit into that scheme, but it is possible to produce them, particularly if people import data from spreadsheets and allow blank entries in cells within columns. Nulls cause no end of problems for people trying to query tabular data, concatenate data, or perform statistical or other numeric operations on fields that contain these pesky little things. I should note that setting a record to some arbitrary value is just as problematic as the null.  For example, values of 0 or "" in a record for a shapefile should be treated as suspect if you didn't create the data yourself.

 

NOTE:  This post will focus on field calculations using Python and not on SQL queries.

 

List Comprehensions to capture nulls

As an example, consider the task of concatenating data to a new field from several fields which may contain nulls (e.g. see this thread... Re: Concatenating Strings with Field Calculator and Python - dealing with NULLS). There are numerous ways to accomplish this; several will be presented here.

List comprehensions, truth testing and string concatenation can be accomplished in one fell swoop...IF you are careful.

This table was created in a geodatabase which allows nulls to be created in a field.  Fortunately, <null> stands out from the other entries serving as an indicator that they need to be dealt with.  It is a fairly simple table, just containing a few numeric and text columns.

image.png

The concat field was created using the python parser and the following field calculator syntax.

 

Conventional list comprehension in the field calculator

# read very carefully ...convert these fields if the fields don't contain a <Null>
" ".join(  [  str(i) for i in [ !prefix_txt!, !number_int!, !main_txt!, !numb_dble!  ] if i ]  )
'12345 some text more'

 

and who said the expression has to be on one line?

" ".join(
[str(i) for i in
[ !prefix_txt!, !number_int!, !main_txt!, !numb_dble!]
if i ] )

 

table_nulls_03.png

I have whipped in a few extra unnecessary spaces in the first expression just to show the separation between the elements.  The second one was just for fun and to show that there is no need for one of those murderous one-liners that are difficult to edit.

 

So what does it consist of?

  • a join function is used to perform the final concatenation
  • a list comprehension, LC, is used to determine which fields contain appropriate values which are then converted to a string
    • each element in a list of field values is cycled through ( for i in [...] section )
    • each element is checked to see if it meets the truth test (that is ... if i ... is True if the field entry is not null, and False for a null, but also for an empty string or a zero)
    • if the above conditions are met, the value is converted to a string representation for subsequent joining.
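
The truth test in that breakdown deserves a second look when numeric fields are involved, because a genuine zero is dropped along with the nulls.  A hedged sketch using plain variables in place of fields (zero is a made-up extra value, not one of the fields in the table):

```python
a = 12345; b = None; c = "some text"; d = ""; e = "more"
zero = 0   # hypothetical numeric field holding a genuine zero

# plain truth test: None, "" and 0 are all dropped
truthy = " ".join([str(i) for i in [a, b, zero, c, d, e] if i])

# explicit null test: only None is dropped, so 0 and "" survive
not_null = " ".join([str(i) for i in [a, b, zero, c, d, e] if i is not None])

print(repr(truthy))     # '12345 some text more'
print(repr(not_null))   # '12345 0 some text  more'  (note the gap left by "")
```

Which test you want depends on whether a zero or an empty string counts as data in your table.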

You can create your appropriate string without the join but you need a code block.

 

Simplifying the example

Let's simplify the above field calculator expression to make it easier to read by using variables as substitutes for the text, number and null elements.

 

List comprehension

>>> a = 12345; b = None; c = "some text"; d = ""; e = "more"
>>> " ".join([str(i) for i in [a,b,c,d,e] if i])


 

One complaint that is often voiced is that list comprehensions can be hard to read if they contain conditional operations.  This issue can be circumvented by stacking the pieces during their construction.  Python allows this kind of stacked layout inside any bracketed object, such as lists, tuples, arrays and function calls.  To demonstrate, the above expression can be written as:

 

Stacked list comprehension

>>> " ".join( [ str(i)               # do this
...           for i in [a,b,c,d,e]   # using these
...           if i ] )               # if this is True
'12345 some text more'
>>>

 

You may have noted that you can include comments on the same line as each constructor.  This is useful since you can in essence construct a sentence describing what you are doing.... do this, using these, if this is True...  A False condition can also be used but it is usually easier to rearrange your "sentence" to make it easier to say-do.

 

For those that prefer a more conventional approach you can make a function out of it.

 

Function: no_nulls_allowed

def no_nulls_allowed(fld_list):
    """provide a list of fields"""
    good_stuff = []
    for i in fld_list:
        if i:
            good_stuff.append(str(i))
    out_str = " ".join(good_stuff)   # join once, after the loop has finished
    return out_str
...
>>> no_nulls_allowed([a,b,c,d,e])
'12345 some text more'
>>>

 

Python's mini-formatting language...

Just for fun, let's assume that the values assigned to a-e in the example below, are field names.

Questions you could ask yourself:

  • What if you don't know which field or fields may contain a null value?
  • What if you want to flag to the user that something is wrong instead?

 

You can generate the required number of curly bracket parameters, { }, needed in the mini-language formatting.  Let's have a gander using variables in place of the field names in the table example above.  I will just snug the variable definitions up to save space.

 

Function: no_nulls_mini

 

def no_nulls_mini(fld_list):
    ok_flds = [ str(i) for i in fld_list  if i ]
    return ("{} "*len(ok_flds)).format(*ok_flds)

>>> no_nulls_mini([a,b,c,d,e])
'12345 some text more '

 

Ok, now for the breakdown:

  • I am too lazy to check which fields may contain null values, so I don't know how many { } to make...
  • we have a mix of numbers and strings, but we cleverly know that the mini-formatting language makes string representations of inputs by default, so you don't need to do the string-thing ( aka str( ) ); the str(i) in the function above is harmless but optional
  • we want a space between the elements since we are concatenating values together and it is easier to read with spaces

Now for code-speak:

  • "{} "  - curly brackets followed by a space is the container to put out stuff plus the extra space
  • *len(ok_flds)  - this will multiply the "{} " entry by the number of fields that contained values that met the truth test (ie no nulls)
  • *ok_flds  - in the format section will dole out the required number of arguments from the ok_flds list (like *args, **kwargs use in def statements)

Strung together, it means "take all the good values from the different fields and concatenate them together with a space in between"
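
One side effect of repeating "{} " is the trailing space in the result.  A possible variant, offered only as a sketch (the name no_nulls_mini_trim is hypothetical), joins the curly brackets with spaces instead and also skips the str( ) call, since format does that conversion itself:

```python
def no_nulls_mini_trim(fld_list):
    """Variant of no_nulls_mini without the trailing space (hypothetical name)."""
    ok_flds = [i for i in fld_list if i]          # no str() needed, format() converts
    frmt = " ".join(["{}"] * len(ok_flds))        # "{} {} {}" for three good values
    return frmt.format(*ok_flds)

a = 12345; b = None; c = "some text"; d = ""; e = "more"
print(repr(no_nulls_mini_trim([a, b, c, d, e])))  # '12345 some text more'
```

If every field is null the function returns an empty string, which may or may not be what you want to write to the table.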

 

Head hurt???  Ok, to summarize, we can use simple list comprehensions, stacked list comprehensions and the mini-formatting options

 

Assume  a = 12345; b = None ; c = "some text"; d = "" ; e = "more"

# simple list comprehension, only check for True
# a-e represent fields, typical construction
" ".join( [ str(i) for i in [a, b, c, d, e]  if i ]  )
12345 some text more

# if-else with slicing, do something if False
# advanced construction for an if-else statement, which uses a
# False,True option and slices on the condition
z = " ".join([[str(i),"..."][i in ["",'',None,False]]
              for i in [a,b,c,d,e]])
12345 ... some text ... more

# provide a field list to a function, and construct the string
# from the values that meet the condition
def no_nulls_mini(fld_list):
    ok_flds = [ str(i) for i in fld_list  if i ]
    return ("{} "*len(ok_flds)).format(*ok_flds)

# a conventional function, requires the empty list construction first,
# then acceptable values are added to it...finally the values are
# concatenated together and returned.
def no_nulls_allowed(fld_list):
    good_stuff = []
    for i in fld_list:
        if i:
            good_stuff.append(str(i))
    out_str = " ".join(good_stuff)
    return out_str

And they all yield..    '12345 some text more'   (the slicing version adds the "..." flags and no_nulls_mini leaves a trailing space)

 

Closing Tip

If you can say it, you can do it...

 

list comp = [ do this  if this  else this using these]

 

list comp = [ do this        # the Truth result
              if this        # the Truth condition
              else this      # the False condition
              for these      # using these
              ]

list comp = [ [do if False, do if True][condition slice]  # pick one
              for these                                   # using these
             ]
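
To make the three "sentences" concrete, here is one small instance of each, again a sketch with made-up names (vals, kept, flagged, sliced) and the same values used throughout the post:

```python
vals = [12345, None, "some text", "", "more"]

# do this ... for these ... if this   (a filter: falsy values are skipped)
kept = [str(v) for v in vals if v]

# do this  if this  else this ... for these   (every value yields something)
flagged = [str(v) if v else "..." for v in vals]

# [do if False, do if True][condition] ... for these
sliced = [["...", str(v)][bool(v)] for v in vals]

print(kept)                # ['12345', 'some text', 'more']
print(flagged)             # ['12345', '...', 'some text', '...', 'more']
print(flagged == sliced)   # True
```

Note where the if sits: after the for it filters, before the for (with an else) it chooses between two results.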

 

A parting example...

 

# A stacked list comprehension
outer = [1,2]
inner = [2,0,4]
c = [[a, b, a*b, a/float(b)]    # multiply and divide (outer/inner)
     if b                       # if != 0 (0 is a boolean False)
     else [a,b,a*b,"N/A"]       # if equal to zero, avoid division by 0
     for a in outer             # for each value in the outer list
     for b in inner             # for each value in the inner list
     ]
for val in c:
    print("a({}), b({}), a*b({}) a/b({})".format(*val))

# Now ... a False-True list from which you slice the appropriate operation
# both entries are built before the slice, so the divisor is guarded with (b or 1)
d = [[[a,b,a*b,"N/A"],                     # do if False
      [a,b,a*b,a/float(b or 1)]][b!=0]     # do if True ... then slice
     for a in outer
     for b in inner
     ]
for val in d:
    print("a({}), b({}), a*b({}) a/b({})".format(*val))
"""
a(1), b(2), a*b(2) a/b(0.5)
a(1), b(0), a*b(0) a/b(N/A)
a(1), b(4), a*b(4) a/b(0.25)
a(2), b(2), a*b(4) a/b(1.0)
a(2), b(0), a*b(0) a/b(N/A)
a(2), b(4), a*b(8) a/b(0.5)
"""

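One difference between the two constructions in the parting example is worth seeing for yourself: the if-else form only evaluates the branch it needs, while the False-True list builds both entries before slicing.  The sketch below counts calls to a small helper (ratio and calls are made-up names) to show it.  The helper guards against zero itself; an unguarded division in the second entry of the list would raise a ZeroDivisionError.

```python
calls = []

def ratio(a, b):
    """Record each call so we can see how often the division is attempted."""
    calls.append((a, b))
    return a / float(b) if b else "N/A"

pairs = [(a, b) for a in [1, 2] for b in [2, 0, 4]]

# if-else: ratio() runs only when b is non-zero
tern = [ratio(a, b) if b else "N/A" for a, b in pairs]
n_ternary = len(calls)

# False-True list: both entries are built first, then one is picked
del calls[:]
sliced = [["N/A", ratio(a, b)][b != 0] for a, b in pairs]
n_sliced = len(calls)

print(n_ternary, n_sliced)   # 4 6
print(tern == sliced)        # True
```

So the slicing trick is fine when both alternatives are cheap and safe to compute, and the if-else form is the better habit when one of them can fail.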
 

Pick what works for you... learn something new... and write it down Before You Forget ...