2016

### Py... blog

January 2016

# Rasters and arrays: backgrounder on shapes and dimensions

Posted by Dan_Patterson Jan 20, 2016

Introduction

This blog post is for the two other people in the world who might be remotely interested in rasters/arrays and how dimensionality applies to reshaping for subsequent analysis.  The topic was introduced in my previous blog post on block statistics.  I am gently warming the readers (both of you) up to moving/sliding window functions and how they work at a mechanical level.

The code below is for the record.  It is one incarnation of what is out there on the web.  Most of the comments and posts deal with the 'speed' and efficacy of the various approaches.  In my opinion, those discussions don't matter much... I have only needed primes, combinations and permutations once outside of an academic environment.  I will rely on this code should the need arise again.  So here it is, with an output example.

Nerd Note:

As of this date, the world's largest known prime is 2^74207281 - 1, a 22,338,618-digit number called M74207281.  It is the 49th Mersenne prime discovered.     http://www.mersenne.org/primes/?press=M74207281

In the next post, I will use this foundation to provide more background on working with block and moving window operations.

```python
"""Script:  primes_combos_demo.py
Author:  Dan.Patterson@carleton.ca
Purpose: Long forgotten, but it is done.
"""
import itertools
import numpy as np


def primes(n):
    """Return an array of primes, p < n.
    http://stackoverflow.com/questions/2068372/fastest-way-to-list-
    all-primes-below-n-in-python/3035188#3035188
    """
    assert n >= 2
    sieve = np.ones(n // 2, dtype=bool)
    for i in range(3, int(n**0.5) + 1, 2):
        if sieve[i // 2]:
            sieve[i*i // 2::i] = False
    return np.r_[2, 2*np.nonzero(sieve)[0][1::] + 1]


def divisor(n, prn=False):
    """Determine the divisors, products and cumulative products
    given a set of primes < n.
    """
    p_vals = primes(n)
    divs = [p for p in p_vals if (n % p == 0)]
    cump = np.cumprod(divs).tolist()
    divs = divs + cump
    prods = list(set(i*j for i in divs for j in divs))
    all_div = sorted(set(divs + prods))
    if prn:
        print("all", all_div)
    return all_div


def perm_sums(n, max_dim=3):
    """Determine the permutations of divisors whose product equals n,
    for arrays of 1 up to max_dim dimensions.
    """
    divs = divisor(n)
    combos = []
    for i in range(1, max_dim + 1):
        combos.extend(v for v in itertools.product(divs, repeat=i)
                      if np.prod(v) == n)
    return combos


def demo(n=36, prn=True):
    """Demo perm_sums."""
    p = primes(n)
    print("\nPrimes:...{}".format(p))
    all_div = divisor(n)
    print("Number:...{}\nAll:...{}".format(n, all_div))
    ndim = 4
    combos = perm_sums(n, max_dim=ndim)
    if prn:
        print("\nProducts for... {} - max_dim = {}".format(n, ndim))
        for i in combos:
            print(i)
    return combos


if __name__ == "__main__":
    """primes demo"""
    n = 36
    combos = demo(n, prn=True)
```

For the example in my last post, a 6x6 array of 36 unique numbers, these list the possible ways that the data can be reconfigured.

Primes:...[ 2  3  5  7 11 13 17 19 23 29 31]
Number:...36
All:...[2, 3, 4, 6, 9, 12, 18, 36]

Products for... 36 - max_dim = 4

The shapes follow: 1D/2D first, then 3D, then 4D.

(36,)

(2, 18)

(3, 12)

(4, 9)

(6, 6)

(9, 4)

(12, 3)

(18, 2)

(2, 2, 9)

(2, 3, 6)

(2, 6, 3)

(2, 9, 2)

(3, 2, 6)

(3, 3, 4)

(3, 4, 3)

(3, 6, 2)

(4, 3, 3)

(6, 2, 3)

(6, 3, 2)

(9, 2, 2)

(2, 2, 3, 3)

(2, 3, 2, 3)

(2, 3, 3, 2)

(3, 2, 2, 3)

(3, 2, 3, 2)

(3, 3, 2, 2)

If you specify a minimum window size, you can filter these down to the workable configurations.

Minimum window...[3, 3]

[(3, 12), (4, 9), (6, 6), (9, 4), (12, 3), (2, 3, 6), (2, 6, 3), (3, 3, 4), (3, 4, 3), (4, 3, 3), (2, 2, 3, 3)]
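The minimum-window filter itself is simple to sketch.  Nothing below comes from the original script: `shapes_for` and `filter_min_window` are hypothetical names, and the divisor search here is a brute-force stand-in for the primes-based version above.

```python
import itertools
from math import prod


def shapes_for(n, max_dim=4):
    """All shape tuples (1 to max_dim axes) whose product equals n."""
    divs = [d for d in range(2, n + 1) if n % d == 0]
    shapes = []
    for ndim in range(1, max_dim + 1):
        shapes.extend(s for s in itertools.product(divs, repeat=ndim)
                      if prod(s) == n)
    return shapes


def filter_min_window(shapes, min_win=(3, 3)):
    """Keep shapes whose last two axes can hold the minimum window."""
    r, c = min_win
    return [s for s in shapes if len(s) >= 2 and s[-2] >= r and s[-1] >= c]


print(filter_min_window(shapes_for(36), (3, 3)))
```

Only the last two axes are checked, since those are the rows and columns a window actually covers.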

That's all for now.  The cases for reshaping a raster in 1, 2, 3 or 4 dimensions are given above.

# Block functions in rasters...

Posted by Dan_Patterson Jan 18, 2016

### Block functions in rasters ...

Modified Jan 17, 2016


To begin, we will construct a simple array consisting of 36 numbers from 0 to 35.

The array is broken down into 3x3 blocks of numbers as can be seen in the output array sample.

Block functions are not used as often as they could be in GIS.  This type of function moves or "jumps" in steps in the X and Y directions, with the step specified by the block size.

Moving or sliding window functions are more common in GIS.  The window slides by one cell in the X and Y direction regardless of the shape, size and configuration of the window.

The block functions will be demonstrated in this post since they are simpler to comprehend.  The guiding principles for both types of functions are the same, differing only in their movement characteristics.

The input raster is represented in array format so that one can see the value that each "cell" possesses.  Each cell is of equal dimensions, varying only in its value.  When using a block function, the raster is tiled into sub-arrays/rasters using a "window".  In this example, a simple 3x3 window will be used.  Since the raster is broken down by this window in a jumping fashion, we end up nicely with 4 sub-rasters, each 3x3 in size.  Each sub-raster/array consists of 9 values.  It should be noted that they could contain nodata values, which should not be used in calculations.  Here, all sub-rasters contain valid integer values.

The values for the statistics shown represent the values for each sub-block.  For instance, the minimum has values of 0, 3, 18 and 21.  If you examine the original array, or the sample sub-rasters/arrays, you will see these values are indeed correct.  The difference is that each number now represents the 9 cells making up its 3x3 block.

The statistical parameters were derived using numpy functions and are shown below without explanation (for now).  Suffice it to say that dimensions 0 and 1 of the blocked array index the blocks, and dimensions 2 and 3 index the cells within each block, ergo, the values are calculated on the last 2 dimensions so that each result represents one block.
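The blocked array `a` referred to here can be built with a plain reshape and an axis swap.  A minimal sketch for the 6x6 case (the variable names are mine, not from any script in this post):

```python
import numpy as np

a0 = np.arange(36).reshape(6, 6)   # the 6x6 input raster
n = 3                              # block size
# split the row and column axes into (block, within-block) pairs,
# then swap so the two block axes lead: shape becomes (2, 2, 3, 3)
a = a0.reshape(6 // n, n, 6 // n, n).swapaxes(1, 2)
print(a.min(axis=(2, 3)))          # the per-block minimums
```

The printed minimums match the a_min result shown further down: 0, 3, 18 and 21.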

The formulae for the code-ophiles are as follows:

a_min = a.min(axis=(2, 3))
a_max = a.max(axis=(2, 3))
a_mean = a.mean(axis=(2, 3))
a_sum = a.sum(axis=(2, 3))
a_std = a.std(axis=(2, 3))
a_var = a.var(axis=(2, 3))
a_ptp = a_max - a_min

There are equivalents of the above for rasters/arrays that contain nodata values (i.e. np.nanmin, np.nanmax, np.nanmean, np.nansum, np.nanstd, np.nanvar).
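A quick sketch of the difference, on a toy block containing one NaN cell:

```python
import numpy as np

a = np.array([[0., 1.],
              [np.nan, 3.]])
print(a.min())        # nan: the plain functions propagate NaN
print(np.nanmin(a))   # 0.0: the nan* variants ignore the NaN cell
print(np.nanmean(a))  # mean of the 3 valid cells
```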

Now, if we take the results for the minimum, we can expand them back to the original form by expanding in both the X and Y directions.

expanded = arr_expand(a_min,3,3)
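arr_expand is not listed in this post; judging from the expanded output further down, np.repeat along both axes does the same job (a sketch, not the author's actual function):

```python
import numpy as np

a_min = np.array([[0, 3],
                  [18, 21]])
# repeat each block value 3 times along rows, then 3 times along columns
expanded = a_min.repeat(3, axis=0).repeat(3, axis=1)
print(expanded)
```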

There are minor omissions: dealing with nodata values, and with input arrays that aren't evenly divisible by the window size... but they are minor details and will be covered elsewhere.

The observer should also note that the array can be exported to a comma, space or tab delimited file and given an ASCII format header to give it real-world coordinates for use/import into ArcMap.  The whole process can also be accompanied with use of RasterToNumPyArray for export to NumPy and NumPyArrayToRaster if desired.  The sub-array types can also be unraveled and presented in tabular format, concatenated and brought back into ArcMap for use in graphing or in processes involving other arrays.
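As a sketch of that export step: an ESRI ASCII grid is just the array preceded by a six-line header, so np.savetxt can write one directly.  The corner coordinates and cell size here are made up for illustration.

```python
import os
import tempfile
import numpy as np

a = np.arange(36).reshape(6, 6)
# hypothetical georeferencing values for the ASCII grid header
header = ("ncols 6\nnrows 6\n"
          "xllcorner 300000\nyllcorner 5000000\n"
          "cellsize 10\nNODATA_value -9999")
out = os.path.join(tempfile.gettempdir(), "raster_demo.asc")
np.savetxt(out, a, fmt="%d", header=header, comments="")
```

The resulting .asc file can be brought into ArcMap with the ASCII To Raster conversion tool.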

Input array...

[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]
 [24 25 26 27 28 29]
 [30 31 32 33 34 35]]

Output array sample...

[[[[ 0  1  2]
   [ 6  7  8]
   [12 13 14]]

  [[ 3  4  5]
   [ 9 10 11]
   [15 16 17]]]

 [[[18 19 20]
   [24 25 26]
   [30 31 32]]

  [[21 22 23]
   [27 28 29]
   [33 34 35]]]]

Results...

Minimum...a_min
[[ 0  3]
 [18 21]]

Maximum...a_max
[[14 17]
 [32 35]]

Mean...a_mean
[[  7.  10.]
 [ 25.  28.]]

Sum...a_sum
[[ 63  90]
 [225 252]]

Std...a_std
[[ 4.97  4.97]
 [ 4.97  4.97]]

Var...a_var
[[ 24.67  24.67]
 [ 24.67  24.67]]

Range...a_ptp
[[14 14]
 [14 14]]

expanded array...expanded
[[ 0  0  0  3  3  3]
 [ 0  0  0  3  3  3]
 [ 0  0  0  3  3  3]
 [18 18 18 21 21 21]
 [18 18 18 21 21 21]
 [18 18 18 21 21 21]]

Summary: more examples will be given in the upcoming weeks...

# What is standard and not ... in Python

Posted by Dan_Patterson Jan 16, 2016

Some more information on finding Python and its modules, and on whether they are built-ins, modules of the standard distribution, or packages in the site-packages folder.

References:

29.12. inspect — Inspect live objects — Python 3.4.4 documentation (look at the table on Types and Members)

http://stackoverflow.com/questions/247770/retrieving-python-module-path just one of hundreds of examples.

| Type | Attribute | Description |
| --- | --- | --- |
| module | \_\_doc\_\_ | documentation string |
| module | \_\_file\_\_ | filename (missing for built-in modules) |

Tips and examples

A standard installation compatible with ArcMap and arcpy is assumed.

These:

• dir(module_name)
• help(module_name)

Look for a __file__ property in the function/property list.

The os module does...

```
>>> import os
>>> dir(os)
[ ... snip ... '__doc__', '__file__', '__loader__', '__name__', ... snip ... ]
>>> os.__file__
'C:\\Python34\\lib\\os.py'
>>>
```

The time module doesn't

```
>>> import time
>>> dir(time)
[ ... snip ... '__doc__', '__loader__', '__name__', ... snip ... ]
>>> time.__file__   # I will try anyway
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
AttributeError: 'module' object has no attribute '__file__'
```

Modules in the C:\PythonXX\lib\ folder will have a \_\_file\_\_ property.

```
>>> # from ... C:\Python34\Lib ... folder
>>> import collections
>>> collections.__file__
'C:\\Python34\\lib\\collections\\__init__.py'
```
Modules that aren't part of the standard distribution and have to be installed via pip or other means get put in the site-packages folder, and they too will have a \_\_file\_\_.

```
>>> import arcpy
>>> arcpy.__file__
'C:\\Program Files\\ArcGIS\\Pro\\Resources\\ArcPy\\arcpy\\__init__.py'
>>>
```

And my fav...

```
>>> import numpy as np
>>> np.__file__
'C:\\Python34\\lib\\site-packages\\numpy\\__init__.py'
```
You can use the inspect module as well... all the cool programmers do ...

Numpy was previously imported as np.

Notice the results are the same as the np.__file__ approach.

```
>>> import inspect
>>> inspect.getfile(np)
'C:\\Python34\\lib\\site-packages\\numpy\\__init__.py'
>>>
```

Let's try time.

```
>>> inspect.getfile(time)
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
  File "C:\Python34\lib\inspect.py", line 518, in getfile
    raise TypeError('{!r} is a built-in module'.format(object))
TypeError: <module 'time' (built-in)> is a built-in module
>>>
```

No... it is a built-in.
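The two checks — built-in or not, \_\_file\_\_ or not — can be rolled into a small helper.  `where_is` is a hypothetical name, not part of any module discussed here:

```python
import importlib
import sys


def where_is(modname):
    """Report whether a module is built-in or, if not, where its file lives."""
    if modname in sys.builtin_module_names:
        return "built-in"
    mod = importlib.import_module(modname)
    return getattr(mod, "__file__", "no __file__ attribute")


print(where_is("time"))   # built-in
print(where_is("os"))     # a path ending in os.py
```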

But... you can get at it another way...

```
>>> import time
>>> inspect.getmodule(time)
<module 'time' (built-in)>
>>>
```

# Circular mean for directional data

Posted by Dan_Patterson Jan 10, 2016

Original:    Aug 2013

Modified:  Sept 2015  to include Numpy version

Reposted:  Jan 2016

General references

-   SciPy implementation: scipy/morestats.py at master · scipy/scipy · GitHub, lines 2601 - 2642, using complex numbers

When working with angles or time (minutes, days, months or years), the mean (average) of a list of values can't be determined in the same fashion as we normally do.  This example shows how to calculate the mean of angles (azimuths specifically).  The function can be used in any script once the circular_mean.py script is imported.

I will get around to other descriptive measures for circular data, but they are covered in the references.

```"""
CircularMean
Author: Dan.Patterson@carleton.ca
Date:     Aug 2013
Modified: Sept 2015
Purpose:
To calculate the mean (average) for points along a circle using the
angular measure to these points from the center of the circle
Notes:
If the angles are given in degrees then there is no need to convert
If the units are time, such as hours (24 hour clock) or months (12) then
these need to be converted by
angle in degrees = time quantity / number of equal intervals
ie 0.5 = (12 hours / 24 hours per day)
or 180 degrees
required modules
numpy, math
"""
import math
import numpy as np
def circ_mean(angles):
"""angles in degrees"
"""
cosList = []
cosSum = 0.0
sinSum = 0.0
for i in angles:
cosSum += theCos
sinSum += theSin
N = len(angles)
C = cosSum/N
S = sinSum/N
theMean = math.atan2(S,C)
if theMean < 0.0:
return theMean
def circ_mean_np(angles,azimuth=True):
""" numpy version of above"""
if azimuth:
ang_deg = np.mod(ang_deg,360.)
if __name__ == "__main__":

sampleData = [ [350,10],
[350, 5],
[355, 10],
[90,270],
[180,0],
[180,360],
[0,360],
[90,180,270]
]
print("\nCircular Mean Demo...python version\n")
frmt ="{:15s}{:15s}{:15s}"
for angles in sampleData:
theMean = circ_mean(angles)
args = (angles, theMean, math.degrees(theMean))
print("{!s:>15}{:>12.4}{:>15.4}".format(*args))
print("\nCircular Mean Demo...numpy version\n")
frmt ="{:15s}{:15s}{:15s}"
#
for angles in sampleData:
result = circ_mean_np(angles)
print("{!s:>15}{:>12.4f}{:>15.1f}".format(angles, result,result))
```

Results from both

```
angles          Mean radians   Mean degrees
[350, 10]              6.283          360.0
[350, 5]                6.24          357.5
[355, 10]            0.04363            2.5
[90, 270]              3.142          180.0
[180, 0]               1.571           90.0
[180, 360]             4.712          270.0
[0, 360]               6.283          360.0
[90, 180, 270]         3.142          180.0
```
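The SciPy reference above computes the same thing with complex numbers: average the unit vectors exp(1j*theta) and take the angle of the sum.  A minimal sketch (`circ_mean_complex` is my name for it, not SciPy's API):

```python
import cmath
import math


def circ_mean_complex(angles_deg):
    """Circular mean in degrees via the complex-number formulation."""
    z = sum(cmath.exp(1j * math.radians(a)) for a in angles_deg)
    return math.degrees(cmath.phase(z)) % 360.0


print(circ_mean_complex([355, 10]))   # about 2.5
print(circ_mean_complex([90, 270]))   # about 180.0
```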
