Python Blog - Page 18

DanPatterson_Retired
MVP Emeritus

Created:   2016-03-01

Modified:  2016-08-13

Attachments:

    Script attached in zip file below.

References:

This is a work in progress

This example shows how to generate column formats from the data itself and how to generate a header from the tabular information.  The simplest way to obtain a useful table for manipulation in other software is to use the FeatureclassToNumPyArray or TableToNumPyArray methods in arcpy.  A sample output for such a table is shown in the example.

In the example below, the following procedure was used:

  • the first row of the table is used to obtain the column widths and data types
  • an appropriate format string is used to create the column structure (NOTE: as in the previous example, a vertical bar, |, denotes a column's starting position and the underscore, _, denotes leading or trailing spaces)
  • the format string and the number of columns in the table create the line format
  • combining that information with the column widths gives the resultant column format
  • optionally, the larger of the column name length and the data width can be used to modify the column widths
  • finally, a column header generated from the input data is produced and the line is printed

The next example will show how to do all of this in one pass, with options to base column widths on the data size, to maintain column size and alignment, to set record delimiters, and to include ancillary information related to the table.

Format demo

Input array....
[(0, [300000.0, 5025000.0], 12345, 1.2345, b'string', 'unicode')]

format string: |_{{!s:<{}}}_

column widths: [2, 21, 5, 6, 9, 7]

column type:   ['int32', 'ndarray', 'int32', 'float64', 'bytes_', 'str_']

data dtype:    ['i', 'f', 'i', 'f', 'S', 'U']

line format.

...  |_{{!s:<{}}}_|_{{!s:<{}}}_|_{{!s:<{}}}_|_{{!s:<{}}}_|_{{!s:<{}}}_|_{{!s:<{}}}_

resultant:..|_{!s:<2}_|_{!s:<21}_|_{!s:<5}_|_{!s:<6}_|_{!s:<9}_|_{!s:<7}_


Column header... |.ID.|..........XY...........|..C_0..|..C_1...|....C_2....|...C_3...

Output line..... |_0 _|_[  300000.  5025000.]_|_12345_|_1.2345_|_b'string'_|_unicode_
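
For readers who want to try it, here is a minimal sketch of the assembly shown above (the widths and the column format string come straight from the demo; the variable names and the final print are my own scaffolding):

widths = [2, 21, 5, 6, 9, 7]           # column widths from the demo
col_frmt = "|_{{!s:<{}}}_"             # one column: |_ value _, left-justified
line_frmt = col_frmt * len(widths)     # repeat the pattern for each column
resultant = line_frmt.format(*widths)  # plug the widths into the pattern
print(resultant)
# |_{!s:<2}_|_{!s:<21}_|_{!s:<5}_|_{!s:<6}_|_{!s:<9}_|_{!s:<7}_
row = (0, [300000.0, 5025000.0], 12345, 1.2345, b'string', 'unicode')
print(resultant.format(*row))          # format one record with it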


More later...

DanPatterson_Retired
MVP Emeritus

The other blog on list comprehensions (or comprehensions more generally) is getting too big to edit and maintain easily, so consider this part 2.  Part I can be found at List comprehensions...

Standard field calculator function

Code block

def func(a, b):
    if a == -99:
        if b != -99:    # a is null, b is valid
            ans = -97
        else:           # both are null
            ans = -99
    else:
        if b == -99:    # a is valid, b is null
            ans = -98
        else:           # both valid... do the subtraction
            ans = float(a) - b
    return ans

Expression box    func(!Fld_0!, !Fld_1!)

Readings:

Review:

  • [ list comprehension ]
    • lc = [ x for x in 'abcd']
    • ['a', 'b', 'c', 'd']
  • ( generator)
    • gen = ( x for x in list('abcd') )
    • >>> gen.next()
    • 'a'
  • { set comprehension }
    • sc = { x for x in 'abcdaabbccdd' }
    • set(['a', 'c', 'b', 'd'])
  • { dictionary comprehension }
    • dc ={ x: x*2 for x in 'abcd'}
    • {'a': 'aa', 'c': 'cc', 'b': 'bb', 'd': 'dd'}

To use a list comprehension as the expression, you still need a code block, since the field calculator doesn't support iterable objects and the like.  You can also use a generator, as in the second example.

Since a list comprehension returns a list, you need to append index notation ... [0] ... to indicate that the first element of the list is returned.  This isn't really an issue, since the data are processed one row at a time, so only one value is ever returned.  You can use a generator expression if you wish to skip the indexing.  Remember: square brackets for list comprehensions, round brackets for generators, and curly brackets for set and dictionary comprehensions.

Now, this thread https://community.esri.com/thread/171857 posed the interesting question of what to do when you have been conscientious and used multiple null values to represent several conditions of null.  In the example, -99, -98 and -97 were used.  The desire was to reclassify the nulls whilst doing "something" with the remaining values in the two fields being examined.  In the code examples above, it was decided to retain the null value ranges while performing a simple subtraction for non-null values.  The results are shown in the figure.

As a cautionary note, this example only works IF the subtraction cannot produce values in the range -99 to -97.  The solution?  Instead of using a base value of -98, I could easily have substituted -9998 (or something similar), which would have retained the intent of the nulls yet permitted the subtraction to proceed.

It should be obvious, then, that all you need to do is find a base value for the nulls that lies outside the range of possible results.
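
As an illustration, here is a minimal sketch of that variant (the -9997 and -9999 companions to the -9998 base suggested above are my assumptions):

def func2(a, b):
    """as func above, but the null codes are moved to -9997..-9999
       so they cannot collide with real subtraction results"""
    if a == -99:
        return -9997 if b != -99 else -9999
    if b == -99:
        return -9998
    return float(a) - b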

List comprehension code blocks

Code block

def LC_func(a, b):
    return [(a + abs(a - b)) if ((a - b) <= 1) else min(a, (a - b))][0]

Expression box     LC_func( !Fld_0!, !Fld_1!)

OR

Code block 

def LC_func2(a, b):
    return (a + abs(a - b)) if ((a - b) <= 1) else min(a, (a - b))

Expression box      LC_func2( !Fld_0!, !Fld_1!)

Tabular output
LC_demo.png

curtvprice
MVP Alum

The following script gives you the ability to fully customize your python environment.

The example below is set up for ArcGIS 10.2 with ArcGIS 64-bit background geoprocessing and both 32-bit and 64-bit Anaconda installed.

You can tweak this to your heart's content to adapt to all your favorite Python environments!

# Startup script to link Anaconda python environment with ArcGIS
#
# 1. Install Anaconda, setup environment to match your ArcGIS version
# 2. Edit the paths below
# 3. Put this startup script in the startup folder
#    Startup folder can be found with: "C:\Python27\ArcGIS10.2\python -m site --user-site"
#    Usually will be:
# C:\Users\%USERNAME%\AppData\Roaming\Python\Python27\site-packages


# edit these paths to match your setup


arcver = "10.2"
# Anaconda home folders
conda32 = r"D:\Users\cprice\Anaconda"
conda64 = r"D:\Users\cprice\Anaconda64"
# here are the conda environments you've set up to use with ArcGIS
# arc1022 is the environment setup for ArcGIS
conda_env32 = "{}/envs/{}".format(conda32, "arc1022")
conda_env64 = "{}/envs/{}".format(conda64, "arc1022")


# do not edit below this line


# ArcGIS Python home folders
# i.e. C:\Python27\ArcGIS10.2
arcver = arcver[:4]
arcpy32 = r"C:\Python27\ArcGIS{}".format(arcver)
arcpy64 = r"C:\Python27\ArcGISx64{}".format(arcver)

import sys
import os

try:
  if sys.version.find("64 bit") < 0:
    conda_path = os.path.normpath(conda_env32)
    arcpy_path = os.path.normpath(arcpy32)
    arcpy_pthfile = arcpy_path + "/lib/site-packages/desktop{}.pth".format(arcver)
  else:
    conda_path = os.path.normpath(conda_env64)
    arcpy_path = os.path.normpath(arcpy64)
    arcpy_pthfile = arcpy_path + "/lib/site-packages/DTBGGP64.pth"


  # If running ArcGIS python, add conda modules to path
  if sys.prefix.lower().find("arcgis10") != -1:
    conda_site = os.path.join(conda_path, "lib", "site-packages")
    if not os.path.isdir(conda_site):
      raise Exception("conda site-packages not found: {}".format(conda_site))
    sys.path.append(conda_site)
    ##print("added conda paths to arc")

  # if running Anaconda add arcpy to path
  elif sys.prefix.lower() == conda_path.lower():
    with open(arcpy_pthfile, "r") as f:
      sys.path +=  [p.strip() for p in f.readlines()]
    ##print("added arcpy paths to conda")

except Exception as msg:
  print(msg)
  pass
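
Once the script is in place, a quick way to confirm the linkage is to import a module that lives only in the conda environment (pandas here is just an example, not something the script requires):

# run from the ArcGIS python prompt after the startup script is installed
import sys
print(sys.prefix)        # should be the ArcGIS python home
import pandas            # any module installed only in the conda env
print(pandas.__file__)   # should point into the conda site-packages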

DanPatterson_Retired
MVP Emeritus

I know ... I know... you can do that in Window's Explorer ...

find stuff ... sort stuff ... check modification date ...

but nothing is nicer than having a hardcopy in your hand to document where things are at a point in time.

To this end, I began poking around in the os module and the arcpy.da module, looking at the options for traversing paths.  This led me to some amazing threads on GeoNet and elsewhere about documenting data sources and types.  Many people went to great lengths to develop the Amazing-o-rama that would document everything ... everywhere and at any time.

Me?  I know where all my data are.

My needs are simple...I have a couple of folders of shapefiles, a couple of folders of grids, some *.tif and *.sid things and more recently...from cajoling by some nameless cohorts on this forum...a real file geodatabase.

I generally don't make maps, I generally don't need real world data and when it comes to most things, a geometry type is a geometry type, maybe with its own personality, but the same-ish is good enough.

What I really need to do is find simple things. Python scripts...text files...*.png images for manuals, *.wpd (yes WordPerfect) and the odd *.doc/docx file.

This led me to the distraction ... simply explore the os module, specifically os.walk, which of course led me to arcpy.da.walk and, somewhere along the path, the time and datetime modules.  I got sidetracked into Python's Mini Formatting Language, the collections module (again!) and base Python object properties, and how some properties are not as overtly exposed as others.  The latter will be the focus of my next post.

So for now...here is my documentation and the script that meets my simple needs.

In the if __name__ == '__main__':  section, you can change the source folder and the file extension you are looking for.  All the other 'crap' ... the creation/modification/some-other-time options ... has been stripped back.

I have written in a style as noted in the script header.  I could turn it into a toolbox tool...but why...I have emailed it to myself...could use "the cloud", but that sounds pretentious and downright '60's.

"""
Script:    files_in_path.py
Path:      F:\A0_Main\
Author:    Dan.Patterson@carleton.ca

Created:   2015-05-25
Modified:  2015-05-25  (last change date)
Purpose:   To find all scripts with an 'ending' in a path
Notes:
  As verbose as possible. No error checking is intentional, to minimize
  bloat to permit verbosity (If you follow that...good).
References:
Size .......
  PEP 8: "For sequences, (strings, lists, tuples), use the fact that empty
          sequences are false."
- this applies to any sequence that has a len, __len__ method, such as
  collections.Counter, collections.OrderedDict etc
  - see empty_tests.py for more discussion
  >>> objs = [[], (), {}, "", [1], (1,), {1: "one"}, "1", None, True, 1, False, 0]
  >>> is_empty = [not i for i in objs]  # not used, but useful
  >>> print is_empty
  - for NumPy arrays use size:
      x = np.array([[], []]) ==> len(x) == 2, x.size == 0
Time ......
http://stackoverflow.com/questions/237079/

http://gis.stackexchange.com/questions/48537/how-to-make-a-gis-inventory

"""
import sys
import os
import datetime
def mod_time(folder,case):
    """obtain the modified time for the file and path
       get the file name, the modification time,
       convert to readable time, then format
    """
    f = os.path.join(folder, case)
    t = os.path.getmtime(f)
    out = datetime.datetime.fromtimestamp(t)
    out = '{:%Y_%m_%d %H:%M:%S}'.format(out)
    return out
def os_walk(src_folder,ending="py"):
    """folder, file type ending, return creation time... returns a message
       Now walk the walk, getting the files that have the specified ending
       (ie. *.py) in the folder and its subfolders.
    """
    msg = ("\n.....File path: ..... {}".format(src_folder))
    for folder, subfolders, files in os.walk(src_folder):
        case_files = [ f for f in files if f.endswith(ending)]
        if case_files:
            counter = 0
            msg += ("\nFolder:... {}".format(folder))
            for case in case_files:
                counter+=1
                t = mod_time(folder,case)
                msg += ("\n ({:02d}) {: <25s} {}".format(counter,case,t))
            del case,case_files
    return msg
#----------------------------------------------------------
if __name__ == '__main__':
    """change the folder, ending etc below """
    src_folder =  r"F:\Writing_Projects" #r"F:\\"
    ending = "py"
    get_time = True
    #
    msg = os_walk(src_folder,ending=ending)
    print msg

Oh yes... I forgot... highlight...copy...paste the output into Notepad++ or your word processor and print/save from there.  I could write the code, but someone has to have homework.
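
For anyone who wants to skip the homework, a minimal sketch (the output file name is my assumption):

# write the walk report to a text file instead of copying and pasting
out_file = r"F:\A0_Main\files_report.txt"   # assumed destination
with open(out_file, "w") as f:
    f.write(msg)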

Enjoy

DanPatterson_Retired
MVP Emeritus

Inspiration... Re: Calculate field conditional statement using a date formula

Background...

List comprehensions...

List comprehensions... 2

Bisect method

>>> dts = [1,28,29,91,92,182,183,365,366] 
>>> bins = [28,91,182,365,400]
>>> import bisect
>>> a = [ bisect.bisect_left(bins,i) + 1  for i in dts]
result
a = [1, 1, 2, 2, 3, 3, 4, 4, 5]
Numpy method

>>> dts = [1,28,29,91,92,182,183,365,366] 
>>> bins = [28,91,182,365,400]
>>> import numpy as np
>>> b = np.digitize(dts, bins, right=True) + 1
result
list(b) = [1, 1, 2, 2, 3, 3, 4, 4, 5]

It really bugs me that doing reclassification is such a pain whether you are using plain python or other software.  This situation is only amplified if you are trying to do this in the field calculator.

Things can be made much easier using either numpy or the bisect module, whether the data are text or numbers.  The table to the right shows how to do this with the same data set using the bisect and numpy modules.

The intent in the cited thread was to reclassify values onto an ordinal/interval scale depending upon threshold values.  This normally leads to a multi-tiered if statement.  A simple list comprehension isn't much help either, due to the nature of the classification.  To expedite matters, some method of parsing the data into the appropriate class is needed.

The bisect module and its methods allow for this, provided you can supply a sequential classification scheme.  The data can be fed into the method within a list comprehension to return the values for a whole list of data at once.  If this method is to be used in the field calculator, a generator expression needs to be used, since the field calculator, and cursors in general, read their values sequentially from rows in a table.
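
Here is a minimal sketch of what that looks like as a field calculator code block (the bins come from the example above; the field name !days! is an assumption):

import bisect
bins = [28, 91, 182, 365, 400]   # upper class boundaries

def reclass(val):
    """return the 1-based class for a single row value"""
    return bisect.bisect_left(bins, val) + 1

Expression box    reclass(!days!)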

Reclass methods... numpy and bisect module

[Images: np_reclass0.png (numpy reclass method), np_reclass.png (numpy results), bisect_reclass.png (bisect results)]

The results and the code are shown in the images.

Oh yes... I forgot about text... here are two examples, one going from text to numbers and the second from text to text (apologies to dog owners).

Homework: bisect_reclass1.png, bisect_reclass2.png

Should you have any other unique situations, please forward them on to me.

DanPatterson_Retired
MVP Emeritus

List comprehensions (LCs)    Note:  Resurrected from another one of my places.

  • Do you wince when you use map and filter because you can't figure out LCs?
  • Do you rail against modernity and stick with for and if statements?
  • Do you always have to have that smoking one-liner?  Are you ashamed when you can't get it?
  • Did you know that you can put comments in LCs?
  • Did you know that LCs can be split onto separate lines?  Is that sacrilege?
  • Which do you like better ... form (a), (b) or (c)?

Basic LC forms

Traditional for loop

c = []
for i in range(20):
    if (i > 5) and (i < 10):
        c.append(i**2)

The smokin' cryptic one-liner

a = [i**2 for i in range(20) if i > 5 and i < 10]

The stacked, commented multi-liner

b = [i**2               # do this
     for i in range(20) # using these
     if (i > 5) and     # where this and
        (i < 10)        # this is good
     ]

Condition checking with an if-else clause

Again, the three methods, but this time the condition checking is done up front with an if-else clause.  This query simply checks whether the elements in a list are all numbers.  If True, return the number; if False, assign -999.

  nums = [0,1,None,3,None,5,6]   # the list of numbers to check

conventional approach

>>> good = []
>>> for val in nums:
...  if isinstance(val, (int,float)):
...   good.append(val)
...  else:
...   good.append(-999)
...

basic list comprehension

>>> good = [ val if isinstance(val,(int,float)) else -999 for val in nums]
>>> 

stacked list comprehension

>>> good = [
... val         # return the value
... if isinstance(val,(int,float))   # if this is true
... else -999   # return this otherwise
... for val in nums  # for each value in the list
... ]

in all cases

>>> good
[0, 1, -999, 3, -999, 5, 6]

More on condition checking

Sometimes you want to perform an operation only under certain conditions.  In the example below, a check is made to ensure that "b" is not zero, to avoid division by zero.  If it is zero, then an alternate value is supplied.

There are two variants of the LC shown and the results are compared as a final step...examine closely.

outer = [1,2]
inner = [2,0,4]
c = [[a, b, a*b, a*b/1.0]  # divide 2 numbers (outer/inner)
     if b                  # if != 0 (0 is a boolean False)
     else [a,b,a*b,"N/A"]  # if equal to zero, do this
     for a in outer        # for each value in the outer list
     for b in inner        # for each value in the inner list
     ]
for val in c:
    print("a({}), b({}), a*b({}) a/b({})".format(*val))

d = [[[a,b,a*b,"N/A"],           # do if False
      [a,b,a*b,a*b/1.0]][b!=0]   # do if True ... then slice
     for a in outer
     for b in inner
     ]
print("d == c??? {}".format(d==c))

a(1), b(2), a*b(2) a/b(2.0) 
a(1), b(0), a*b(0) a/b(N/A) 
a(1), b(4), a*b(4) a/b(4.0) 
a(2), b(2), a*b(4) a/b(4.0) 
a(2), b(0), a*b(0) a/b(N/A) 
a(2), b(4), a*b(8) a/b(8.0) 

d == c??? True

The examples shown here are just the beginning.  The reader should be aware that there are dictionary and set comprehensions as well.  This topic will also serve as a gentle introduction to generators.
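
As a gentle first taste of generators, a minimal sketch using the same nums list as above:

>>> nums = [0, 1, None, 3, None, 5, 6]
>>> gen = (val if isinstance(val, (int, float)) else -999 for val in nums)
>>> next(gen)      # values are produced one at a time, on demand
0
>>> list(gen)      # exhaust the rest all at once
[1, -999, 3, -999, 5, 6]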

If you would like to see other examples, please send me a note.

Additions

What is the difference between a list comprehension and a set comprehension?  Basically, one extra processing step.  As seen below, the set_comp result is complete as soon as the query finishes, whereas the list_comp result would still need a set() operation applied after the fact, should a set truly be required.

>>> a = [1,2,1,1,2,3,3,2,1,1,1,2,3,4,9,8,1,7,7,7,7,7]
>>> list_comp = [ i for i in a if i >3 ]
>>> set_comp = { i for i in a if i > 3 }
>>> list_comp
[4, 9, 8, 7, 7, 7, 7, 7]
>>> set_comp
set([8, 9, 4, 7])

>>> a = [5,6,5,6,5,6,5,6,5,6,4,4] 
>>> b = [5,6] 
>>> lc = [ x*y for x in a for y in b] 
>>> sc = { x*y for x in a for y in b } 
>>> lc
[25, 30, 30, 36, 25, 30, 30, 36, 25, 30, 30, 36,
 25, 30, 30, 36, 25, 30, 30, 36, 20, 24, 20, 24]
>>> sc
{24, 25, 36, 20, 30}
>>>

DanPatterson_Retired
MVP Emeritus

---- Topic: Query and summarize revisited ----

This post https://community.esri.com/thread/171629 brought up the sometimes difficult task of finding information within tabular data.  As usual, there are many ways to do the same thing... this is but one.

Pre-tasks: Use TableToNumPyArray to convert your tabular information into an array.  It is pretty fool-proof.

TableToNumPyArray—Help | ArcGIS for Desktop

The trick to remember is that each record in an array is treated as unique unless found otherwise.  This makes it particularly easy to sort records on multiple columns and summarize and/or extract what you need from within.  In this example, I concentrated on printing.
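
As an aside, here is a minimal sketch of that multi-column sort (the dtype mirrors the example below; the rows are my own):

>>> import numpy as np
>>> dt = [('col_0', 'U5'), ('col_1', '<i4'), ('col_2', '<i4')]
>>> a = np.array([('c', 20, 1), ('a', 50, 4), ('a', 15, 5)], dtype=dt)
>>> np.sort(a, order=['col_0', 'col_1'])   # sort by class, then by value
array([('a', 15, 5), ('a', 50, 4), ('c', 20, 1)], dtype=[('col_0', '<U5'),... ])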

I used a simple list comprehension to show how to pull the records out according to the first column, which we can treat as a class field.  Once the classes were determined, the list comprehension grouped the data into the classes, and each class could have further information extracted/summarized.

The output produced can be expanded upon.  Should you want to partition the data into separate datasets, you can do it while the data are in array form.  I hope you get some ideas on breaking a problem down into its constituent parts and using the tools available to you.  For ArcGIS Pro, the new Python suite contains Pandas, which provides another alternative for the same process.

I am also sure that someone can come up with an SQL statement that does some... but not all... of the tasks outlined here.

Summarizing data

>>> # ---- simplify a step or two ----
>>> # - The data... just in array format
>>> import numpy as np
>>> arr_data = [('a', 50, 4), ('c', 20, 1),
                ('a', 15, 5), ('e', 40, 4), 
                ('a', 35, 2),('b', 100, 5),
                ('c', 80, 3), ('d', 100, 3), ('e', 60, 2)]
>>> dt =[('col_0', np.str, 5),
         ('col_1','<i4'),
         ('col_2','<i4')]
>>> a = np.array(arr_data, dtype=dt)
>>> a.reshape((-1,1))
array([[('a', 50, 4)],
       [('c', 20, 1)],
       [('a', 15, 5)],
       [('e', 40, 4)],
       [('a', 35, 2)],
       [('b', 100, 5)],
       [('c', 80, 3)],
       [('d', 100, 3)],
       [('e', 60, 2)]], 
      dtype=[('col_0', '<U5'),... ])
>>>
>>> uni_info = np.unique(a, True, True)
>>> vals, idx, inv = uni_info  
>>> # vals is the data,
>>> # the others are for reshaping the array

Now for some output

>>> vals.reshape((-1,1))
array([[('a', 15, 5)],
       [('a', 35, 2)],
       [('a', 50, 4)],
       [('b', 100, 5)],
       [('c', 20, 1)],
       [('c', 80, 3)],
       [('d', 100, 3)],
       [('e', 40, 4)],
       [('e', 60, 2)]], 
      dtype=[('col_0', ... ])
>>> # returns array(['a', 'b', 'c', 'd', 'e'],dtype='<U5') 
>>> uni = np.unique(vals['col_0'])      
>>> subs = [ vals[vals['col_0']==i] for i in uni ]
>>>
>>> for sub in subs:
...  n = len(sub['col_0'])
...  t = sub['col_0'][0]
...  val_max = np.max(sub['col_1'])
...  val_min = np.min(sub['col_1'])
...  frmt = "type {} -N: {} -max: {} -min: {}\n  -sub {}"
...  print(frmt.format(t, n, val_max, val_min, sub))
...  
type a -N: 3 -max: 50 -min: 15
  -sub [('a', 15, 5) ('a', 35, 2) ('a', 50, 4)]
type b -N: 1 -max: 100 -min: 100
  -sub [('b', 100, 5)]
type c -N: 2 -max: 80 -min: 20
  -sub [('c', 20, 1) ('c', 80, 3)]
type d -N: 1 -max: 100 -min: 100
  -sub [('d', 100, 3)]
type e -N: 2 -max: 60 -min: 40
  -sub [('e', 40, 4) ('e', 60, 2)]
>>>

NOTE:

A pdf version was added.  It contains more commentary on the process.  A second example was added on 2015-02-18.

curtvprice
MVP Alum

Sometimes the Python environment can get broken so ArcGIS will not work right; import arcpy fails, or tools will not open. Not an uncommon problem. A full install of ArcGIS includes both x32 (ArcMap) and x64 (background GP) Python 2.x installations, and ArcGIS Pro if installed adds a third one (x64 Python 3.4).

The Python environment often gets broken when a non-ArcGIS installation of Python has been installed and the system paths are altered.  Although there are ways to manage multiple Python environments, the Windows file associations, etc. must be kept clean for ArcGIS to access Python directly.  If you do install a third-party Python, say, Anaconda, be sure NOT to check the boxes to modify the default Windows PATH environment and file associations.

If your Python setup is not working, and you having trouble resolving the issues, here is a workflow that should get ArcGIS Python functions working again:

1. Close ArcGIS

2. Make sure your PYTHONPATH is blank and your PATH environment is clear of Python references (setting env variables); a quick check is sketched after this list

3. Uninstall any Python and additions (PythonWin etc) you see in Add/Remove Programs

4. Make sure your ArcGIS installer media is available  (DVD or folder with the installer Setup.exe / Setup.msi files)

5. Rename Python folders on C drive

For example: rename C:\Python27 to C:\Python27_old.  (Renaming is just in case you have some files you want to copy over to the fixed Python -- experts only!)

6. Repair Install ArcGIS

7. Repair Install ArcGIS 64-bit Background Processing

8. Repair install other ArcGIS software (Interoperability Extenson, Production Mapping, etc.)

9. If you want, download and install PythonWin matching your version of Python. (It's probably best to install x32 PythonWin unless you have a special need for x64.)

10. Reset your ArcGIS application profile
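
For step 2, a minimal diagnostic sketch, run from any Python prompt:

# confirm which Python is active and what the environment variables hold
import os
import sys
print(sys.version)                                # version and bitness
print(sys.prefix)                                 # e.g. C:\Python27\ArcGIS10.2
print(os.environ.get("PYTHONPATH", "<not set>"))  # should be blank / not set
for p in os.environ["PATH"].split(os.pathsep):
    if "python" in p.lower():                     # any stray Python entries?
        print(p)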

Completely uninstalling and reinstalling ArcGIS is major Windows surgery and should be a last resort.

DanPatterson_Retired
MVP Emeritus

Introduction

This blog post is for the two other people in the world that might be remotely interested in rasters/arrays and dimensionality as it can be applied to reshaping for subsequent analysis.  The topic was introduced in my previous blog about block statistics.  I am gently warming the readers (both of you) to moving/sliding window functions and how they work on a mechanical level.

The code below is for the record.  It is one incarnation of what is out there on the web.  Most of the comments and posts deal with the 'speed' and efficacy of the various approaches.  In my opinion, those discussions don't matter... I have only needed primes, combinations and permutations once outside of an academic environment.  I will rely on this code should the need arise again.  So here it is, with an output example.

Nerd Note:

As of this date, the world's largest known prime is 2**74207281 - 1, a number of 22,338,618 digits called M74207281,

which is the 49th Mersenne prime discovered.     http://www.mersenne.org/primes/?press=M74207281

In the next post, I will use this foundation to provide more background on working with block and moving window operations.

"""
Script:  primes_combos_demo.py
Author:  
Dan.Patterson@carleton.ca

Purpose: Long forgotten, but it is done.
"""
import numpy as np
import itertools

def primes(n):
    '''Returns an array of primes, p < n
    http://stackoverflow.com/questions/2068372/fastest-way-to-list-
    all-primes-below-n-in-python/3035188#3035188
    '''
    assert n >= 2
    sieve = np.ones(n // 2, dtype=bool)   # sieve over the odd numbers
    for i in range(3, int(n ** 0.5) + 1, 2):
        if sieve[i // 2]:
            sieve[i * i // 2::i] = False
    p_num = np.r_[2, 2 * np.nonzero(sieve)[0][1::] + 1]
    return p_num


def divisor(n, prn=False):
    """Determine the divisors of n from its prime divisors, their
       cumulative products and the pairwise products of all of these.
    """
    p_vals = primes(n)
    divs = [p for p in p_vals if (n % p == 0)]
    cump = np.cumprod(divs).tolist()
    divs = divs + cump
    prods = list(set(i * j for i in divs for j in divs))
    all_div = list(set(divs + prods))
    all_div.sort()
    if prn:
        print("all", all_div)
    return all_div

def perm_sums(n, max_dim=3):
    """Determine the arrangements of divisors whose product is n,
       for 1 up to max_dim dimensions.  A verbose workflow follows...
    """
    perms = divisor(n)
    p = []
    for i in range(1, max_dim + 1):
        p.append(list(itertools.product(perms, repeat=i)))
    combos = []
    for i in range(len(p)):
        vals = [v for v in p[i] if np.prod(v) == n]
        combos.extend(vals)
    return combos

def demo(n=36, prn=True):
    """ demo perm_sums"""
    p = primes(n)
    print("\nPrimes:...{}".format(p))
    all_div = divisor(n)
    print("Number:...{}\nAll:...{}".format(n, all_div))
    ndim = 4
    combos = perm_sums(n, max_dim=ndim)
    if prn:
        print("\nProducts for... {} - max_dim = {}".format(n, ndim))
        for i in combos:
            print(i)
    return combos  

if __name__=="__main__":
    """primes demo"""
    n = 36
    combos = demo(n, prn=True)

‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

For the example in my last post, a 6x6 array holding 36 unique numbers, these list the possible ways that the data can be reconfigured.

Primes:...[ 2  3  5  7 11 13 17 19 23 29 31]
Number:...36
All:...[2, 3, 4, 6, 9, 12, 18, 36]

Products for... 36 - max_dim = 4

1D: (36,)

2D: (2, 18), (3, 12), (4, 9), (6, 6), (9, 4), (12, 3), (18, 2)

3D: (2, 2, 9), (2, 3, 6), (2, 6, 3), (2, 9, 2), (3, 2, 6), (3, 3, 4),
    (3, 4, 3), (3, 6, 2), (4, 3, 3), (6, 2, 3), (6, 3, 2), (9, 2, 2)

4D: (2, 2, 3, 3), (2, 3, 2, 3), (2, 3, 3, 2), (3, 2, 2, 3), (3, 2, 3, 2), (3, 3, 2, 2)

If you specify a minimum window size, you can filter these down to the feasible combinations.

Minimum window...[3, 3]

[(3, 12), (4, 9), (6, 6), (9, 4), (12, 3), (2, 3, 6), (2, 6, 3), (3, 3, 4), (3, 4, 3), (4, 3, 3), (2, 2, 3, 3)]
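
To make the connection back to rasters concrete, a small sketch (the 6x6 array is the one from the block statistics post):

import numpy as np

a = np.arange(36)               # 36 unique numbers
print(a.reshape(6, 6))          # 2D: the (6, 6) option
print(a.reshape(2, 2, 9))       # 3D: one of the triples
print(a.reshape(2, 2, 3, 3))    # 4D: 3x3 blocks, as used for block stats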

That's all for now.  The cases of being able to reshape a raster in 1, 2, 3 or 4 dimensions are given above.

DanPatterson_Retired
MVP Emeritus

Block functions in rasters ...

Modified: Jan 17, 2015


To begin, we will construct a simple array consisting of the 36 numbers from 0 to 35.

The array is broken down into 3x3 blocks of numbers as can be seen in the output array sample.

Block functions are not used as often as they could be in GIS.  This type of function moves or "jumps" by steps in the X and Y direction specified by the block size.

Moving or sliding window functions are more common in GIS.  The window slides by one cell in the X and Y direction regardless of the shape, size and configuration of the window.

The block functions will be demonstrated in this post since they are simpler to comprehend.  The guiding principles for both types of functions are the same, differing only in their movement characteristics.

The input raster to the right is represented in array format so that one can see the value that each "cell" holds.  Each cell has equal dimensions, varying only in the value it possesses.  When using a block function, the raster is tiled into sub-arrays/rasters using a "window".  In this example, a simple 3x3 window is used.  Since the raster is broken down by this window in a jumping fashion, we end up ... nicely ... with 4 sub-rasters, each 3x3 in size.  Each sub-raster/array consists of 9 values.  It should be noted that they could contain nodata values, which should not be used in calculations.  Here, all sub-rasters contain valid integer values.

The values for the statistics shown represent the values for each array sub-block.  For instance, the minimum has values of 0, 3, 18, and 21.  If you examine the original array, or the sample sub-rasters/arrays, you will see these values are indeed correct.  The difference is that each number now represents the 9 cells making up its 3x3 block.

The statistical parameters were derived using numpy functions and are shown below without explanation (for now).  Suffice it to say that the input array occupies dimensions 0 and 1 and each sample array occupies dimensions 2 and 3; ergo, the statistics are calculated on the last 2 dimensions so that they represent each block.

The formulae for the code-ophiles are as follows:

    a_min  = a.min(axis=(2, 3))
    a_max  = a.max(axis=(2, 3))
    a_mean = a.mean(axis=(2, 3))
    a_sum  = a.sum(axis=(2, 3))
    a_std  = a.std(axis=(2, 3))
    a_var  = a.var(axis=(2, 3))
    a_ptp  = a_max - a_min

There are nodata-aware equivalents of the above for rasters/arrays that contain nodata values (i.e. np.nanmin, np.nanmax, np.nanmean, np.nansum, np.nanstd, np.nanvar).

Now, if we take the results for the minimum, we can expand them back to the original form by expanding in both the X and Y directions.

    expanded = arr_expand(a_min,3,3)
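
For those following along, a minimal sketch of the blocking itself (arr_expand is not shown in this post, so np.repeat stands in for it here):

import numpy as np

a = np.arange(36).reshape(6, 6)           # the input raster
n = 3                                     # block size
# tile into (2, 2, 3, 3): block row, block column, then each 3x3 block
blocks = a.reshape(6 // n, n, 6 // n, n).swapaxes(1, 2)
a_min = blocks.min(axis=(2, 3))           # [[ 0  3] [18 21]]
# expand back to 6x6, each value now standing in for its 3x3 block
expanded = np.repeat(np.repeat(a_min, n, axis=0), n, axis=1)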

There are the minor omissions of dealing with nodata values and input arrays that aren't evenly divisible by the window size... but those are minor details and will be covered elsewhere.

The observer should also note that the array can be exported to a comma, space or tab delimited file and given an ASCII format header to give it real-world coordinates for use/import into ArcMap.  The whole process can also be accompanied by RasterToNumPyArray for export to NumPy, and NumPyArrayToRaster if desired.  The sub-array types can also be unraveled and presented in tabular format, concatenated and brought back into ArcMap for use in graphing or in processes involving other arrays.

Input array...

[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]
 [24 25 26 27 28 29]
 [30 31 32 33 34 35]]

Output array sample...

[[[[ 0  1  2]
   [ 6  7  8]
   [12 13 14]]

  [[ 3  4  5]
   [ 9 10 11]
   [15 16 17]]]

 [[[18 19 20]
   [24 25 26]
   [30 31 32]]

  [[21 22 23]
   [27 28 29]
   [33 34 35]]]]

Results...............

Minimum...a_min
[[ 0  3]
 [18 21]]

Maximum...a_max
[[14 17]
 [32 35]]

Mean...a_mean
[[  7.  10.]
 [ 25.  28.]]

Sum...a_sum
[[ 63  90]
 [225 252]]

Std...a_std
[[ 4.97  4.97]
 [ 4.97  4.97]]

Var...a_var
[[ 24.67  24.67]
 [ 24.67  24.67]]

Range...a_ptp
[[14 14]
 [14 14]]

expanded array...expanded
[[ 0  0  0  3  3  3]
 [ 0  0  0  3  3  3]
 [ 0  0  0  3  3  3]
 [18 18 18 21 21 21]
 [18 18 18 21 21 21]
 [18 18 18 21 21 21]]

Summary: more examples will be given in the upcoming weeks...
