Dan Patterson's Py... blog > 2018 > February

The original Reclassify Rasters Simply... 

 

Interesting if you have a function that would be a pain to implement in Spatial Analyst, since it would take a number of steps.  A ufunc can do it in one pass... an example follows.  Multiple output formats are possible.

 

 

ufunc .... user-defined Python functions with numpy... np.frompyfunc

a = np.arange(1, 101).reshape(10, 10)  # ---- make a raster

a
array([[  1,   2,   3,   4,   5,   6,   7,   8,   9,  10],
       [ 11,  12,  13,  14,  15,  16,  17,  18,  19,  20],
       [ 21,  22,  23,  24,  25,  26,  27,  28,  29,  30],
       [ 31,  32,  33,  34,  35,  36,  37,  38,  39,  40],
       [ 41,  42,  43,  44,  45,  46,  47,  48,  49,  50],
       [ 51,  52,  53,  54,  55,  56,  57,  58,  59,  60],
       [ 61,  62,  63,  64,  65,  66,  67,  68,  69,  70],
       [ 71,  72,  73,  74,  75,  76,  77,  78,  79,  80],
       [ 81,  82,  83,  84,  85,  86,  87,  88,  89,  90],
       [ 91,  92,  93,  94,  95,  96,  97,  98,  99, 100]])

# ---- a weird request to reclass the data, expressed as a func ----

def reclass_2(a):
    '''reclassification using ufunc user functions'''
    if a > 80 and a != 91:             # ---- the first reclass option
        a = 8
    elif a > 60 and a <= 80:           # ---- another simple one
        a = 7
    elif (a <= 60) and (a % 2 == 0):   # ---- now this one is a kicker
        a = 2
    else:                              # ---- I could continue, but I won't
        a = 0
    return a


# ---- Now... make the ufunc using numpy's frompyfunc
# frompyfunc(name of the def, number of input parameters in, results out)

r = np.frompyfunc(reclass_2, 1, 1)     # ---- function made

r                                      # ---- not much info
<ufunc '? (vectorized)'>

# ---- Make it do some work by passing the array

val = r(a)                             # ---- the array, 1 input, 1 output

array([[0, 2, 0, 2, 0, 2, 0, 2, 0, 2],
       [0, 2, 0, 2, 0, 2, 0, 2, 0, 2],
       [0, 2, 0, 2, 0, 2, 0, 2, 0, 2],
       [0, 2, 0, 2, 0, 2, 0, 2, 0, 2],
       [0, 2, 0, 2, 0, 2, 0, 2, 0, 2],
       [0, 2, 0, 2, 0, 2, 0, 2, 0, 2],
       [7, 7, 7, 7, 7, 7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7, 7, 7, 7, 7, 7],
       [8, 8, 8, 8, 8, 8, 8, 8, 8, 8],
       [0, 8, 8, 8, 8, 8, 8, 8, 8, 8]], dtype=object)

# ---- return as integer
val.astype(np.int32)
array([[0, 2, 0, 2, 0, 2, 0, 2, 0, 2],
       [0, 2, 0, 2, 0, 2, 0, 2, 0, 2],
       [0, 2, 0, 2, 0, 2, 0, 2, 0, 2],
       [0, 2, 0, 2, 0, 2, 0, 2, 0, 2],
       [0, 2, 0, 2, 0, 2, 0, 2, 0, 2],
       [0, 2, 0, 2, 0, 2, 0, 2, 0, 2],
       [7, 7, 7, 7, 7, 7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7, 7, 7, 7, 7, 7],
       [8, 8, 8, 8, 8, 8, 8, 8, 8, 8],
       [0, 8, 8, 8, 8, 8, 8, 8, 8, 8]])

# ---- how about float?
val.astype(np.float32)

array([[ 0.,  2.,  0.,  2.,  0.,  2.,  0.,  2.,  0.,  2.],
       [ 0.,  2.,  0.,  2.,  0.,  2.,  0.,  2.,  0.,  2.],
       [ 0.,  2.,  0.,  2.,  0.,  2.,  0.,  2.,  0.,  2.],
       [ 0.,  2.,  0.,  2.,  0.,  2.,  0.,  2.,  0.,  2.],
       [ 0.,  2.,  0.,  2.,  0.,  2.,  0.,  2.,  0.,  2.],
       [ 0.,  2.,  0.,  2.,  0.,  2.,  0.,  2.,  0.,  2.],
       [ 7.,  7.,  7.,  7.,  7.,  7.,  7.,  7.,  7.,  7.],
       [ 7.,  7.,  7.,  7.,  7.,  7.,  7.,  7.,  7.,  7.],
       [ 8.,  8.,  8.,  8.,  8.,  8.,  8.,  8.,  8.,  8.],
       [ 0.,  8.,  8.,  8.,  8.,  8.,  8.,  8.,  8.,  8.]], dtype=float32)

# ---- be the first on your block to have a string raster

val.astype(str)
array([['0', '2', '0', '2', '0', '2', '0', '2', '0', '2'],
       ['0', '2', '0', '2', '0', '2', '0', '2', '0', '2'],
       ['0', '2', '0', '2', '0', '2', '0', '2', '0', '2'],
       ['0', '2', '0', '2', '0', '2', '0', '2', '0', '2'],
       ['0', '2', '0', '2', '0', '2', '0', '2', '0', '2'],
       ['0', '2', '0', '2', '0', '2', '0', '2', '0', '2'],
       ['7', '7', '7', '7', '7', '7', '7', '7', '7', '7'],
       ['7', '7', '7', '7', '7', '7', '7', '7', '7', '7'],
       ['8', '8', '8', '8', '8', '8', '8', '8', '8', '8'],
       ['0', '8', '8', '8', '8', '8', '8', '8', '8', '8']],
      dtype='<U1')
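For larger rasters, the same reclass can be done fully vectorized with np.select, which skips the object dtype entirely.  A sketch (not from the original post, but it reproduces the same classes):

```python
import numpy as np

a = np.arange(1, 101).reshape(10, 10)       # ---- the same demo raster

# ---- conditions are checked in order; first match wins, default fills the rest
conds = [(a > 80) & (a != 91),              # ---- the first reclass option
         (a > 60) & (a <= 80),
         (a <= 60) & (a % 2 == 0)]
vals = np.select(conds, [8, 7, 2], default=0)   # ---- integer dtype, no .astype needed
```

The result matches the frompyfunc version, but comes back as integers directly.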

 

If you have any interesting examples that you try, pass them on

It isn't quite false advertising. 

I will attribute it to an omission based on the needs of the many rather than on the needs of the few.

 

The story is in code...

# ---- start with an array ------

# ---- any data, floats for a tiny place over 5 days -----

array([[[  0.,   1.,   2.,   3.],
        [  4.,   5.,   6.,   7.],
        [  8.,   9.,  10.,  11.]],

       [[ 12.,  13.,  14.,  15.],
        [ 16.,  17.,  18.,  19.],
        [ 20.,  21.,  22.,  23.]],

       [[ 24.,  25.,  26.,  27.],
        [ 28.,  29.,  30.,  31.],
        [ 32.,  33.,  34.,  35.]],

       [[ 36.,  37.,  38.,  39.],
        [ 40.,  41.,  42.,  43.],
        [ 44.,  45.,  46.,  47.]],

       [[ 48.,  49.,  50.,  51.],
        [ 52.,  53.,  54.,  55.],
        [ 56.,  57.,  58.,  59.]]])

# ---- save to a raster *.tif ----------------

r = arcpy.NumPyArrayToRaster(a)
r.save("c:/temp/np2rast.tif")

# ---- load it back in to check your work ----

b = arcpy.RasterToNumPyArray("c:/temp/np2rast.tif")

# ---- seems to only read the first 'day' ---
array([[  0.,   1.,   2.,   3.],
       [  4.,   5.,   6.,   7.],
       [  8.,   9.,  10.,  11.]])

# ---- well that isn't good, hmmmm... save each 'day' out as its own raster instead ----
rasters = []
for i in range(a.shape[0]):
    ai = a[i]
    rast = arcpy.NumPyArrayToRaster(ai)
    r_name = "in_memory/a{:0>3}.tif".format(i)
    rasters.append(r_name)
    rast.save(r_name)
rasters = ";".join(rasters)

rasters
'in_memory/a000.tif;in_memory/a001.tif;in_memory/a002.tif;in_memory/a003.tif;in_memory/a004.tif'

fname = "c:/temp/np2rast_comp.tif"
arcpy.management.CompositeBands(rasters, fname)  # ---- let's give Arc* another chance
c = arcpy.RasterToNumPyArray(fname)

# ---- everything is back as it should be -----------

array([[[  0.,   1.,   2.,   3.],
        [  4.,   5.,   6.,   7.],
        [  8.,   9.,  10.,  11.]],

       [[ 12.,  13.,  14.,  15.],
        [ 16.,  17.,  18.,  19.],
        [ 20.,  21.,  22.,  23.]],

       [[ 24.,  25.,  26.,  27.],
        [ 28.,  29.,  30.,  31.],
        [ 32.,  33.,  34.,  35.]],

       [[ 36.,  37.,  38.,  39.],
        [ 40.,  41.,  42.,  43.],
        [ 44.,  45.,  46.,  47.]],

       [[ 48.,  49.,  50.,  51.],
        [ 52.,  53.,  54.,  55.],
        [ 56.,  57.,  58.,  59.]]])

 

Now, if you have followed so far, I have made reference to working with numpy arrays to do a lot of work.  Rarely do I need to bring a 'raster' (i.e. a *.tif) back into ArcGIS PRO or ArcMap.  This time I did, and discovered that arcpy.RasterToNumPyArray seems to only bring back a limited amount of data, probably with the expectation that the user wants to map floating-point values (i.e. 1 band) or integer values (i.e. 3 bands for a 'picture'/'map').

 

This work-around uses only built-in tools.

 

You can of course store anything numeric in a tiff.  How about some statistics for our array 'a'.

 

mins = np.min(a, axis=0)
maxs = np.max(a, axis=0)
avg = np.mean(a, axis=0)
rang = np.ptp(a, axis=0)
medn = np.median(a, axis=0)

stats = [mins, maxs, avg, rang, medn]
rasters = []
for i, ai in enumerate(stats):
    rast = arcpy.NumPyArrayToRaster(ai)
    r_name = "in_memory/a{:0>3}.tif".format(i)
    rasters.append(r_name)
    rast.save(r_name)
rasters = ";".join(rasters)

arcpy.management.CompositeBands(rasters, "c:/temp/stats.tif")
<Result 'c:\\Temp\\stats.tif'>

s = arcpy.RasterToNumPyArray("c:/temp/stats.tif")

# ---- in order, min, max, avg, range, median ----
array([[[  0.,   1.,   2.,   3.],
        [  4.,   5.,   6.,   7.],
        [  8.,   9.,  10.,  11.]],

       [[ 48.,  49.,  50.,  51.],
        [ 52.,  53.,  54.,  55.],
        [ 56.,  57.,  58.,  59.]],

       [[ 24.,  25.,  26.,  27.],
        [ 28.,  29.,  30.,  31.],
        [ 32.,  33.,  34.,  35.]],

       [[ 48.,  48.,  48.,  48.],
        [ 48.,  48.,  48.,  48.],
        [ 48.,  48.,  48.,  48.]],

       [[ 24.,  25.,  26.,  27.],
        [ 28.,  29.,  30.,  31.],
        [ 32.,  33.,  34.,  35.]]])

# ---- It won't make for a nice map, but your data is there

Next up: explorations of the limitations of working with *.tif files for multidimensional raster/array data.

There are a number of options within the Esri suite of software, some of which rely in part on open-source offerings.  Some of these (i.e. libtiff and tifffile) have also made it into other open-source software (i.e. GDAL, scipy.ndimage, skimage, etc.), kind of like open source built on open source.

 

Enough for now.

Table Tools...

 

 

This is a somewhat unorganized pictorial supplement to the tools contained there.

New additions

  • Added nested means classification for classification of numeric data (2019-05-02)
  • Added text file output to Crosstabulate and moved it to the Analysis menu (2019-02-23)
  • Table to csv and table to text (2018-12-29)
  • Statistics (2018-08-28)
    • Running statistics added to the Statistics toolset (ie running means etc)
  • Field calculator  (2018-08-28)
    • Poly* field calculations :  smallest/largest angle, min, max, avg length/perimeter
  • Field calculator
    • Calculations on numeric fields, including a load of 'difference from ...' type options, z-scores for fields, etc.
    • Calculations for text fields... currently the ability to identify text data in a field and provide a sequence number for it, i.e. A A B A B A becomes A_00 A_01 B_00 A_02 B_01 A_03
    • Calculations of geometry properties for polygon/polyline featureclasses, including max_length, min_length, max_angle, etc.
  • Find sequences under Analysis
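The text-field sequence numbering above can be sketched in a few lines of plain Python (the function name is mine, not the tool's):

```python
from collections import defaultdict

def sequence_label(values):
    """Label repeated values with a running, per-value counter."""
    counts = defaultdict(int)
    out = []
    for v in values:
        out.append("{}_{:02d}".format(v, counts[v]))
        counts[v] += 1
    return out

sequence_label(["A", "A", "B", "A", "B", "A"])
# -> ['A_00', 'A_01', 'B_00', 'A_02', 'B_01', 'A_03']
```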

 

General Toolbox Layout

These images show the contents so far

 

 

 

:----------------------------------------------------------------------------------------------------------------------

Analysis Toolset

Frequency

Longed for the functionality of this tool with a Standard license?  Here you go.  

You can get the frequency counts by field pairing and a 'sum' on one or more fields.

 

With a sample output from the inputs.

 

 

Cross-Tabulation

The cross-tabulation tool does just that: take two fields and get a cross-tabulation of their values.  I will add multidimensional cross-tabulation at some point when/if a demand surfaces.
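Under the hood, a two-field cross-tabulation amounts to counting pairings, which numpy can sketch with np.unique and np.add.at.  This is a hypothetical stand-in, not the tool's actual code:

```python
import numpy as np

def crosstab(rows, cols):
    """Count occurrences of each (row value, column value) pairing."""
    r_u, r_idx = np.unique(rows, return_inverse=True)
    c_u, c_idx = np.unique(cols, return_inverse=True)
    tab = np.zeros((r_u.size, c_u.size), dtype=np.int64)
    np.add.at(tab, (r_idx, c_idx), 1)   # ---- unbuffered add handles repeated pairs
    return r_u, c_u, tab

r_u, c_u, tab = crosstab(np.array(["a", "a", "b", "b", "a"]),
                         np.array(["x", "y", "x", "x", "x"]))
# tab -> [[2, 1], [2, 0]]  (rows a, b; columns x, y)
```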

 

:----------------------------------------------------------------------------------------------------------------------

Array_tools Toolset

 

A handy conversion set, enabling the user to work with numpy arrays.

 

This is pretty self-explanatory.  The reason that there isn't an 'All' option is that it takes fewer lines of code to let you pick the fields than to check whether you want all fields or just selected fields exported to an *.npy file.

 

You can always pull out the line of code if you want everything exported; otherwise, select the fields that you want exported, in the order that you want them.

 

When the Shape field is exported, the centroid (regardless of geometry type) is used so that you can retrieve the X and Y coordinates when and if you import the data.

 

To reverse the process, simply use the Numpy Array to Table tool.

The basic setup and the resultant are shown below.
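Behind the scenes, it is just a numpy structured array going to and from an *.npy file.  A minimal sketch with made-up field names (the tool's actual field handling may differ):

```python
import numpy as np
import os
import tempfile

# ---- hypothetical table rows: centroid X, Y plus one attribute field
dt = np.dtype([("X", "<f8"), ("Y", "<f8"), ("Class", "<U5")])
tbl = np.array([(304.2, 4506.1, "a"), (310.8, 4512.9, "b")], dtype=dt)

f = os.path.join(tempfile.gettempdir(), "tbl.npy")
np.save(f, tbl)               # ---- the Table to NumPy Array direction
back = np.load(f)             # ---- the NumPy Array to Table direction
# back["X"] -> array([304.2, 310.8])
```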


:----------------------------------------------------------------------------------------------------------------------

Field Calculator Toolset

 

Some of the optional tools in the field calculator toolset.

 

:----------------------------------------------------------------------------------------------------------------------

Statistics Toolset

 

Column Statistics

The tool allows you to specify the fields to summarize.  Only fields of 'Double' type are supported, since you shouldn't be calculating statistics for integer fields... do the conversion if you insist.

 

Here is a summary sample for 3 fields.  Null values are accounted for in the calculations, so there is no need to exclude them prior to doing the calculations.

As you can see, the traditional statistics are calculated, and they include Skewness and Kurtosis, which the current Summary Statistics tool in PRO doesn't.  I will add the Case option when I feel the output looks nice.

 

 

I even got fancy and included an alternative for displaying the data in the 'View Details' window.  As it says... simply select all, then copy and paste into your desired text editor.

 

 

 

Rank by Fields

 

Just does that... consider the following information in a 10K-record table on mass and size.  If you need a quick ranking, give it a whirl.  You can not only get the biggest/smallest X, you can construct an ogive and derive other statistics.
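A quick ranking like that can be sketched with a double argsort in numpy; ties are broken by order of appearance here, and the actual tool may rank differently:

```python
import numpy as np

mass = np.array([5.0, 1.0, 3.0, 3.0, 9.0])

# ---- argsort of the argsort gives 0-based ranks in ascending order
ranks = np.argsort(np.argsort(mass, kind="stable"), kind="stable")
# ranks -> [3, 0, 1, 2, 4]
```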

 

Here is a sample run and output.


:----------------------------------------------------------------------------------------------------------------------

This comes up time and again: I have 'X' rasters for an area and I need to determine Y over time.

Here is an example... calculating percentiles.

 

Cell Statistics... is the obvious choice.  But you don't have the Spatial Analyst extension?  It doesn't matter.  All the tools that you need are already provided if you are using a current version of ArcMap or ArcGIS PRO.

 

Do some work

 

(1)  Reading the raster files

import numpy as np
import arcpy

arrs = []
folder = r"C:\Data\rasters"  # ---- specify a folder containing all the rasters ----
arcpy.env.workspace = folder
rasters = arcpy.ListRasters("*", "TIF")
for raster in rasters:
    arrs.append(arcpy.RasterToNumPyArray(raster))

a = np.array(arrs) # ---- the master array ----

 

Once the rasters are converted to arrays, they are appended to a list.  The master array is created in the last line.

As an example, 31 rasters in tif format were read in.  These are small rasters, but they can be as large as you can process.  You are now ready to calculate values for them using numpy.
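The stacking itself is plain numpy: np.stack (or np.array) on a list of equally shaped 2D arrays gives the days-by-rows-by-columns master array.  A small stand-in with fake rasters:

```python
import numpy as np

# ---- 31 fake 'rasters', each 3 x 4 cells, filled with their day number
arrs = [np.full((3, 4), i, dtype=float) for i in range(31)]

a = np.stack(arrs)        # ---- same result as np.array(arrs) here
# a.shape -> (31, 3, 4); a[5] is 'day' 5
```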

 

(2)  Getting the statistics

median = np.median(a, axis=0) # ---- pick a statistic ----

array([[ 59.,  54.,  36., ...,  57.,  46.,  46.],
       [ 43.,  45.,  59., ...,  38.,  51.,  51.],
       [ 35.,  50.,  57., ...,  47.,  55.,  65.],
       ...,
       [ 43.,  52.,  40., ...,  56.,  62.,  60.],
       [ 45.,  57.,  39., ...,  45.,  44.,  48.],
       [ 49.,  57.,  50., ...,  56.,  50.,  58.]])

mins = np.min(a, axis=0)

array([[1, 3, 0, ..., 6, 4, 5],
       [1, 5, 6, ..., 6, 7, 2],
       [3, 2, 4, ..., 2, 5, 4],
       ...,
       [4, 0, 1, ..., 3, 4, 6],
       [0, 0, 0, ..., 5, 1, 0],
       [4, 1, 1, ..., 0, 3, 7]])

# ---- name a statistic, with or without nodata values, it can be done ----
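For nodata, cast to float, flag the nodata cells as np.nan, and switch to the nan-aware versions of the same functions.  A sketch:

```python
import numpy as np

a = np.arange(60, dtype=float).reshape(5, 3, 4)  # ---- 5 'days', 3 x 4 cells
a[0, 0, 0] = np.nan                              # ---- pretend one nodata cell

med = np.nanmedian(a, axis=0)        # ---- ignores the nan at that cell
pct = np.nanpercentile(a, 90, axis=0)
# med[0, 0] -> 30.0 (median of 12, 24, 36, 48); med[0, 1] -> 25.0
```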

 

Now when you are done your calculations you will probably want to make a raster so you can bask in the beauty of your efforts.

 

(3)  Reading the raster information and saving the result

r01 = rasters[0]          # ---- pick one of the rasters we loaded ----
rast = arcpy.Raster(r01)

dir(rast)                 # ---- get raster information... some snipping here ----

[..., 'bandCount', 'catalogPath', 'compressionType', 'extent', 'format', 'hasRAT',
'height', 'isInteger', 'isTemporary', 'maximum', 'mean', 'meanCellHeight',
'meanCellWidth', 'minimum', 'name', 'noDataValue', 'path', 'pixelType', 'save',
'spatialReference', 'standardDeviation', 'uncompressedSize', 'width']

# ---- Save the result out to a new raster ------

r01 = rasters[1]                    # --- rasters have the same extent and cell size
rast = arcpy.Raster(r01)

lower_left = rast.extent.lowerLeft  # ---- needed to produce output
cell_size = rast.meanCellHeight     # ---- we will use this for x and y

f = r"c:\temp\result.tif"
r = arcpy.NumPyArrayToRaster(median, lower_left, cell_size, cell_size)  # ---- save a computed result
r.save(f)

 

(4) Saving and reloading the arrays

Now, the nice thing about arrays is that you can save a 3D array (or any dimension, for that matter) to a file for reloading later.

If you have a 3D array like we have, this is kind of like a big 'zip'.
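In fact, np.savez_compressed makes the 'zip' literal: the resulting .npz file is a zip archive of .npy files, one per named array.  A sketch, using an example path of my own choosing:

```python
import numpy as np
import os
import tempfile

a = np.arange(31 * 10 * 10, dtype=float).reshape(31, 10, 10)
f = os.path.join(tempfile.gettempdir(), "myarrays.npz")

np.savez_compressed(f, a=a)     # ---- a zip archive holding 'a.npy'
back = np.load(f)["a"]          # ---- dict-like access by the name you gave
# np.all(a == back) -> True
```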

np.save(r"c:\temp\myarray.npy", a)  # ---- 'a' is the multi dimensional array

a.shape  # ---- 31 days, 100 x 100 cells
(31, 100, 100)

# ---- reload it

im_back = np.load(r"c:\temp\myarray.npy")

im_back.shape
(31, 100, 100)

# ---- check to see if it is good

np.all(a == im_back)  # ---- some numpy magic ;)
True

 

That's about all.  Not covered here is nodata cells (easy, you can assign one) and statistics for nodata (easy, every stat has a nodata equivalent).

 

Try your skills with the attached *.npy file.

The link to the ArcGIS Code Sharing site... pictures are worth more than words.

The toolset is easy to use.  There should be some expectation that your point sets can be enclosed with an ellipse or some other container.  

 

For non-degenerate ellipses, you need at least 3 points, since three non-collinear points are the minimum that Esri's geometry engine will allow you to create polylines and polygons from.

My Bounding Containers toolset covers other 'containers' to put stuff in.  Clean up your data and put it in its place.

 

So here is a pretty picture, showing some simple cases of point groupings and their generated ellipses.

 

This is not the same as the Standard Deviational Ellipse.  That one doesn't enclose the boundary... it is a stats thing.
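For contrast, the standard deviational ellipse really is a stats thing: centre on the mean, orient the axes along the eigenvectors of the covariance matrix, scale by the standard deviations.  A rough sketch on synthetic points (axis-scaling conventions vary between implementations):

```python
import numpy as np

rng = np.random.default_rng(0)
pts = rng.normal(size=(200, 2)) * np.array([4.0, 1.0])   # ---- elongated point cloud

ctr = pts.mean(axis=0)                         # ---- ellipse centre
cov = np.cov(pts - ctr, rowvar=False)
evals, evecs = np.linalg.eigh(cov)             # ---- eigenvalues in ascending order
major, minor = np.sqrt(evals[::-1])            # ---- std. deviation along each axis
angle = np.degrees(np.arctan2(evecs[1, -1], evecs[0, -1]))  # ---- major-axis angle
```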

 

 

Just an example of the details from above

Then you get this... you can see it, can't you...

Large data clouds are no problem, and it is nice to see an elliptical cloud being delineated as an ellipse.

 

Enjoy

Send me an email if you have questions.