Block functions in rasters ...
Modified Jan 17, 2015
The background...
To begin, we will construct a simple array consisting of 36 numbers from 0 to 35, arranged in 6 rows and 6 columns. The array is broken down into 3x3 blocks of numbers, as can be seen in the output array sample.
Block functions are not used as often as they could be in GIS. This type of function moves, or "jumps", in steps in the X and Y directions specified by the block size.
Moving or sliding window functions are more common in GIS. The window slides by one cell in the X and Y direction regardless of the shape, size and configuration of the window.
The block functions will be demonstrated in this post since they are simpler to comprehend. The guiding principles for both types of functions are the same, differing only in their movement characteristics.
The input raster shown below is represented in array format so that one can see the value that each "cell" holds. Each cell is of equal dimensions, varying only in the value that it possesses. When using a block function, the raster is tiled into sub-arrays/rasters using a "window". In this example, a simple 3x3 window will be used. Since the raster is broken down by this window in a jumping fashion, we end up, nicely, with 4 sub-rasters, each 3x3 in size. Each sub-raster/array consists of 9 values. It should be noted that they could contain nodata values, which should not be used in calculations. Currently, all sub-rasters contain valid integer values.
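The tiling described above can be sketched with a reshape and an axis swap: the 6x6 array becomes a 2x2 grid of 3x3 blocks, which is the 4D shape shown in the output array sample. The variable names here are illustrative, not from the original post.

```python
import numpy as np

a = np.arange(36).reshape(6, 6)   # the 6x6 input array, values 0..35
n = 3                             # window/block size

# Tile the array into non-overlapping n x n blocks: the first two
# dimensions index the block, the last two index cells within a block.
blocks = a.reshape(6 // n, n, 6 // n, n).swapaxes(1, 2)
print(blocks.shape)  # → (2, 2, 3, 3)
```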
The values for the statistics shown represent the values for each array sub-block. For instance, the minimum has values of 0, 3, 18, and 21. If you examine the original array, or the sample sub-rasters/arrays, you will see that these values are indeed correct. The difference is that each number now represents the 9 cells making up its 3x3 block.
The statistical parameters were derived using NumPy functions and are shown below without explanation (for now). Suffice it to say that the blocks are arranged along dimensions 0 and 1 and each sample array occupies dimensions 2 and 3; ergo, the statistics are calculated over the last two dimensions so that one value represents each block.
The formulae for the code-ophiles are as follows:
a_min  = a.min(axis=(2, 3))
a_max  = a.max(axis=(2, 3))
a_mean = a.mean(axis=(2, 3))
a_sum  = a.sum(axis=(2, 3))
a_std  = a.std(axis=(2, 3))
a_var  = a.var(axis=(2, 3))
a_ptp  = a_max - a_min
There are nodata-aware equivalents of the above for rasters/arrays that contain nodata values (i.e. np.nanmin, np.nanmax, np.nanmean, np.nansum, np.nanstd, np.nanvar).
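As a small sketch of the nodata case, assuming nodata cells have been converted to np.nan in a float copy of the raster:

```python
import numpy as np

a = np.arange(36, dtype=float).reshape(6, 6)
a[0, 0] = np.nan  # pretend the first cell is nodata

blocks = a.reshape(2, 3, 2, 3).swapaxes(1, 2)

# nan-aware minimum per block: the nodata cell is ignored, so the
# upper-left block's minimum becomes 1.0 instead of 0.0.
b_min = np.nanmin(blocks, axis=(2, 3))
```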
Now, if we take the results for the minimum, we can expand them back to the original form by repeating the values in both the X and Y directions.
expanded = arr_expand(a_min,3,3)
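The arr_expand function isn't defined in the post; a minimal stand-in using np.repeat, which repeats each value nr times down the rows and nc times across the columns, could look like:

```python
import numpy as np

def arr_expand(a, nr, nc):
    """Hypothetical stand-in for the post's arr_expand: grow each
    cell of `a` into an nr x nc block of identical values."""
    return np.repeat(np.repeat(a, nr, axis=0), nc, axis=1)

a_min = np.array([[0, 3], [18, 21]])
expanded = arr_expand(a_min, 3, 3)  # back to a 6x6 array
```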
There are minor omissions: dealing with nodata values, and input arrays that aren't evenly divisible by the window size... but they are minor details and will be covered elsewhere.
The observer should also note that the array can be exported to a comma, space or tab delimited file and given an ASCII format header to give it real-world coordinates for use/import into ArcMap. The whole process can also be accompanied by the use of RasterToNumPyArray for export to NumPy and NumPyArrayToRaster if desired. The sub-array types can also be unraveled and presented in tabular format, concatenated and brought back into ArcMap for use in graphing or in processes involving other arrays.
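For the ASCII export mentioned above, one sketch uses np.savetxt with an ESRI ASCII grid header; the corner coordinates and cellsize below are made-up placeholders, not values from the post.

```python
import numpy as np

a_min = np.array([[0, 3], [18, 21]])

# ESRI ASCII grid header; xllcorner/yllcorner/cellsize are placeholders.
header = ("ncols 2\n"
          "nrows 2\n"
          "xllcorner 0.0\n"
          "yllcorner 0.0\n"
          "cellsize 3.0\n"
          "NODATA_value -9999")

# comments="" keeps savetxt from prefixing each header line with "# "
np.savetxt("a_min.asc", a_min, fmt="%d", header=header, comments="")
```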
Input array...
[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]
 [24 25 26 27 28 29]
 [30 31 32 33 34 35]]
Output array sample...
[[[[ 0  1  2]
   [ 6  7  8]
   [12 13 14]]

  [[ 3  4  5]
   [ 9 10 11]
   [15 16 17]]]


 [[[18 19 20]
   [24 25 26]
   [30 31 32]]

  [[21 22 23]
   [27 28 29]
   [33 34 35]]]]
Results...
Minimum...a_min
[[ 0  3]
 [18 21]]
Maximum...a_max
[[14 17]
 [32 35]]
Mean...a_mean
[[  7.  10.]
 [ 25.  28.]]
Sum...a_sum
[[ 63  90]
 [225 252]]
Std...a_std
[[ 4.97  4.97]
 [ 4.97  4.97]]
Var...a_var
[[ 24.67  24.67]
 [ 24.67  24.67]]
Range...a_ptp
[[14 14]
 [14 14]]
Expanded array...expanded
[[ 0  0  0  3  3  3]
 [ 0  0  0  3  3  3]
 [ 0  0  0  3  3  3]
 [18 18 18 21 21 21]
 [18 18 18 21 21 21]
 [18 18 18 21 21 21]]
Summary: ...more examples will be given in the upcoming weeks...
Neat. Thanks for sharing these.
The main use I have had for block functions is aggregation, before the Aggregate() tool was added to Grid (Arc 6.1 or something). Aggregate is of course more efficient (less data to write out!)
Request for a future article: It would be cool to see an implementation of Aggregate in numpy.
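Not a full article, but a rough sketch of what an Aggregate-style mean resampling might look like in NumPy, assuming the raster divides evenly by the cell factor (names here are illustrative):

```python
import numpy as np

def aggregate_mean(a, factor):
    """Sketch of Aggregate-style mean resampling: each factor x factor
    block of cells collapses to one cell holding the block mean
    (assumes the array divides evenly by `factor`)."""
    r, c = a.shape
    blocks = a.reshape(r // factor, factor, c // factor, factor).swapaxes(1, 2)
    return blocks.mean(axis=(2, 3))
```

Applied to the 6x6 array above with a factor of 3, this reproduces the a_mean block result.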