
Py... blog


It isn't quite false advertising. 

I will attribute it to an omission based on the needs of the many rather than on the needs of the few.

 

The story is in code...

# ---- start with an array ------

# ---- any data, floats for a tiny place over 5 days -----

array([[[  0.,   1.,   2.,   3.],
        [  4.,   5.,   6.,   7.],
        [  8.,   9.,  10.,  11.]],

       [[ 12.,  13.,  14.,  15.],
        [ 16.,  17.,  18.,  19.],
        [ 20.,  21.,  22.,  23.]],

       [[ 24.,  25.,  26.,  27.],
        [ 28.,  29.,  30.,  31.],
        [ 32.,  33.,  34.,  35.]],

       [[ 36.,  37.,  38.,  39.],
        [ 40.,  41.,  42.,  43.],
        [ 44.,  45.,  46.,  47.]],

       [[ 48.,  49.,  50.,  51.],
        [ 52.,  53.,  54.,  55.],
        [ 56.,  57.,  58.,  59.]]])

# ---- save to a raster *.tif ----------------

r = arcpy.NumPyArrayToRaster(a)
r.save("c:/temp/np2rast.tif")

# ---- load it back in to check your work ----

b = arcpy.RasterToNumPyArray("c:/temp/np2rast.tif")

# ---- seems to only read the first 'day' ---
array([[  0.,   1.,   2.,   3.],
       [  4.,   5.,   6.,   7.],
       [  8.,   9.,  10.,  11.]])

# ---- well that isn't good, hmmmm  ---------------
# ---- the work-around: save each 'day' as its own raster, then composite the bands
rasters = []
for i in range(a.shape[0]):
    ai = a[i]
    rast = arcpy.NumPyArrayToRaster(ai)
    r_name = "in_memory/a{:0>3}.tif".format(i)
    rasters.append(r_name)
    rast.save(r_name)
rasters = ";".join([i for i in rasters])

rasters
'in_memory/a000.tif;in_memory/a001.tif;in_memory/a002.tif;in_memory/a003.tif;in_memory/a004.tif'

fname = "c:/temp/np2rast_comp.tif"
arcpy.management.CompositeBands(rasters, fname)  # ---- let's give Arc* another chance
c = arcpy.RasterToNumPyArray(fname)

# ---- everything is back as it should be -----------

array([[[  0.,   1.,   2.,   3.],
        [  4.,   5.,   6.,   7.],
        [  8.,   9.,  10.,  11.]],

       [[ 12.,  13.,  14.,  15.],
        [ 16.,  17.,  18.,  19.],
        [ 20.,  21.,  22.,  23.]],

       [[ 24.,  25.,  26.,  27.],
        [ 28.,  29.,  30.,  31.],
        [ 32.,  33.,  34.,  35.]],

       [[ 36.,  37.,  38.,  39.],
        [ 40.,  41.,  42.,  43.],
        [ 44.,  45.,  46.,  47.]],

       [[ 48.,  49.,  50.,  51.],
        [ 52.,  53.,  54.,  55.],
        [ 56.,  57.,  58.,  59.]]])

 

Now, if you have followed along so far, you know I have made reference to doing a lot of work with numpy arrays.  Rarely do I need to bring a 'raster' (ie *.tif) back into ArcGIS PRO or ArcMap.  This time I did, and discovered that arcpy.RasterToNumPyArray seems to only like bringing back a limited amount of data, probably with the expectation that the user wants to map floating point values (ie 1 band) or integer values (ie 3 bands for a 'picture'/'map').

 

This is a work-around, built-in.

 

You can of course store anything numeric in a tiff.  How about some statistics for our array 'a'.

 

mins = np.min(a, axis=0)
maxs = np.max(a, axis=0)
avg = np.mean(a, axis=0)
rang = np.ptp(a, axis=0)
medn = np.median(a, axis=0)

stats = [mins, maxs, avg, rang, medn]
rasters = []
for i in range(5):
    ai = stats[i]
    rast = arcpy.NumPyArrayToRaster(ai)
    r_name = "in_memory/a{:0>3}.tif".format(i)
    rasters.append(r_name)
    rast.save(r_name)
rasters = ";".join([i for i in rasters])

arcpy.management.CompositeBands(rasters, "c:/temp/stats.tif")
<Result 'c:\\Temp\\stats.tif'>

s = arcpy.RasterToNumPyArray("c:/temp/stats.tif")

# ---- in order, min, max, avg, range, median ----
array([[[  0.,   1.,   2.,   3.],
        [  4.,   5.,   6.,   7.],
        [  8.,   9.,  10.,  11.]],

       [[ 48.,  49.,  50.,  51.],
        [ 52.,  53.,  54.,  55.],
        [ 56.,  57.,  58.,  59.]],

       [[ 24.,  25.,  26.,  27.],
        [ 28.,  29.,  30.,  31.],
        [ 32.,  33.,  34.,  35.]],

       [[ 48.,  48.,  48.,  48.],
        [ 48.,  48.,  48.,  48.],
        [ 48.,  48.,  48.,  48.]],

       [[ 24.,  25.,  26.,  27.],
        [ 28.,  29.,  30.,  31.],
        [ 32.,  33.,  34.,  35.]]])

# ---- It won't make for a nice map, but your data is there

Next up, explorations of the limitations of working with *.tif files for multidimensional raster/array data.  

There are a number of options within the Esri suite of software, some of which rely in part on open source offerings.  Some of these (ie libtiff and tifffile) have also made their way into other open source software (ie GDAL, scipy.ndimage, skimage etc), kind of like open source on open source.

 

Enough for now.

Table Tools...

 

 

This is a somewhat unorganized pictorial supplement to the tools contained there.

New additions

  • Added nested means classification for classification of numeric data (2019-05-02)
  • Added text file output to Crosstabulate and moved it to the Analysis menu (2019-02-23)
  • Table to csv and table to text (2018-12-29)
  • Statistics (2018-08-28)
    • Running statistics added to the Statistics toolset (ie running means etc)
  • Field calculator  (2018-08-28)
    • Poly* field calculations :  smallest/largest angle, min, max, avg length/perimeter
  • Field calculator
    • Calculations on numeric fields, includes a load of 'difference from .... ' type options, z-scores for fields etc etc
    • Calculations for text fields..  currently the ability to identify text data in a field and provide a sequence number for it   ie   A  A B A B A   becomes  A_00  A_01  B_00  A_02  B_01  A_03
    • Calculations for geometry properties for  polygon/polyline featureclasses including max_length, min_length, max_angle etc etc.
  • Find sequences under Analysis

 

General Toolbox Layout

These images show the contents so far

 

 

 

:----------------------------------------------------------------------------------------------------------------------

Analysis Toolset

Frequency

Longed for the functionality of this tool with a Standard license?  Here you go.  

You can get the frequency counts by field pairing and a 'sum' on one or more fields.

 

With a sample output from the inputs.

 

 

Cross-Tabulation

The cross-tabulation tool does just that.  Take two fields and get a cross-tabulation of their values.  I will add multidimensional cross-tabulation at some point when/if a demand surfaces.

 

:----------------------------------------------------------------------------------------------------------------------

Array_tools Toolset

 

A handy conversion set, enabling the user to work with numpy arrays.

 

This is pretty self-explanatory.  The reason that there isn't an 'All' option is that it takes fewer lines of code to let you pick the fields than to check whether you want all fields or just a selection exported to an *.npy file.

 

You can always pull out the line of code if you want all exported, otherwise, select the fields that you want exported and in the order that you want.  

 

When the Shape field is exported, the centroid (regardless of geometry type) is used so that you can retrieve the X and Y coordinates when and if you import the data.
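
For reference, the export boils down to a couple of arcpy.da calls; a minimal sketch (the paths and field names are hypothetical, and SHAPE@XY is the token that returns the centroid):

import numpy as np
import arcpy

in_fc = r"C:\Your_spaceless_path\Your.gdb\polygons"   # hypothetical featureclass
flds = ["OID@", "SHAPE@XY", "Some_field"]             # hypothetical field selection
arr = arcpy.da.FeatureClassToNumPyArray(in_fc, flds)  # structured array
np.save(r"C:\temp\out.npy", arr)                      # save to *.npy, as the tool does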

 

To reverse the process, simply use the Numpy Array to Table tool.

The basic setup and the resultant are shown below.
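
Under the hood, the reverse is just as short; a sketch assuming the *.npy file from above:

arr = np.load(r"C:\temp\out.npy")        # the saved structured array
arcpy.da.NumPyArrayToTable(arr, r"C:\Your_spaceless_path\Your.gdb\out_table")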

:----------------------------------------------------------------------------------------------------------------------

Field Calculator Toolset

 

Some of the optional tools in the field calculator toolset.

 

:----------------------------------------------------------------------------------------------------------------------

Statistics Toolset

 

Column Statistics

The tool allows you to specify the fields to summarize. Fields of 'Double' type are supported since you shouldn't be calculating statistics for integer fields... Do the conversion if you insist.

 

Here is a summary sample for 3 fields. Null values are accounted for in the calculation, so there is no need to exclude them prior to doing calculations.  

As you can see, the traditional statistics are calculated, and they include Skewness and Kurtosis, which the current Summary Statistics tool in PRO doesn't.  I will add the Case option when I feel the output looks nice.
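
For the curious, here is a minimal numpy sketch of those two extra statistics (my own helper, not the toolbox code), using the common moment-based definitions and treating NaN as the null value:

import numpy as np

def skew_kurt(col):
    """Moment-based skewness and excess kurtosis for one 'Double' column."""
    col = np.asarray(col, dtype='float64')
    col = col[~np.isnan(col)]              # drop the nulls
    n, m, s = col.size, col.mean(), col.std()
    skew = np.sum((col - m)**3) / (n * s**3)
    kurt = np.sum((col - m)**4) / (n * s**4) - 3.0
    return skew, kurt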

 

 

I even got fancy and included an alternate for displaying the data in the 'View Details' window.  As it says... simply select all and copy and paste to your desired text editor.

 

 

 

Rank by Fields

 

Just does that... consider the following information in a 10K table on mass and size.  If you need a quick ranking, give it a whirl.  Not only can you get the biggest/smallest X, you can also construct an ogive and derive other stats.
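
The ranking itself is plain numpy; a small sketch with made-up values:

import numpy as np

mass = np.array([12.5, 3.2, 45.0, 7.7, 60.1])   # hypothetical field values
order = np.argsort(mass)                        # indices, smallest to largest
ranks = np.empty_like(order)
ranks[order] = np.arange(1, mass.size + 1)      # rank 1 = smallest
# ranks -> array([3, 1, 4, 2, 5])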

 

Here is a sample run and output.


 

 


 

:----------------------------------------------------------------------------------------------------------------------

This comes up time and again. ....I have 'X' rasters for an area and I need to determine Y over time....

Here is an example.... ....calculating percentiles....

 

Cell Statistics... is the obvious choice.  But you don't have the Spatial Analyst extension? Doesn't matter.  All the tools that you need are already provided if you are using current versions of ArcMap or ArcGIS PRO.

 

Do some work

 

(1)  Reading the raster files

import numpy as np
import arcpy

arrs = []
folder = r"C:\Data\rasters"  # ---- specify a folder containing all the rasters ----
arcpy.env.workspace = folder
rasters = arcpy.ListRasters("*", "TIF")
for raster in rasters:
    arrs.append(arcpy.RasterToNumPyArray(raster))

a = np.array(arrs) # ---- the master array ----

 

As each raster is converted to an array, it is appended to a list.  The master array is created from that list in the last line above.

As an example, 31 rasters in tif format were read in.  These are small rasters, but they can be as large as you can process.  You are now ready to calculate values for them using numpy.

 

(2)  Getting the statistics

median = np.median(a, axis=0) # ---- pick a statistic ----

array([[ 59.,  54.,  36., ...,  57.,  46.,  46.],
       [ 43.,  45.,  59., ...,  38.,  51.,  51.],
       [ 35.,  50.,  57., ...,  47.,  55.,  65.],
       ...,
       [ 43.,  52.,  40., ...,  56.,  62.,  60.],
       [ 45.,  57.,  39., ...,  45.,  44.,  48.],
       [ 49.,  57.,  50., ...,  56.,  50.,  58.]])

mins = np.min(a, axis=0)

array([[1, 3, 0, ..., 6, 4, 5],
       [1, 5, 6, ..., 6, 7, 2],
       [3, 2, 4, ..., 2, 5, 4],
       ...,
       [4, 0, 1, ..., 3, 4, 6],
       [0, 0, 0, ..., 5, 1, 0],
       [4, 1, 1, ..., 0, 3, 7]])

# ---- name a statistic, with or without nodata values, it can be done ----

 

Now when you are done with your calculations, you will probably want to make a raster so you can bask in the beauty of your efforts.

 

(3)  Reading the raster information and saving the result

r01 = rasters[0]          # ---- the first raster we loaded ----
rast = arcpy.Raster(r01)  # ---- simple

dir(rast)                 # ---- get raster information... some snipping here ----

[..., 'bandCount', 'catalogPath', 'compressionType', 'extent', 'format', 'hasRAT',
'height', 'isInteger', 'isTemporary', 'maximum', 'mean', 'meanCellHeight',
'meanCellWidth', 'minimum', 'name', 'noDataValue', 'path', 'pixelType', 'save',
'spatialReference', 'standardDeviation', 'uncompressedSize', 'width']

# ---- Save the result out to a new raster ------

r01 = rasters[1]                    # --- rasters have the same extent and cell size
rast = arcpy.Raster(r01)

lower_left = rast.extent.lowerLeft  # ---- needed to produce output
cell_size = rast.meanCellHeight     # ---- we will use this for x and y

f = r"c:\temp\result.tif"
r = arcpy.NumPyArrayToRaster(median, lower_left, cell_size, cell_size)  # ---- or whichever statistic you calculated
r.save(f)

 

(4) Saving and reloading the arrays

Now, the nice thing about arrays is that you can save a 3D array to a file for reloading later (or an array of any dimension for that matter).

If you have a 3D array like we have, this is kind of like a big 'zip'.  

np.save(r"c:\temp\myarray.npy", a)  # ---- 'a' is the multi dimensional array

a.shape  # ---- 31 days, 100 x 100 cells
(31, 100, 100)

# ---- reload it

im_back = np.load(r"c:\temp\myarray.npy")

im_back.shape
(31, 100, 100)

# ---- check to see if it is good

np.all(a == im_back)  # ---- some numpy magic ;)
True

 

That's about all.  Not covered here are nodata cells (easy, you can assign one) and statistics for nodata (easy, every stat has a nodata equivalent).
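
As a hedged example of the nodata route (the -9999 value is just an assumption):

nodata = -9999                      # assumed nodata value for the rasters
a = a.astype('float64')
a[a == nodata] = np.nan             # flag the nodata cells
median = np.nanmedian(a, axis=0)    # every stat has a nan* equivalent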

 

Try your skills with the attached *.npy file.

The link to the ArcGIS Code Sharing site... ... pictures are worth more than words. 

The toolset is easy to use.  There should be some expectation that your point sets can be enclosed with an ellipse or some other container.  

 

For non-degenerate ellipses, you need at least 3 points, since 3 non-collinear points are the minimum that esri's geometry engine requires to create polylines and polygons.

My Bounding Containers toolset covers other 'containers' to put stuff in.  Clean up your data and put it in its place.

 

So here is a pretty picture, showing some simple cases of point groupings and their generated ellipses.

 

This is not the same as the Standard Deviational Ellipse.  That one doesn't enclose the boundary... it is a stats thing.

 

 

Just an example of the details from above

Then you get this... you can see it can't you...

Large data clouds are no problem, and it is nice to see an elliptical cloud being delineated as an ellipse.

 

Enjoy

Send me emails if you have questions

Start at a point.  You are given a list of bearings and distances.  Where do you end up?

This comes up fairly often and the COGO tools allow you to create them from scratch.  

Here is how you can do it using numpy and arcpy.

 

First a picture... given a location, you are instructed to move 100m in 10 degree increments between points.  You will obviously track a circle.  The traverse point locations can be constructed from base geometry.  The locations can then be saved to a featureclass.  The point featureclass can be converted to a continuous line, which can be segmented if you desire.  Total time, not much.

The image above, shows the results of such a venture.

  • I created my geometry using python/numpy
  • I saved the results to a featureclass using NumPyArrayToFeatureClass (hence points)
  • From those points I used Points to Line, available at all license levels
  • The continuous line was blown apart to segments using Split Line at Vertices, but alas, that is only available at the Advanced level .... (I will save this obstruction for another blog, where I will show you how to do it with a Standard license.  EDIT... From the second incarnation below, you can produce a featureclass and use the XY to Line tool, which is available at all license levels)
  • Add Geometry Attributes enabled me to obtain the coordinates of the start, middle and end locations of the line.  This could have been done during the construction of the geometry as well.

 

And how many of us have taught a class where this happens....

 

More to come....

Oh yes... code... Note that the origin needs to be given in projected coordinates (ie UTM, State Plane, Web Mercator etc) since the calculations are done in planar units.  See my Vincenty implementation if you want to work with geodesic calculations.

def dist_bearing(orig=(0, 0), bearings=None, dists=None):
    """point locations given distance and bearing
    :Notes:  for sample calculations
    :-----
    : - bearings = np.arange(0, 361, 10.)
    : - dists = np.random.randint(10, 500, 36) * 1.0
    : - dists = np.ones((len(bearings),))
    :   dists.fill(100.)
    """

    rads = np.deg2rad(bearings)
    sines = np.sin(rads)
    coses = np.cos(rads)
    dx = sines * dists
    dy = coses * dists
    x_n = np.cumsum(dx) + orig[0]
    y_n = np.cumsum(dy) + orig[1]
    data = np.vstack((bearings, dists, dx, dy, x_n, y_n)).T
    frmt = "Origin (0,0)\n" + "{:>10s}"*6
    names = ["bearings", "dist", "dx", "dy", "Xn", "Yn"]
    print(frmt.format(*names))
    for i in data:
        print(("{: 10.2f}"*6).format(*i))
    return data
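
A sample call, following the notes in the docstring (the origin coordinates are arbitrary projected values, not from the post):

bearings = np.arange(0, 361, 10.)                  # ---- 10 degree increments
dists = np.full(bearings.shape, 100.)              # ---- 100 m per leg
data = dist_bearing(orig=(300000.0, 5025000.0),    # ---- hypothetical origin
                    bearings=bearings, dists=dists)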

 

Optionally create a structured array that you can produce a featureclass from using this incarnation.

def dist_bearing(orig=(0, 0), bearings=None, dists=None, prn=False):
    """point locations given distance and bearing
    """

    orig = np.array(orig)
    rads = np.deg2rad(bearings)
    dx = np.sin(rads) * dists
    dy = np.cos(rads) * dists
    x_t = np.cumsum(dx) + orig[0]
    y_t = np.cumsum(dy) + orig[1]
    xy_f = np.array(list(zip(x_t[:-1], y_t[:-1])))
    xy_f = np.vstack((orig, xy_f))
    stack = (xy_f[:, 0], xy_f[:, 1], x_t, y_t, dx, dy, dists, bearings)
    data = np.vstack(stack).T
    names = ['X_f', 'Y_f', "X_t", "Yt", "dx", "dy", "dist", "bearing"]
    N = len(names)
    if prn:  # ---- just print the results ----------------------------------
        frmt = "Origin (0,0)\n" + "{:>10s}"*N
        print(frmt.format(*names))
        frmt = "{: 10.2f}"*N
        for i in data:
            print(frmt.format(*i))
        return data
    else:  # ---- produce a structured array from the output ----------------
        kind = ["<f8"]*N
        dt = list(zip(names, kind))
        out = np.zeros((data.shape[0],), dtype=dt)
        for i in range(N):
            nm = names[i]
            out[nm] = data[:, i]
        return out

# --- from the structured array, use NumPyArrayToFeatureclass
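
For example, something like this would produce the point featureclass (the geodatabase path and spatial reference are assumptions, not from the post):

out = dist_bearing(orig=(300000.0, 5025000.0),
                   bearings=np.arange(0, 361, 10.),
                   dists=np.full(37, 100.), prn=False)
SR = arcpy.SpatialReference(2951)      # ---- an assumed projected coordinate system
arcpy.da.NumPyArrayToFeatureClass(out, r"C:\temp\test.gdb\traverse",
                                  ['X_t', 'Yt'], SR)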

 

Again... more later

I rarely get a great idea.  This is another one of those cases.  But I do recognize a great idea when I see it.  Sadly I usually forget it, then rediscover it many months or years later.  I spotted a fantastic idea today and I had to drop everything (between answering questions) and jot it down.

 

The premise that twigged my interest was about how to 'combine' values of elevation, slope and aspect on Stack Overflow.  I did a wowwser!  Terrain derivatives on Stack!  I was enthralled.  The author.... unutbu ... of mysterious persona, put together a short missive using built-in numpy tools.  The code is short and sweet, works with floating point or integer rasters (aka, arrays) and is fast (at least in my testing so far).

 

Before I ruin the rest of my day with some droll corner case I hadn't thought of, I will jot this down now for posterity.

 

The large print

  • This doesn't require a Spatial Analyst License
  • A complete implementation (coming soon) uses arcpy's RasterToNumPyArray and NumPyArrayToRaster to communicate between the thin veil separating Arc* from the array world
  • Rasters/arrays with nodata values will be implemented
  • This is in the initial stages of development of a full-fledged tool, so hold your ...but!... comments

 

Start with 3 small rasters/arrays.  This way, you can see how the system works.

 

Array 'a':
array([[0, 0, 0, 4, 4, 4, 1, 1, 1],
       [0, 0, 0, 4, 4, 4, 1, 1, 1],
       [0, 0, 0, 4, 4, 4, 1, 1, 1],
       [2, 2, 2, 1, 1, 1, 2, 2, 2],
       [2, 2, 2, 1, 1, 1, 2, 2, 2],
       [2, 2, 2, 1, 1, 1, 2, 2, 2],
       [1, 1, 1, 4, 4, 4, 0, 0, 0],
       [1, 1, 1, 4, 4, 4, 0, 0, 0],
       [1, 1, 1, 4, 4, 4, 0, 0, 0]])
Array 'b':
array([[0, 0, 0, 1, 1, 1, 2, 2, 2],
       [0, 0, 0, 1, 1, 1, 2, 2, 2],
       [0, 0, 0, 1, 1, 1, 2, 2, 2],
       [3, 3, 3, 4, 4, 4, 5, 5, 5],
       [3, 3, 3, 4, 4, 4, 5, 5, 5],
       [3, 3, 3, 4, 4, 4, 5, 5, 5],
       [0, 0, 0, 1, 1, 1, 2, 2, 2],
       [0, 0, 0, 1, 1, 1, 2, 2, 2],
       [0, 0, 0, 1, 1, 1, 2, 2, 2]])

 

Array 'c':
array([[0, 0, 0, 3, 3, 3, 0, 0, 0],
       [0, 0, 0, 3, 3, 3, 0, 0, 0],
       [0, 0, 0, 3, 3, 3, 0, 0, 0],
       [1, 1, 1, 4, 4, 4, 1, 1, 1],
       [1, 1, 1, 4, 4, 4, 1, 1, 1],
       [1, 1, 1, 4, 4, 4, 1, 1, 1],
       [2, 2, 2, 5, 5, 5, 2, 2, 2],
       [2, 2, 2, 5, 5, 5, 2, 2, 2],
       [2, 2, 2, 5, 5, 5, 2, 2, 2]])
Unique combinations:
array([[0, 0, 0, 6, 6, 6, 1, 1, 1],
       [0, 0, 0, 6, 6, 6, 1, 1, 1],
       [0, 0, 0, 6, 6, 6, 1, 1, 1],
       [2, 2, 2, 7, 7, 7, 3, 3, 3],
       [2, 2, 2, 7, 7, 7, 3, 3, 3],
       [2, 2, 2, 7, 7, 7, 3, 3, 3],
       [4, 4, 4, 8, 8, 8, 5, 5, 5],
       [4, 4, 4, 8, 8, 8, 5, 5, 5],
       [4, 4, 4, 8, 8, 8, 5, 5, 5]])

 

So here it is.

def combine_(*arrs):
    """Combine arrays to produce a unique classification scheme
    :
    :References:
    :-----------
    : https://stackoverflow.com/questions/48035246/
    :       intersect-multiple-2d-np-arrays-for-determining-zones
    : original def find_labels(*arrs):
    """

    indices = [np.unique(arr, return_inverse=True)[1] for arr in arrs]
    M = np.array([item.max()+1 for item in indices])
    M = np.r_[1, M[:-1]]
    strides = M.cumprod()
    indices = np.stack(indices, axis=-1)
    vals = (indices * strides).sum(axis=-1)
    uniqs, labels = np.unique(vals, return_inverse=True)
    labels = labels.reshape(arrs[0].shape)
    return labels, uniqs
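
Calling it with the three arrays above returns the classification shown on the right, plus the packed value behind each class:

labels, uniqs = combine_(a, b, c)   # ---- a, b, c are the arrays shown above
labels                              # ---- the 'unique combinations' array
uniqs                               # ---- array([  0,   9,  38,  46,  49,  56,  79, 113, 127])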

 

Now stop reading here if you aren't interested in the details, but have a quasi-firm grasp on the process.  Read on if you must, but suffice to say unutbu was on to something.

 

-----------------------------  details .... details .... details... read at your peril  ------------------------------------------------------------

In essence, what the code does is find the unique values of each array shoved into the function.  In our case, the list of unique values is short (see the output below).  Not too onerous a combination slate.

The 'indices' line in the function can be modified to show the unique values in each array (take element [0] from np.unique instead of [1])... useful if they aren't sequential.  This is shown below.

induni = [np.unique(arr, return_inverse=True)[0] for arr in arrs]  # arrays a, b, c

[array([0, 1, 2, 4]), array([0, 1, 2, 3, 4, 5]), array([0, 1, 2, 3, 4, 5])]

 

The nifty bit, is how unutbu constructs the 'indices' in two steps, using the maximum value in each array, determines the unique combination, applies it to the arrays to get a unique value ('val') for each combination and then reassigns a class ('label' to it).

 

This process, not documented in the code, is expanded upon here.  In the snippet below, an array, 'combos', is constructed from the index values for each array plus the 'val' derived from the above process.

idx = [np.unique(arr, return_inverse=True)[1] for arr in arrs]
M = np.array([i.max() + 1 for i in idx])
strides = np.r_[1, M[:-1]].cumprod()                # ---- same strides as in combine_
vals = (np.stack(idx, axis=-1) * strides).sum(axis=-1)
dt = [('a', '<i8'), ('b', '<i8'), ('c', '<i8'), ('vals', '<i8')]
combos = np.array(list(zip(*idx, vals)), dtype=dt)

 

If the 'combos' array is sorted, the unique combinations can be extracted from it.  The values in the 'vals' column are used to assign a 'label' to each unique combination.

srt_order = np.argsort(combos, order=['a','b', 'c'])  # sort the combos

srted = combos[srt_order]  # ---- extract them in sorted order

np.unique(srted)  # ---- determine the unique combinations

array([(0, 0, 0,   0),
       (0, 2, 2,  56),
       (1, 0, 2,  49),
       (1, 2, 0,   9),
       (1, 4, 4, 113),
       (2, 3, 1,  38),
       (2, 5, 1,  46),
       (3, 1, 3,  79),
       (3, 1, 5, 127)],
      dtype=[('a', '<i8'), ('b', '<i8'),
             ('c', '<i8'), ('vals', '<i8')])

 

Hope it is clear as mud, and I have some experimenting to do, but I just thought I would throw it out there.

Yup... adding points to poly* features

Already offered... Densify at all license levels?  NO!

My missive .... Densification by Factor  YES!

 

 

There is also a Geodesic Densify, but I don't do Geographic in any event, so use esri tools

Why did I reinvent the wheel?  Because the basis of the densification is simply to take the sequence of X and Y locations and subdivide each intervening interval by some factor (hence the ... by Factor).

 

Here is an example....

a = np.array([[ 20.,  20.],
              [ 20.,  30.],
              [ 30.,  30.],
              [ 30.,  20.],
              [ 20.,  20.]])

 

Then you run some code

def _densify_2D(a, fact=2):
    """Densify a 2D array using np.interp.
    """

    # ---- interpolate x and y independently along the point sequence ----
    a = np.squeeze(a)
    n_fact = len(a) * fact
    b = np.arange(0, n_fact, fact)
    b_new = np.arange(n_fact - 1)
    c0 = np.interp(b_new, b, a[:, 0])
    c1 = np.interp(b_new, b, a[:, 1])
    n = c0.shape[0]
    c = np.zeros((n, 2))
    c[:, 0] = c0
    c[:, 1] = c1
    return c

And you get this.

a_dens = _densify_2D(a, fact=2)

a_dens

array([[ 20.,  20.],
       [ 20.,  25.],
       [ 20.,  30.],
       [ 25.,  30.],
       [ 30.,  30.],
       [ 30.,  25.],
       [ 30.,  20.],
       [ 25.,  20.],
       [ 20.,  20.]])
a.shape  # ---- (49874, 2) ... timing on a much larger input ----

%timeit _densify_2D(a, fact=2)

5.25 ms ± 323 µs per loop
     (mean ± std. dev. of 7 runs,
      100 loops each)

 

 

Where it really shines is for long features that just need to be densified without any particular requirement for a set spacing.  It is fairly speedy as well.

Hope someone finds this useful, if you don't have the Standard or Advanced license in particular.

This is the current state of the Point and PointGeometry .... not the best....

pt2 = arcpy.Point()

pt2
<Point (0.0, 0.0, #, #)>        # ---- make a point

pg2 = arcpy.PointGeometry(pt2)  # ---- make a point geometry

pg2.WKT                         # ---- why does it have valid X, Y values????
'POINT (0 0)'

So there is no such thing as a 'null' point (but a 'nominal' point as one of our esteemed participants noted). 

'None' isn't good enough.  You can create an 'empty' geometry recognized by other representations with a little trickery.

import numpy as np
import arcpy

n = np.nan

a = np.array([n, n, n, n])

pt = arcpy.Point()

pt.X, pt.Y, pt.Z, pt.M = a

pg = arcpy.PointGeometry(pt)

pg
<PointGeometry object at 0x200c6ed6ac8[0x200bf9a39e0]>

pg.WKT
'POINT EMPTY'

pt
<Point (nan, nan, nan, nan)>

 

And when you look at the PointGeometry, you decide which is best.

# ---- by using NaN (not a number) for point properties ----
pt        # <Point (nan, nan, nan, nan)>

pg = arcpy.PointGeometry(pt)

pg.JSON   # '{"x":"NaN","y":"NaN","spatialReference":{"wkid":null}}'

pg.WKT    #  'POINT EMPTY'

# ---- The current state of affairs ----

pt2       # <Point (0.0, 0.0, #, #)>

pg2 = arcpy.PointGeometry(pt2)

pg2.JSON  # '{"x":0,"y":0,"spatialReference":{"wkid":null}}'

pg2.WKT   # 'POINT (0 0)'

 

I should point out that you can use math.nan in place of np.nan

import math

m = math.nan
b = np.array([m, m, m, m])
pt3 = arcpy.Point()
pt3.X, pt3.Y, pt3.Z, pt3.M = b
pt3  # ---- <Point (nan, nan, nan, nan)>

pg3 = arcpy.PointGeometry(pt3)

pg3.JSON  # ---- '{"x":"NaN","y":"NaN","spatialReference":{"wkid":null}}'

pg3.WKT   # ---- 'POINT EMPTY'

Natural terrain is so fickle.  Wouldn't it be nice to have a sink-less terrain so that your flow direction, flow accumulation and stream generation would go as nature planned?

 

If you have a need to test algorithms, then natural surfaces aren't the best, but generating artificial terrain can be a pain.  They tend to have repetitive patterns and lack the finesse that a real surface has.

 

This is a brief look at one of the many algorithms that have been employed in early gaming software and largely ignored by the 'derivative' people.  The algorithm is pretty simple.... provide random seeds to the corners of a square array/raster, then divide it into triangles and squares in a predictable fashion, calculating the averages of the intervening space and assigning each central location the average of its pattern points.  Now, to mix it up, one can add some randomness to the calculated values to 'roughen' the surface, so to speak.

 

The following is the sequence of the pattern generation... the code is on my GitHub account and the Code Sharing site for those that want to explore on their own.  I repurposed an implementation adding a few twists, like leaving out the random jitters and the ability to replicate in both rows and columns should you want to generate a surface with some repetition.

 

More importantly, this is for fun and is a good way to learn how to use numpy arrays to work with array data that forms images or rasters in the GIS sense of the term.

 

Input template array with 4 corners defined...  The empty space is represented
by 'nan' aka, 'not a number', you could use zero, but 'not a number' is a
raster null, without having to get into the issues of masks and the like.

Step 1... throw in 4 corners

[[ 0.250  nan  nan  nan  nan  nan  nan  nan  0.750]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ 0.500  nan  nan  nan  nan  nan  nan  nan  1.000]]

Step 2... calculate the central location...

[[ 0.250  nan  nan  nan  nan  nan  nan  nan  0.750]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ nan  nan  nan  nan  0.625  nan  nan  nan  nan]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ 0.500  nan  nan  nan  nan  nan  nan  nan  1.000]]

Step 3...
Split into the diamond structure and calculate the compass points

[[ 0.250  nan  nan  nan  0.542  nan  nan  nan  0.750]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ 0.458  nan  nan  nan  0.625  nan  nan  nan  0.792]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ 0.500  nan  nan  nan  0.708  nan  nan  nan  1.000]]

Step 4....
Calculate the central values for the squares

[[ 0.250  nan  nan  nan  0.542  nan  nan  nan  0.750]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ nan  nan  0.469  nan  nan  nan  0.677  nan  nan]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ 0.458  nan  nan  nan  0.625  nan  nan  nan  0.792]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ nan  nan  0.573  nan  nan  nan  0.781  nan  nan]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ 0.500  nan  nan  nan  0.708  nan  nan  nan  1.000]]

Step ....
Repeat......
[[ 0.250  nan  0.420  nan  0.542  nan  0.656  nan  0.750]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ 0.392  nan  0.469  nan  0.578  nan  0.677  nan  0.740]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ 0.458  nan  0.531  nan  0.625  nan  0.719  nan  0.792]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ 0.510  nan  0.573  nan  0.672  nan  0.781  nan  0.858]
[ nan  nan  nan  nan  nan  nan  nan  nan  nan]
[ 0.500  nan  0.594  nan  0.708  nan  0.830  nan  1.000]]

Repeat
[[ 0.250  nan  0.420  nan  0.542  nan  0.656  nan  0.750]
[ nan  0.383  nan  0.502  nan  0.613  nan  0.706  nan]
[ 0.392  nan  0.469  nan  0.578  nan  0.677  nan  0.740]
[ nan  0.463  nan  0.551  nan  0.650  nan  0.732  nan]
[ 0.458  nan  0.531  nan  0.625  nan  0.719  nan  0.792]
[ nan  0.518  nan  0.600  nan  0.699  nan  0.787  nan]
[ 0.510  nan  0.573  nan  0.672  nan  0.781  nan  0.858]
[ nan  0.544  nan  0.637  nan  0.748  nan  0.867  nan]
[ 0.500  nan  0.594  nan  0.708  nan  0.830  nan  1.000]]


Step.... finally... all the squares and triangles are filled in

[[ 0.250  0.351  0.420  0.488  0.542  0.604  0.656  0.704  0.750]
[ 0.342  0.383  0.443  0.502  0.559  0.613  0.663  0.706  0.732]
[ 0.392  0.427  0.469  0.525  0.578  0.630  0.677  0.714  0.740]
[ 0.438  0.463  0.503  0.551  0.601  0.650  0.694  0.732  0.754]
[ 0.458  0.493  0.531  0.577  0.625  0.673  0.719  0.757  0.792]
[ 0.496  0.518  0.556  0.600  0.649  0.699  0.747  0.787  0.812]
[ 0.510  0.536  0.573  0.620  0.672  0.725  0.781  0.823  0.858]
[ 0.518  0.544  0.587  0.637  0.691  0.748  0.807  0.867  0.908]
[ 0.500  0.546  0.594  0.646  0.708  0.762  0.830  0.899  1.000]]

 

Of course, the image that follows uses a much larger array (1025x1025).  The generated array was sink-less, and flow properties and a stream network were generated from the result.  Pretty smooth eh?
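
The full code is on the linked sites, but here is a minimal sketch of the square/diamond averaging sequence (a simplified version of my own: the names are mine, and with the default jitter of 0 and n=3 it reproduces the 9x9 step-by-step values above):

import numpy as np

def diamond_square(n=3, corners=(0.25, 0.75, 0.5, 1.0), jitter=0.0, seed=None):
    """Sketch of the diamond-square sequence on a (2**n + 1) square grid.
    corners = (top-left, top-right, bottom-left, bottom-right); jitter > 0
    adds the randomness that 'roughens' the surface.
    """
    rng = np.random.default_rng(seed)
    size = 2**n + 1
    a = np.full((size, size), np.nan)
    a[0, 0], a[0, -1], a[-1, 0], a[-1, -1] = corners
    step = size - 1
    while step > 1:
        half = step // 2
        # ---- square step: centre of each square = mean of its 4 corners ----
        for r in range(half, size, step):
            for c in range(half, size, step):
                sq = [a[r - half, c - half], a[r - half, c + half],
                      a[r + half, c - half], a[r + half, c + half]]
                a[r, c] = np.mean(sq) + rng.uniform(-jitter, jitter)
        # ---- diamond step: edge midpoints = mean of their filled neighbours ----
        for r in range(0, size, half):
            start = half if (r % step == 0) else 0
            for c in range(start, size, step):
                nbrs = [a[r + dr, c + dc]
                        for dr, dc in ((-half, 0), (half, 0), (0, -half), (0, half))
                        if 0 <= r + dr < size and 0 <= c + dc < size]
                a[r, c] = np.mean(nbrs) + rng.uniform(-jitter, jitter)
        step = half
    return a

surf = diamond_square(n=3)   # ---- the 9x9 example; n=10 gives 1025x1025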

 

Next up... terrain generation from polygons and other vector geometry

Filter

Started this with rolling stats. 

 

Rolling Statistics for grid data... an alternative

 

Once you have a filter... kernel, whatever you want to call it... all you need is a moving window of some form and shape.  I won't go into those details since my previous work covers this.  I will just report a few simple ones, and show you some of the possible kernels that you can create.

 

# ---- my input array ----

Input array.......
-shape (1, 10, 10), ndim 3
.... 1 1 1 1 1 2 2 2 2 2 
.... 1 1 1 1 1 2 2 2 2 2 
.... 1 1 1 1 1 2 2 2 2 2 
.... 1 1 1 1 1 2 2 2 2 2 
.... 1 1 1 1 1 2 2 2 2 2 
.... 3 3 3 3 3 4 4 4 4 4 
.... 3 3 3 3 3 4 4 4 4 4 
.... 3 3 3 3 3 4 4 4 4 4 
.... 3 3 3 3 3 4 4 4 4 4 
.... 3 3 3 3 3 4 4 4 4 4

From the above, you can see that I kept it simple... 4 blocks of values, the perfect study area.

 

Let's see if we can pick up some lines that divide the raster/array

Sobel horizontal filter...
-shape (1, 3, 3), ndim 3

  . -1 -2 -1 
  .  0  0  0 
  .  1  2  1 


Sobel horizontal result.......
-shape (1, 10, 10), ndim 3, masked array, fill value 999999

.... -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 
.... -1  0  0  0  0  0  0  0  0 -1 
.... -1  0  0  0  0  0  0  0  0 -1 
.... -1  0  0  0  0  0  0  0  0 -1 
.... -1  8  8  8  8  8  8  8  8 -1 
.... -1  8  8  8  8  8  8  8  8 -1 
.... -1  0  0  0  0  0  0  0  0 -1 
.... -1  0  0  0  0  0  0  0  0 -1 
.... -1  0  0  0  0  0  0  0  0 -1 
.... -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 

See those 8's! nailed the horizontal divide.

 

vertical filter...

  . -1  0  1 
  . -2  0  2 
  . -1  0  1 

Sobel vertical result.......
-shape (1, 10, 10), ndim 3, masked array, fill value 999999
.... -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 
.... -1  0  0  0  4  4  0  0  0 -1 
.... -1  0  0  0  4  4  0  0  0 -1 
.... -1  0  0  0  4  4  0  0  0 -1 
.... -1  0  0  0  4  4  0  0  0 -1 
.... -1  0  0  0  4  4  0  0  0 -1 
.... -1  0  0  0  4  4  0  0  0 -1 
.... -1  0  0  0  4  4  0  0  0 -1 
.... -1  0  0  0  4  4  0  0  0 -1 
.... -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 

 

So easy... vertical nailed as well.

 

One more...

Array...
-shape (1, 3, 3), ndim 3
  . -1 -1  0 
  . -1  0  1 
  .  0  1  1 

Emboss vertical result.......

-shape (1, 10, 10), ndim 3, masked array, fill value 999999
.... -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 
.... -1  0  0  0  2  2  0  0  0 -1 
.... -1  0  0  0  2  2  0  0  0 -1 
.... -1  0  0  0  2  2  0  0  0 -1 
.... -1  4  4  4  6  6  4  4  4 -1 
.... -1  4  4  4  6  6  4  4  4 -1 
.... -1  0  0  0  2  2  0  0  0 -1 
.... -1  0  0  0  2  2  0  0  0 -1 
.... -1  0  0  0  2  2  0  0  0 -1 
.... -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 

 

How about some filters that I have found.

    n = np.nan
    all_f = [1, 1, 1, 1, 1, 1, 1, 1, 1]
    no_cnt = [1, 1, 1, 1, n, 1, 1, 1, 1]
    cross_f = [1, 0, 1, 0, 0, 0, 1, 0, 1]
    plus_f = [0, 1, 0, 1, 0, 1, 0, 1, 0]
    grad_e = [1, 0, -1, 2, 0, -2, 1, 0, -1]
    grad_n = [-1, -2, -1, 0, 0, 0, 1, 2, 1]
    grad_ne = [0, -1, -2, 1, 0, -1, 2, 1, 0]
    grad_nw = [2, -1, 0, -1, 0, 1, 0, 1, 2]
    grad_s = [1, 2, 1, 0, 0, 0, -1, -2, -1]
    grad_w = [-1, 0, 1, -2, 0, 2, -1, 0, 1]
    lap_33 = [0, -1, 0, -1, 4, -1, 0, -1, 0]
    line_h = [-1, -1, -1, 2, 2, 2, -1, -1, -1]
    line_ld = [2, -1, -1, -1, 2, -1, -1, -1, 2]
    line_rd = [-1, -1, 2, -1, 2, -1, 2, -1, -1]
    line_v = [-1, 0, -1, -1, 2, -1, -1, 2, -1]
    high = [-0.7, -1.0, -0.7, -1.0, 6.8, -1.0, -0.7, -1.0, -0.7]
    sob_hor = [-1, -2, -1, 0, 0, 0, 1, 2, 1]   # sobel y
    sob_vert = [-1, 0, 1, -2, 0, 2, -1, 0, 1]  # sobel x
    emboss = [-1, -1, 0, -1, 0, 1, 0, 1, 1]
    sharp1 = [0., -0.25, 0., -0.25, 2.0, -0.25, 0., -0.25, 0.]
    sharp2 = [-0.25, -0.25, -0.25, -0.25, 3.0, -0.25, -0.25, -0.25, -0.25]
    sharp3 = [-1, -1, -1, -1, 9, -1, -1, -1, -1]
    lowpass = [1, 2, 1, 2, 4, 2, 1, 2, 1]

    # ---- assemble the dictionary ----

    d = {1: all_f, 2: no_cnt, 3: cross_f, 4: plus_f, 5: grad_e,
         6: grad_n, 7: grad_ne, 8: grad_nw, 9: grad_s, 10: grad_w,
         11: lap_33, 12: line_h, 13: line_ld, 14: line_rd, 15: line_v,
         16: high, 17: sob_hor, 18: sob_vert, 19: emboss, 20: sharp1,
         21: sharp2, 23: sharp3, 24: lowpass}

    filter_ = np.array(d[mode]).reshape(3, 3)

You call the filter using the mode number (ie d[23] is sharp3).

import numpy as np
from numpy.lib.stride_tricks import as_strided

def stride(a, win=(3, 3), stepby=(1, 1)):
    """Provide a 2D sliding/moving view of an array.
    :  There is no edge correction for outputs. Use _pad first.
    """

    err = """Array shape, window and/or step size error.
    Use win=(3,3) with stepby=(1,1) for 2D array
    or win=(1,3,3) with stepby=(1,1,1) for 3D
    ----    a.ndim != len(win) != len(stepby) ----
    """

    assert (a.ndim == len(win)) and (len(win) == len(stepby)), err
    shape = np.array(a.shape)  # array shape (r, c) or (d, r, c)
    win_shp = np.array(win)    # window      (3, 3) or (1, 3, 3)
    ss = np.array(stepby)      # step by     (1, 1) or (1, 1, 1)
    newshape = tuple(((shape - win_shp) // ss) + 1)
    newshape += tuple(win_shp)
    newstrides = tuple(np.array(a.strides) * ss) + a.strides
    a_s = as_strided(a, shape=newshape, strides=newstrides).squeeze()
    return a_s


def a_filter(a, mode=1, ignore_nodata=True, nodata=None):
    """Apply a 3x3 filter to a 2D array.
    : mode - key into the kernel dictionary 'd' shown above
    """
    #
    # ---- look up the kernel, then stride the input array ----
    filter_ = np.array(d[mode]).reshape(3, 3)
    a_strided = stride(a)
    if ignore_nodata:    # ---- no nodata present, use the plain sum
        c = (np.sum(a_strided * filter_, axis=(2, 3)))
    else:                # ---- otherwise use the nan-aware sum
        c = (np.nansum(a_strided * filter_, axis=(2, 3)))
    pad_ = nodata
    if nodata is None:
        if c.dtype.name in ('int', 'int32', 'int64'):
            pad_ = min([0, -1, a.min()-1])
        else:
            pad_ = min([0.0, -1.0, a.min()-1])
    c = np.lib.pad(c, (1, 1), "constant", constant_values=(pad_, pad_))
    m = np.where(c == pad_, 1, 0)
    c = np.ma.array(c, mask=m, fill_value=None)
    return filter_, c

# ----------------------------------------------------------------------
# __main__ .... code section
if __name__ == "__main__":
    """Optionally...
    : - print the script source name.
    : - run the _demo
    """

#    you can put your stuff here... use RasterToNumPyArray to get an array
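#    a hedged example of a call... it assumes the kernel dictionary 'd' shown
#    earlier is defined at module level, and the raster path is made up
#        a = arcpy.RasterToNumPyArray(r"c:\temp\some_raster.tif")
#        filter_, out = a_filter(a, mode=17)   # 17 = sob_hor, Sobel horizontal
#        out  # a masked array, padded by one cell around the edge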

 

So that is pretty well some other stuff.

I should point out that the same process can be used for statistical calculations, terrain derivatives (slope, aspect, hillshade, curvature etc ) and I have reported on these before.

 

Now I should point out that none of the above requires the Spatial Analyst extension; RasterToNumPyArray and NumPyArrayToRaster are provided with the arcpy module itself.  Lots of possibilities, lots of capability built in.

 

More later.

Rolling stats...  The 3x3 moving/sliding/rolling window.. This kernel type will focus on returning statistics like your FocalStatistics functions... but without needing the Spatial Analyst extension...

 

They are used ....

And there are other uses....

 

First the results.

a      # ---- A grid/raster/array ----

array([[5, 1, 2, 0, 5, 4, 7, 3, 2, 2],
       [2, 5, 9, 9, 6, 9, 2, 1, 3, 0],
       [7, 2, 7, 4, 4, 4, 3, 7, 7, 7],
       [3, 9, 6, 8, 0, 1, 8, 5, 0, 7],
       [8, 4, 6, 8, 1, 9, 6, 9, 5, 5],
       [3, 9, 9, 5, 3, 1, 5, 5, 2, 3],
       [0, 5, 2, 0, 9, 4, 7, 9, 9, 0],
       [2, 5, 5, 4, 3, 7, 7, 8, 3, 5],
       [5, 2, 4, 5, 3, 6, 5, 2, 7, 2],
       [4, 6, 4, 9, 8, 6, 8, 6, 2, 0]])


a_s = stride(a, r_c=(3,3))  # ---- create an array of rolling windows 3x3 ----

# ---- Now calculate all the statistics --------------------------------------

rolling_stats(a_s, no_null=False, prn=True)
Min...
[[1 0 0 0 2 1 1 0]
[2 2 0 0 0 1 0 0]
[2 2 0 0 0 1 0 0]
[3 4 0 0 0 1 0 0]
[0 0 0 0 1 1 2 0]
[0 0 0 0 1 1 2 0]
[0 0 0 0 3 2 2 0]
[2 2 3 3 3 2 2 0]]
Max...
[[9 9 9 9 9 9 7 7]
[9 9 9 9 9 9 8 7]
[9 9 8 9 9 9 9 9]
[9 9 9 9 9 9 9 9]
[9 9 9 9 9 9 9 9]
[9 9 9 9 9 9 9 9]
[5 5 9 9 9 9 9 9]
[6 9 9 9 8 8 8 8]]
Mean...
[[ 4.44  4.33  5.11  5.    4.89  4.44  3.89  3.56]
[ 5.56  6.56  5.89  5.    4.11  4.44  4.    4.11]
[ 5.78  6.    4.89  4.33  4.    5.78  5.56  5.78]
[ 6.33  7.11  5.11  4.    3.78  5.44  5.    4.56]
[ 5.11  5.33  4.78  4.44  5.    6.11  6.33  5.22]
[ 4.44  4.89  4.44  4.    5.11  5.89  6.11  4.89]
[ 3.33  3.56  3.89  4.56  5.67  6.11  6.33  5.  ]
[ 4.11  4.89  5.    5.67  5.89  6.11  5.33  3.89]]
Med...
[[ 5.  4.  5.  4.  4.  4.  3.  3.]
[ 6.  7.  6.  4.  4.  4.  3.  5.]
[ 6.  6.  6.  4.  4.  6.  6.  7.]
[ 6.  8.  6.  3.  3.  5.  5.  5.]
[ 5.  5.  5.  4.  5.  6.  6.  5.]
[ 5.  5.  4.  4.  5.  7.  7.  5.]
[ 4.  4.  4.  4.  6.  7.  7.  5.]
[ 4.  5.  4.  6.  6.  6.  6.  3.]]
Sum...
[[40 39 46 45 44 40 35 32]
[50 59 53 45 37 40 36 37]
[52 54 44 39 36 52 50 52]
[57 64 46 36 34 49 45 41]
[46 48 43 40 45 55 57 47]
[40 44 40 36 46 53 55 44]
[30 32 35 41 51 55 57 45]
[37 44 45 51 53 55 48 35]]
Std...
[[ 2.67  3.2   2.85  2.62  2.02  2.5   2.28  2.59]
[ 2.59  2.36  2.73  3.09  2.88  2.83  2.71  2.96]
[ 2.2   2.16  2.73  3.16  2.98  2.62  2.59  2.39]
[ 2.4   1.79  3.    3.37  3.15  2.83  2.58  2.5 ]
[ 3.    2.91  3.26  3.34  2.87  2.56  2.26  3.08]
[ 2.91  2.73  2.83  2.62  2.42  2.28  2.38  3.03]
[ 1.76  1.71  2.33  2.45  1.94  2.02  2.36  3.2 ]
[ 1.29  1.79  2.    2.    1.79  1.73  2.31  2.56]]
Var...
[[  7.14  10.22   8.1    6.89   4.1    6.25   5.21   6.69]
[  6.69   5.58   7.43   9.56   8.32   8.02   7.33   8.77]
[  4.84   4.67   7.43  10.     8.89   6.84   6.69   5.73]
[  5.78   3.21   8.99  11.33   9.95   8.02   6.67   6.25]
[  8.99   8.44  10.62  11.14   8.22   6.54   5.11   9.51]
[  8.47   7.43   8.02   6.89   5.88   5.21   5.65   9.21]
[  3.11   2.91   5.43   6.02   3.78   4.1    5.56  10.22]
[  1.65   3.21   4.     4.     3.21   2.99   5.33   6.54]]
Range

# ---- If you have nodata values, they can be accommodated ------------------

 

Now the code functions

 

import numpy as np
from textwrap import indent
from numpy.lib.stride_tricks import as_strided

def _check(a, r_c, subok=False):
    """Performs the array checks necessary for stride and block.
    : a   - Array or list.
    : r_c - tuple/list/array of rows x cols.
    : subok - from numpy 1.12 added, keep for now
    :Returns:
    :------
    :Attempts will be made to ...
    :  produce a shape at least (1*c).  For a scalar, the
    :  minimum shape will be (1*r) for 1D array or (1*c) for 2D
    :  array if r<c.  Be aware
    """

    if isinstance(r_c, (int, float)):
        r_c = (1, int(r_c))
    r, c = r_c
    if a.ndim == 1:
        a = np.atleast_2d(a)
    r, c = r_c = (min(r, a.shape[0]), min(c, a.shape[1]))
    a = np.array(a, copy=False, subok=subok)
    return a, r, c, tuple(r_c)


def _pad(a, nan_edge=False):
    """Pad a sliding array to allow for stats"""
    if nan_edge:
        a = np.pad(a, pad_width=(1, 2), mode="constant",
                   constant_values=(np.NaN, np.NaN))
    else:
        a = np.pad(a, pad_width=(1, 1), mode="reflect")
    return a


def stride(a, r_c=(3, 3)):
    """Provide a 2D sliding/moving view of an array.
    :  There is no edge correction for outputs.
    :
    :Requires:
    :--------
    : _check(a, r_c) ... Runs the checks on the inputs.
    : a - array or list, usually a 2D array.  Assumes rows is >=1,
    :     it is corrected as is the number of columns.
    : r_c - tuple/list/array of rows x cols.  Attempts  to
    :     produce a shape at least (1*c).  For a scalar, the
    :     minimum shape will be (1*r) for 1D array or 2D
    :     array if r<c.  Be aware
    """

    a, r, c, r_c = _check(a, r_c)
    shape = (a.shape[0] - r + 1, a.shape[1] - c + 1) + r_c
    strides = a.strides * 2
    a_s = (as_strided(a, shape=shape, strides=strides)).squeeze()
    return a_s


# ----------------------------------------------------------------------
# (18) rolling stats .... code section
def rolling_stats(a, no_null=True, prn=True):
    """Statistics on the last two dimensions of an array.
    :Requires
    :--------
    : a - 2D array  **Note, use 'stride' above to obtain rolling stats
    : no_null - boolean, whether to use masked values (nan) or not.
    : prn - boolean, to print the results or return the values.
    :
    :Returns
    :-------
    : The results return an array of 4 dimensions representing the original
    : array size and block size
    : eg.  original = 6x6 array   block = 3x3
    :      breaking the array into 4 chunks
    """

    a = np.asarray(a)
    a = np.atleast_2d(a)
    ax = None
    if a.ndim > 1:
        ax = tuple(np.arange(len(a.shape))[-2:])
    if no_null:
        a_min = a.min(axis=ax)
        a_max = a.max(axis=ax)
        a_mean = a.mean(axis=ax)
        a_med = np.median(a, axis=ax)
        a_sum = a.sum(axis=ax)
        a_std = a.std(axis=ax)
        a_var = a.var(axis=ax)
        a_ptp = a_max - a_min
    else:
        a_min = np.nanmin(a, axis=(ax))
        a_max = np.nanmax(a, axis=(ax))
        a_mean = np.nanmean(a, axis=(ax))
        a_med = np.nanmedian(a, axis=(ax))
        a_sum = np.nansum(a, axis=(ax))
        a_std = np.nanstd(a, axis=(ax))
        a_var = np.nanvar(a, axis=(ax))
        a_ptp = a_max - a_min
    if prn:
        s = ['Min', 'Max', 'Mean', 'Med', 'Sum', 'Std', 'Var', 'Range']
        frmt = "...\n{}\n".join([i for i in s])
        v = [a_min, a_max, a_mean, a_med, a_sum, a_std, a_var, a_ptp]
        args = [indent(str(i), '... ') for i in v]
        print(frmt.format(*args))
    else:
        return a_min, a_max, a_mean, a_med, a_sum, a_std, a_var, a_ptp

Curses at cursors... pretty well a regular occurrence on this site.  People love them and love to hate them.

They try to nest them, and nest them within 'for' loops and 'with' statements with calls that sound poetic ( ... for row in rows, with while for arc thou ... )

 

There are old cursors and new cursors (the da-less and the da cursors).  Cursors appear in other guises such as the new, and cleverly named, 'arcgis' module (digression # 1 ... really? something else with arcgis in it! Who is in charge of branding).

 

Perhaps cursors are cloaked, in other arcpy and  data access module methods (ie. blank-to-NumPyArray and NumPyArray-to-blank).  Who knows for sure since much is locked in arcgisscripting.pyd.

 

Sadly, we deal in a world of mixed data types.  Our tables contain columns of attributes, organized sequentially by rows.  Sometimes the row order has meaning, sometimes not.  Each column contains one data type in a well-formed data structure.  This is why spreadsheets are purely evil for trying to create and maintain data structure, order and form (you can put anything anywhere).

 

The interlude... An example of searchcursors and their array roots
in_tbl = r"C:\Git_Dan\arraytools\Data\numpy_demos.gdb\sample_10k"

sc = arcpy.da.SearchCursor(in_tbl, "*")             # the plain searchcursor
a = arcpy.da.SearchCursor(in_tbl, "*")._as_narray() # using one of its magic methods
b = arcpy.da.TableToNumPyArray(in_tbl, "*")         # using the cloak

np.all(a == b)  # True           ..... everything is equal .......

sc._dtype           # .... here are the fields and dtype in the table ....
  dtype([('OBJECTID', '<i4'), ('f0', '<i4'), ('County', '<U4'),
         ('Town', '<U12'), ('Facility', '<U16'), ('Time', '<i4'),
         ('Test', '<U24')])

sc.fields       # .... just the field names, no need for arcpy.ListFields(...) .....
  ('OBJECTID', 'f0', 'County', 'Town', 'Facility', 'Time', 'Test')


# ---- Some timing... 10,000 records ----
%timeit arcpy.da.SearchCursor(in_tbl, "*")
153 ms ± 12.3 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit arcpy.da.SearchCursor(in_tbl, "*")._as_narray()
40.4 ms ± 970 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit arcpy.da.TableToNumPyArray(in_tbl, "*")
52.1 ms ± 9.32 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

 

If someone can explain why the plain searchcursor is slower than its dressed up (or down?) counterparts, I would love to hear about it.

 

 

Back to the main event

Harkening back to the fields of mathematics, arrays are assemblages of data, in 1, 2, 3 or more dimensions.  If an array of any dimension has a uniform data type, then life is easier from a structural and usage perspective (this is one reason why Remote Sensing is easier than GIS ... bring on the mail).  We need to maintain an index which ties our geometry to our attributes so what goes where, and where is what, doesn't get mixed up (digression # 2... I am sure this isn't what the branders meant by The Science of Where but we can only hope)

 

Nerd Stuff 

Enough with the boring stuff... bring on the code.

Some of this has been looked at from a slightly different perspective in 

 

Get to the Points... arcpy, numpy, pandas

We need some data to work with so... how about a square.

    in_fc = r"C:\Your_spaceless_path\Your.gdb\square"

The 'Describe' object

 

The 'describe' object does just that: describes an object, in this case a FeatureClass.

desc = arcpy.da.Describe(in_fc)

In the newer arcpy.da module, the values can be accessed from a dictionary.  A quick way to get the sorted dictionary keys is to use a list comprehension.  If you want the values, then you can obtain them in a similar fashion.

sk = sorted([k for k in desc.keys()])   # sorted keys
kv = [(k, desc[k]) for k in sk]         # key/value pairs
kv_txt = "\n".join(["{!s:<20} {}".format(k, desc[k]) for k in sk])  # formatted for printing

Some useful keys associated with featureclasses are shown below (with appropriate snips in the full list):

[..., 'MExtent', 'OIDFieldName', 'ZExtent', 'aliasName', 'areaFieldName',
  'baseName', ... 'catalogPath', ... 'dataElementType', 'dataType',
  'datasetType', ... 'extent', 'featureType', 'fields', 'file', ... 'hasM',
  'hasOID', 'hasSpatialIndex', 'hasZ', 'indexes', ... 'lengthFieldName',
  ... 'name', 'path', ... 'shapeFieldName', 'shapeType', 'spatialReference',
  ...]

The Cursor object

A cursor gives access to the geometry and attributes of a featureclass.  It is always recommended to use the spatial reference when creating the cursor.  Historically (fixed?), if it was omitted, geometry discrepancies would arise.  This information can easily be obtained from the describe object from the previous section.

SR = desc['spatialReference']           # the spatial reference, from Describe
flds = "*"
args = [in_fc, flds, None, SR, True, (None, None)]
cur = arcpy.da.SearchCursor(*args)      # get the search cursor object
See what it reveals...
dir(cur)

['__class__', '__delattr__', '__dir__', '__doc__', '__enter__', '__eq__',
'__esri_toolinfo__', '__exit__', '__format__', '__ge__', '__getattribute__',
'__getitem__', '__gt__', '__hash__', '__init__', '__iter__', '__le__',
'__lt__', '__ne__', '__new__', '__next__', '__reduce__', '__reduce_ex__',
'__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__',
'_as_narray', '_dtype', 'fields', 'next', 'reset']
 
Individual properties are:
cur.__class__
<class 'da.SearchCursor'>

mro = cur.__class__.__mro__
mro
(<class 'da.SearchCursor'>, <class 'object'>)

The search cursor inherits from 'object', and the difference in the properties and methods offered by the da.SearchCursor can be determined as follows:

s0 = set(dir(mro[0]))
s1 = set(dir(mro[1]))

sorted(list(set.difference(s0, s1)))

 ['__enter__', '__esri_toolinfo__', '__exit__', '__getitem__', '__iter__',
  '__next__', '_as_narray', '_dtype', 'fields', 'next', 'reset']
 
The cursor offers the means to process its objects.
cur.__esri_toolinfo__
 ['FeatureLayer|Table|TableView|Dataset|FeatureDataset::::', 'String::*::',
  'Python::None::', 'CoordinateSystem::::']
 
This handy one returns a numpy structured/recarray:

type(cur._as_narray())
 <class 'numpy.ndarray'>

How about information on the attributes of in_fc (more about this later)?

cur._as_narray().__array_interface__

 {'version': 3,
  'strides': None,
  'shape': (0,),
  'typestr': '|V36',
  'descr': [('OBJECTID', '<i4'),
            ('Shape', '<f8', (2,)),
            ('Shape_Length', '<f8'),
            ('Shape_Area', '<f8')],
  'data': (2044504703824, False)}
 
We got a glimpse of the field names and data types from the __array_interface__, but this information can be accessed directly as well.
cur.fields
 ('OBJECTID', 'Shape', 'Shape_Length', 'Shape_Area')

cur._dtype
dtype([('OBJECTID', '<i4'),
       ('Shape', '<f8', (2,)),
       ('Shape_Length', '<f8'),
       ('Shape_Area', '<f8')])    
Now, the gotcha's.  We created our search cursor at the beginning and each record was cycled through until it reached the end.  If we attempt to get its properties we may be in for a surprise, so we need to 'reset' the cursor back to the start.
cur._as_narray()  # try to get its properties, all we get is the dtype
array([],
    dtype=[('OBJECTID', '<i4'), ('Shape', '<f8', (2,)),
     ('Shape_Length', '<f8'), ('Shape_Area', '<f8')])
Once the cursor is reset, the array values for the square are revealed with the appropriate data type.
cur.reset()
cur._as_narray()  # reset to the beginning  

array([(1, [342000.0, 5022000.0], 4000.0, 1000000.0),
       (1, [342000.0, 5023000.0], 4000.0, 1000000.0),
       (1, [343000.0, 5023000.0], 4000.0, 1000000.0),
       (1, [343000.0, 5022000.0], 4000.0, 1000000.0),
       (1, [342000.0, 5022000.0], 4000.0, 1000000.0)],
    dtype=[('OBJECTID', '<i4'), ('Shape', '<f8', (2,)),
           ('Shape_Length', '<f8'), ('Shape_Area', '<f8')])
    
There is no automatic reset, so be careful.  You can print the objects in the array in a couple of ways.
cur.reset()
for row in cur:
   print(("{} "*len(row)).format(*row))  # print individual elements

  1 (342000.0, 5022000.0) 4000.0 1000000.0
 1 (342000.0, 5023000.0) 4000.0 1000000.0
 1 (343000.0, 5023000.0) 4000.0 1000000.0
 1 (343000.0, 5022000.0) 4000.0 1000000.0
 1 (342000.0, 5022000.0) 4000.0 1000000.0
Resetting the cursor, and print again.
cur.reset()
for row in cur:
    print(row)  # print the whole row as a tuple

   (1, (342000.0, 5022000.0), 4000.0, 1000000.0)
   (1, (342000.0, 5023000.0), 4000.0, 1000000.0)
   (1, (343000.0, 5023000.0), 4000.0, 1000000.0)
   (1, (343000.0, 5022000.0), 4000.0, 1000000.0)
   (1, (342000.0, 5022000.0), 4000.0, 1000000.0)
Of course since generator-like objects can be converted to a list, that can be done as an alternative, particularly if you have the memory and wish to deal with list objects instead.
cur.reset()
list(cur)
 [(1, (342000.0, 5022000.0), 4000.0, 1000000.0),
  (1, (342000.0, 5023000.0), 4000.0, 1000000.0),
  (1, (343000.0, 5023000.0), 4000.0, 1000000.0),
  (1, (343000.0, 5022000.0), 4000.0, 1000000.0),
  (1, (342000.0, 5022000.0), 4000.0, 1000000.0)]
 
So if you know the data type of the components of the cursor, you can go to the ndarray in an indirect fashion.
cur.reset()
dt = cur._dtype
c_lst = list(cur)

np.asarray(c_lst, dtype=dt)

array([(1, [342000.0, 5022000.0], 4000.0, 1000000.0),
       (1, [342000.0, 5023000.0], 4000.0, 1000000.0),
       (1, [343000.0, 5023000.0], 4000.0, 1000000.0),
       (1, [343000.0, 5022000.0], 4000.0, 1000000.0),
       (1, [342000.0, 5022000.0], 4000.0, 1000000.0)],
    dtype=[('OBJECTID', '<i4'), ('Shape', '<f8', (2,)),
           ('Shape_Length', '<f8'), ('Shape_Area', '<f8')])
The ndarray can be viewed as a record array.  Since the data type and structure remain the same, we simply take a 'view' of the array as a record array (recarray).  Record arrays allow the user to slice the array using conventional array slicing or by object dot notation.  
a = a.view(np.recarray)

a.Shape == a['Shape']  # check to see if slicing equals dot notation

array([[ True,  True],
      [ True,  True],
       [ True,  True],
       [ True,  True],
       [ True,  True]], dtype=bool)
Or more simply...
np.all(a.Shape == a['Shape'])
True       

a.Shape  # or a['Shape']

array([[  342000.,  5022000.],
       [  342000.,  5023000.],
       [  343000.,  5023000.],
       [  343000.,  5022000.],
       [  342000.,  5022000.]])
 
You can calculate the properties of the objects simply, but in the case of polygons, the duplicate start/end point should be reduced to a singleton.  In the examples, the object's shape is obtained, then the desired property is derived on a column basis.
pnts = a.Shape[:-1]       # get the unique points

cent = pnts.mean(axis=0)  # return the mean by column

cent
array([  342500.,  5022500.])
 
With some fancy work, and calling one of my previously defined array functions in the 'arraytools' module, you can do things like determine interpoint distances.
import arraytools as art

art.e_dist(cent, pnts)

array([ 707.11,  707.11,  707.11,  707.11])
Which is correct given the square polygon's shape.
Another example, to demonstrate array functions.  In the case of polygons, it is useful to have the first and last point (ie duplicates) retained to ensure closure of the polygon. 
poly = a.Shape

art.e_leng(poly)  # method to return polygon perimeter/length, total, then by segment

 (4000.0, [array([[ 1000.,  1000.,  1000.,  1000.]])])

art.e_area(poly)

 1000000.0
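
Again, these are not the arraytools implementations, just bare-bones numpy sketches of the same ideas, assuming a closed (N, 2) ring like 'poly' above:

import numpy as np

def perim(poly):
    """Segment lengths and total length of a closed (N, 2) ring."""
    diff = np.diff(poly, axis=0)                      # segment vectors
    seg = np.sqrt(np.einsum('ij,ij->i', diff, diff))  # segment lengths
    return seg.sum(), seg

def shoelace_area(poly):
    """Absolute area of a closed (N, 2) ring using the shoelace formula."""
    x, y = poly[:, 0], poly[:, 1]
    return 0.5 * abs(np.dot(x[:-1], y[1:]) - np.dot(x[1:], y[:-1]))

perim(poly)          # -> (4000.0, array([ 1000.,  1000.,  1000.,  1000.]))
shoelace_area(poly)  # -> 1000000.0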
 
------------------------------------------------------------------------------ 
Working with cursors
--------------------
Cursors can access the columns in the tabular data in a variety of ways.  One of the easiest to follow is to simply refer to the columns by the order in which they appear when they were retrieved.  This is fine if one writes scripts in the same way.  In the example that follows, the list of fields to be used with the cursor operations is defined as:
in_flds = ['OID@', 'SHAPE@X', 'SHAPE@Y', 'Int_fld', 'Float_fld', 'String_fld']

 

When using the above notation, the position of the fields is used to reference their values.   So you may see code that uses ' for row in cursor '  with row[0] being the feature object id (OID@) and row[3] being the value from an integer field (Int_fld).  If you are like me, anything beyond 2 means you are finger counting, remembering that Python counting is zero-based.  I now prefer to spend the extra time assigning variable names rather than using positional notation.  You can see this in the tuple unpacking inside the for loop below.

in_fc = r'C:\Folder\path_to\A_Geodatabase.gdb\FeatureClass'   # or Table

desc = arcpy.Describe(in_fc)
SR = desc.spatialReference
in_flds = ['OID@', 'SHAPE@X', 'SHAPE@Y', 'Int_fld', 'Float_fld', 'String_fld']
where_clause = None
spatial_reference = SR
explode_to_points = True
sql_clause = (None, None)

results = []
with arcpy.da.SearchCursor(in_fc, in_flds, where_clause, spatial_reference,
                           explode_to_points, sql_clause) as cursor:
    for id, x, y, i_val, f_val, s_val in cursor:
        if id > 10:
            pass   # ---- do stuff
        else:
            pass   # ---- do other stuff
        results.append([id, x, y, i_val, f_val, s_val])
# results now holds one record per feature

-----------------------------------------------------------------------------
arcgisscripting
---------------
arcgisscripting can be located in your ArcGIS Pro distribution once everything is imported (check arcgisscripting.__file__).  It lives in the installation path (substitute your Pro path for C:\\ArcPro; everything else is the same):
'C:\\ArcPro\\bin\\Python\\envs\\arcgispro-py3\\lib\\site-packages\\arcgisscripting.pyd'
Importing arcpy also imports parts of arcgisscripting, and it brings in the geoprocessor from
C:\ArcPro\Resources\ArcPy\arcpy\geoprocessing\__init__
which imports _base.py, where the Geoprocessor class is used as 'gp'.

dir(arcgisscripting)
['ExecuteAbort', 'ExecuteError', 'ExecuteWarning', 'NumPyArrayToRaster', 'Raster', 'RasterToNumPyArray', '__cleanup__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '_addTimeInterval', '_analyzeForSD', '_chart', '_convertWebMapToMapDocument', '_createGISServerConnectionFile', '_createGeocodeSDDraft', '_createMapSDDraft', '_createimageservicesddraft', '_getImageEXIFProperties', '_getUTMFromLocation', '_hasLocalFunctionRasterImplementation', '_listDateTimeStringFormats', '_listStyleItems', '_listTimeZones', '_mapping', '_ss', '_wrapLocalFunctionRaster', '_wrapToolRaster', 'arcgis', 'create', 'da', 'getmytoolboxespath', 'getsystemtoolboxespath', 'getsystemtoolboxespaths', 'na']

dir(arcgisscripting.da)
['Describe', 'Domain', 'Editor', 'ExtendTable', 'FeatureClassToNumPyArray', 'InsertCursor', 'ListDomains', 'ListFieldConflictFilters', 'ListReplicas', 'ListSubtypes', 'ListVersions', 'NumPyArrayToFeatureClass', 'NumPyArrayToTable', 'Replica', 'SearchCursor', 'TableToNumPyArray', 'UpdateCursor', 'Version', 'Walk', '__doc__', '__loader__', '__name__', '__package__', '__spec__', '_internal_eq', '_internal_sd', '_internal_vb']
References
----------

Other discussions

https://community.esri.com/docs/DOC-10416-are-searchcursors-brutally-slow-they-need-not-be

----------

More later...

Have a look.  The code is as verbose as possible, and it is based on a generic function that I use.

Rules

  1. Do not have Pro or Map open
  2. Run your code
  3. Open the project when you are done.
  4. Review rules 1-3
  5. You can sort using multiple fields or a single field.  You can reverse the sort by slicing and reversing at the same time... arr_sort = arr_sort[::-1] ... way quicker than trying to set all the options on a dialog.  This can be done on a single column or on multiple columns (see the sketch after the ExtendTable note below).
  6. Still too confusing? .... attach the code to a new script in a toolbox... it doesn't need any parameters, but you have to substitute the paths.  ExtendTable requires a close tie to Pro, even if your external python IDE is set as the default editor when you set up Pro.

ExtendTable will do a permanent join if the script is run from a tool within ArcToolbox, making it a nice alternative to the standard join procedure.  
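
Here is a minimal sketch of that sort-and-join idea.  The table path and the County/Town field names are hypothetical stand-ins... substitute your own.  It pulls the table into a structured array, sorts on two fields, reverses the order, then joins a rank column back with ExtendTable.

import numpy as np
import arcpy

in_tbl = r"C:\Your_Geodatabase.gdb\your_table"   # hypothetical path

# ---- table to structured array, multi-field sort, then reverse ----
arr = arcpy.da.TableToNumPyArray(in_tbl, ['OBJECTID', 'County', 'Town'])
arr_sort = np.sort(arr, order=['County', 'Town'])
arr_sort = arr_sort[::-1]                        # reverse, one slice

# ---- build a (key, rank) array and join it back to the table ----
out = np.zeros(arr_sort.shape[0], dtype=[('IDs', '<i4'), ('Rank', '<i4')])
out['IDs'] = arr_sort['OBJECTID']
out['Rank'] = np.arange(1, arr_sort.shape[0] + 1)
arcpy.da.ExtendTable(in_tbl, 'OBJECTID', out, 'IDs')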

 

Numpy has functionality within the recfunctions module located at https://github.com/numpy/numpy/blob/master/numpy/lib/recfunctions.py 

These would include

__all__ = [
'append_fields', 'drop_fields', 'find_duplicates',
'get_fieldstructure', 'join_by', 'merge_arrays',
'rec_append_fields', 'rec_drop_fields', 'rec_join',
'recursive_fill_fields', 'rename_fields', 'stack_arrays',
]

So you have options to do your work within or without Arc*, depending on what is easiest in the workflow.
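
For instance, append_fields on a small structured array (the field names here are made up for illustration):

import numpy as np
from numpy.lib import recfunctions as rfn

tbl = np.array([(1, 4000.0), (2, 3000.0)],
               dtype=[('IDs', '<i4'), ('Length', '<f8')])

out = rfn.append_fields(tbl, 'Rank', data=np.array([2, 1]), usemask=False)
out.dtype.names  # -> ('IDs', 'Length', 'Rank')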

 

Less than 1000 words too.

 


Let's get to the point... in this case, the points of a square.

 

We can convert it to a structured array using nothing but arcpy methods, a mix of arcpy and numpy, and more recently, an extension into Pandas.

 

Here is what they look like using various functions.

-----------------------------------------------------------------------------------------------------------------------------------------------------------

1.  Rudimentary steps

in_fc = r"C:\Git_Dan\a_Data\testdata.gdb\square"
flds = ['SHAPE@X', 'SHAPE@Y']
cur = arcpy.da.SearchCursor(in_fc, flds, None, None, True, (None, None))
a = cur._as_narray()

# ----- the reveal ----
array([(342000.0, 5022000.0), (342000.0, 5023000.0), (343000.0, 5023000.0),
       (343000.0, 5022000.0),  (342000.0, 5022000.0)],
      dtype=[('SHAPE@X', '<f8'), ('SHAPE@Y', '<f8')])

 

We can also do it directly using another arcpy.da interface, and it yields the same results.  Perhaps there is a direct link between the cursor's _as_narray and FeatureClassToNumPyArray; the details are buried within a *.pyd.  Nonetheless, the journey reaches the same destination.

flds = ['SHAPE@X', 'SHAPE@Y']
arcpy.da.FeatureClassToNumPyArray(in_fc, field_names=flds, explode_to_points=True)

array([(342000.0, 5022000.0), (342000.0, 5023000.0), (343000.0, 5023000.0),
       (343000.0, 5022000.0), (342000.00000000186, 5022000.0)],
      dtype=[('SHAPE@X', '<f8'), ('SHAPE@Y', '<f8')])

 

In its absolute simplest form, the searchcursor simply needs a list of fields and a flag to convert a polygon to points.  The _as_narray method handles the conversion to a numpy structured array with a specified dtype (2 floating point numbers) and a shape ((5,)) indicating that there are 5 pairs of numbers forming the polygon.
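
You can confirm both directly on the array returned in step 1:

a.dtype   # dtype([('SHAPE@X', '<f8'), ('SHAPE@Y', '<f8')])
a.shape   # (5,)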

 

-----------------------------------------------------------------------------------------------------------------------------------------------------------

2.  To an ndarray with just the coordinates

The _xy function yields a simpler form...

_xy(in_fc)

    array([[  342000.,  5022000.],
           [  342000.,  5023000.],
           [  343000.,  5023000.],
           [  343000.,  5022000.],
           [  342000.,  5022000.]])
... because it uses arcpy's searchcursor and the _as_narray method to speedily get the geometry (the cursor setup in the function below).  At that point, the array's data type is converted (as a 'view') to 64-bit floating point numbers and reshaped to N rows of 2 coordinates each.
def _xy(in_fc):
    """Convert featureclass geometry (in_fc) to a simple 2D point array.
    :  See _xyID if you need id values.
    """
    flds = ['SHAPE@X', 'SHAPE@Y']
    args = [in_fc, flds, None, None, True, (None, None)]
    cur = arcpy.da.SearchCursor(*args)
    a = cur._as_narray()
    N = len(a)
    a = a.view(dtype='float64')
    a = a.reshape(N, 2)
    del cur
    return a

 

Working with either form of the returned array has its advantages and disadvantages.  I won't cover those here, but it is worthy to note, that I regularly use both forms of array.

 

-----------------------------------------------------------------------------------------------------------------------------------------------------------

3.  Throw in the IDs

Now, we can go back to working with a structured array if we need to maintain different data types in our columns.  This can be accomplished using _xyID, which returns the same point values with the addition of the identification number of the polygon each point came from (ie. the polygon with objectid 1).  Rather than the cumbersome field names that we got with the raw cursor output, I decided to provide simpler names by specifying a dtype whose names were different.  This can be done as long as you don't try to alter the underlying data type (more on this later).

 

def _xyID(in_fc, to_pnts=True):
    """Convert featureclass geometry (in_fc) to a simple 2D structured array
    :  with ID, X, Y values. Optionally convert to points, otherwise centroid.
    """
    flds = ['OID@', 'SHAPE@X', 'SHAPE@Y']
    args = [in_fc, flds, None, None, to_pnts, (None, None)]
    cur = arcpy.da.SearchCursor(*args)
    a = cur._as_narray()
    a.dtype = [('IDs', '<i4'), ('Xs', '<f8'), ('Ys', '<f8')]
    del cur
    return a

# ---- yields ----
array([(1, 342000.0, 5022000.0), (1, 342000.0, 5023000.0), (1, 343000.0, 5023000.0),
       (1, 343000.0, 5022000.0), (1, 342000.0, 5022000.0)],
      dtype=[('IDs', '<i4'), ('Xs', '<f8'), ('Ys', '<f8')])

 

 

-----------------------------------------------------------------------------------------------------------------------------------------------------------

4.  How about some attributes

Yes, but what about the attributes?  Let's get fancy and use _ndarray, which will pull in all the data from the table as well.

 

def _ndarray(in_fc, to_pnts=True, flds=None, SR=None):
    """Convert featureclass geometry (in_fc) to a structured ndarray including
    :  options to select fields and specify a spatial reference.
    :
    :Requires:
    :--------
    : in_fc - input featureclass
    : to_pnts - True, convert the shape to points. False, centroid returned.
    : flds - '*' for all, others: 'Shape',  ['SHAPE@X', 'SHAPE@Y'], or specify
    """
    if flds is None:
        flds = "*"
    if SR is None:
        desc = arcpy.da.Describe(in_fc)
        SR = desc['spatialReference']
    args = [in_fc, flds, None, SR, to_pnts, (None, None)]
    cur = arcpy.da.SearchCursor(*args)
    a = cur._as_narray()
    del cur
    return a

array([ (1, [342000.0, 5022000.0], 342000.0, 5022000.0, 343000.0, 5023000.0, 4000.0, 1000000.0),
       (1, [342000.0, 5023000.0], 342000.0, 5022000.0, 343000.0, 5023000.0, 4000.0, 1000000.0),
       (1, [343000.0, 5023000.0], 342000.0, 5022000.0, 343000.0, 5023000.0, 4000.0, 1000000.0),
       (1, [343000.0, 5022000.0], 342000.0, 5022000.0, 343000.0, 5023000.0, 4000.0, 1000000.0),
       (1, [342000.0, 5022000.0], 342000.0, 5022000.0, 343000.0, 5023000.0, 4000.0, 1000000.0)],
      dtype=[('OBJECTID', '<i4'), ('Shape', '<f8', (2,)), ('EXT_MIN_X', '<f8'),
             ('EXT_MIN_Y', '<f8'), ('EXT_MAX_X', '<f8'), ('EXT_MAX_Y', '<f8'),
             ('Shape_Length', '<f8'), ('Shape_Area', '<f8')])

 

I suppose that you cleverly noticed that the other attributes are replicated during the process of converting the polygon to points.  This can also be accomplished with arcpy.da.FeatureClassToNumPyArray simply by using '*' to use all fields rather than prescribed ones. 
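
For example, the equivalent one-liner would be along these lines:

# all fields, polygon exploded to points, attributes repeated for every point
a = arcpy.da.FeatureClassToNumPyArray(in_fc, field_names="*", explode_to_points=True)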

 

-----------------------------------------------------------------------------------------------------------------------------------------------------------

5.  Got to have both, but not together

Working with an array which has attributes and point geometry AND the attributes are replicated is a tad of a pain, so it is time to split the attributes from the geometry.

 

def _two_arrays(in_fc, both=True, split=True):
    """Send to a numpy structured/array and split it into a geometry array
    :  and an attribute array.  They can be joined back later if needed.
    """
    a = _xyID(in_fc, to_pnts=True)  # just the geometry
    shp_fld, oid_fld, SR, shp_type = fc_info(in_fc)
    dt_a = [('IDs', '<i4'), ('Xs', '<f8'), ('Ys', '<f8')]  # option 1
    dt_b = [('IDs', '<i4'), ('Xc', '<f8'), ('Yc', '<f8')]
    a.dtype = dt_a
    b = None
    if split:
        ids = np.unique(a['IDs'])
        w = np.where(np.diff(a['IDs']))[0] + 1
        a = np.split(a, w)
        a = np.array([[ids[i], a[i][['Xs', 'Ys']]] for i in range(len(ids))])
    if both:
        b = _ndarray(in_fc, to_pnts=False)
        dt_b.extend(b.dtype.descr[2:])
        b.dtype = dt_b
    return a, b



(array([[1, array([(342000.0, 5022000.0), (342000.0, 5023000.0),
          (343000.0, 5023000.0), (343000.0, 5022000.0),
          (342000.0, 5022000.0)],
        dtype=[('Xs', '<f8'), ('Ys', '<f8')])]], dtype=object),
array([ (1, 342500.0, 5022500.0, 342000.0, 5022000.0, 343000.0, 5023000.0,
            4000.0, 1000000.0)],
        dtype=[('IDs', '<i4'), ('Xc', '<f8'), ('Yc', '<f8'), ('EXT_MIN_X', '<f8'),
                  ('EXT_MIN_Y', '<f8'), ('EXT_MAX_X', '<f8'), ('EXT_MAX_Y', '<f8'),
                  ('Shape_Length', '<f8'), ('Shape_Area', '<f8')]))

This incarnation of getting at the data produces an 'object array', which is an array that can contain records of different size and/or composition.  Why?  Simple!  If you are converting polygons to point objects, it is highly unlikely that you will have the same number of points per polygon.  You have two options then... collapse and rearrange the structure of the outputs so that all the coordinates are arranged in their respective columns, or keep the inherent structure of the data.  This is exemplified by the outputs returned by FeatureClassToNumPyArray and the above function.  The first block of the output is the object array, which contains an index number and then an array of point coordinates with a specified dtype.  The second block is the attribute array for that polygon.  As a nice side bonus, the point coordinates have been reduced to the polygon centroid value and not returned as individual points with repetitive attributes as noted previously.  Two for the price of one.
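
Pulling the pieces back out of the pair is straightforward.  A small sketch, using the square featureclass from above:

geom, attr = _two_arrays(in_fc, both=True, split=True)

ids, pnts = geom[0]     # the polygon id and its structured point array
pnts['Xs']              # -> array([ 342000.,  342000.,  343000.,  343000.,  342000.])
attr['Shape_Area']      # -> array([ 1000000.])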

 

-----------------------------------------------------------------------------------------------------------------------------------------------------------

6.  What's this new thing

 

Then there is the new SpatialDataFrame and the tie-in to Pandas rather than numpy.  It offers some advantages in terms of analysis, but in some other areas not so much.  To get the simple polygon out to points you can do the following.

 

from arcgis import SpatialDataFrame as SDF

a = SDF.from_featureclass(in_fc)

p = a.SHAPE.tolist()[0]

p['rings']
[[[342000, 5022000], [342000, 5023000], [343000, 5023000], [343000, 5022000], [342000, 5022000]]]
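
And if you want those ring coordinates back in numpy form, one more step does it (continuing with 'p' from above):

import numpy as np

ring = np.asarray(p['rings'][0], dtype='float64')

ring
array([[  342000.,  5022000.],
       [  342000.,  5023000.],
       [  343000.,  5023000.],
       [  343000.,  5022000.],
       [  342000.,  5022000.]])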

 

There are many options available to get to the root of your geometry, and the ability to work with arcpy, numpy and now Pandas opens a variety of opportunities for improving your analytical arsenal.

 

Next blog will focus on the analysis side and how the options can be used to your advantage.

You have 12 months of data in some raster form... You want some statistical parameter... There are areas of nodata, the extents are all the same... and ... you have a gazillion of these to do.

Sounds like you have a 'cube'... the original 'space-time cube'.  You can pull space from a time slice... you can slice time through space.  At every location on a 'grid' you have Z as a sequence over time. 
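
As a tiny illustration of the idea, with a made-up (months, rows, cols) cube rather than the data used below:

import numpy as np

cube = np.arange(12 * 360 * 720, dtype='float64').reshape(12, 360, 720)

month_0 = cube[0]             # space: the full grid for one time slice
cell_ts = cube[:, 100, 200]   # time: the 12-value sequence at one cell
cell_avg = cube.mean(axis=0)  # collapse time, keep space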

 

Here is the code.   I will use ascii files, but they don't have to be, you just need to prep your files before you use them.

 

Source data originally from .... here .... thanks Xander Bakker.

 

import os
import numpy as np
from textwrap import dedent, indent
import arcpy

arcpy.env.overwriteOutput = True

ft = {'bool': lambda x: repr(x.astype('int32')),
      'float': '{: 0.3f}'.format}
np.set_printoptions(edgeitems=3, linewidth=80, precision=2, suppress=True,
                    threshold=80, formatter=ft)
np.ma.masked_print_option.set_display('-')  # change to a single -

# ---- Processing temporal ascii files ----
# Header information
# ncols 720
# nrows 360
# xllcorner -180
# yllcorner -90
# cellsize 0.5
# NODATA_Value -9999
# ---- Begin the process ----
#
cols = 720
rows = 360
ll_corner =  arcpy.Point(-180., -90.0)  # to position the bottom left corner
dx_dy = 0.5
nodata = '-9999'
#
# ---- create the basic workspace parameters, create masked arrays ----
#
out_file = r'c:\Data\ascii_samples\avg_yr.tif'
folder = r'C:\Data\ascii_samples'
arcpy.env.workspace = folder
ascii_files = arcpy.ListFiles("*.asc")
a_s = [os.path.join(folder, i) for i in ascii_files]  # full paths to the ascii files
arrays = []
for arr in a_s:
    a = np.mafromtxt(arr, dtype='int', comments='#',
                     delimiter=' ', skip_header=6,
                     missing_values=nodata, usemask=True)
    arrays.append(a)
#
# ---- A sample calculation from the inputs.... calculate the mean ----
#
N = len(arrays)                    # number of months... in this case
arr_shp = (N,) + arrays[0].shape   # we want a (month, row, col) array
msk = arrays[0].mask               # clone the mask... assume they are the same
e = np.zeros(arr_shp, 'int')       # one way is create a zeros array and fill
for i in range(len(arrays)):
    e[i] = arrays[i]
a = np.ma.array(e, mask=np.repeat(msk[np.newaxis, :, :], N, axis=0))  # apply the mask to every month
#
# ---- your need here... ie. Calculate a mean ----
#
avg = a.mean(axis=0)  # calculate the average down through the months
#
# ---- send it out to be viewed in ArcMap or ArcGIS Pro ----
#
value_to_nodata = int(avg.get_fill_value())
out = avg.filled(value_to_nodata)   # replace masked cells with the nodata value
g = arcpy.NumPyArrayToRaster(out, ll_corner, dx_dy, dx_dy, value_to_nodata)
g.save(out_file)

So the basic process is simple.  I have coded this verbosely and used input parameters read manually from the ascii header since it is the quickest way.... and you would probably know what they are from the start.
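
If you would rather not read the header by hand, a small helper can pull those six values out.  This is just a sketch, assuming the standard ESRI ASCII header layout shown in the comments above:

def read_ascii_header(fname):
    """Return the 6 ESRI ASCII grid header values as a dict."""
    hdr = {}
    with open(fname) as f:
        for _ in range(6):
            key, val = f.readline().split()
            hdr[key.lower()] = float(val)
    return hdr

# hdr = read_ascii_header(a_s[0])
# cols, rows = int(hdr['ncols']), int(hdr['nrows'])
# ll_corner = arcpy.Point(hdr['xllcorner'], hdr['yllcorner'])
# dx_dy = hdr['cellsize']
# nodata = str(int(hdr['nodata_value']))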

 

So in this example... 12 months of some variable, averaged accounting for the nodata cells.  Do the map stuff, define its projection... project it, give it some symbology and move on.

I will leave that for those that make maps.

Modify to suit... maybe I will get this into a toolbox someday

 

 

NOTE.....

 

Now, in the linked example, there was a need to simply convert those inputs to rasters.  In that case you would simply consolidate the salient portions of the script as follows and create the output rasters within the masked-array creation loop.

......
for arr in a_s:
    a = np.mafromtxt(arr, dtype='int', comments='#',
                     delimiter=' ', skip_header=6,
                     missing_values=nodata, usemask=True)
    value_to_nodata = int(a.get_fill_value())
    out = a.filled(value_to_nodata)   # replace masked cells with the nodata value
    r = arcpy.NumPyArrayToRaster(out, ll_corner, dx_dy, dx_dy, value_to_nodata)
    out_file = arr.replace(".asc", ".tif")
    r.save(out_file)
    del r, a

.....

 

So for either doing statistical calculations for temporal data, or for format conversion... there are options available where arcpy and numpy play nice.