Started this with rolling stats. 

 

Rolling Statistics for grid data... an alternative

 

Once you have a filter... kernel, whatever you want to call it... all you need is a moving window of some form and shape.  I won't go into those details since my previous work covers them.  I will just report a few simple ones, and show you some of the possible kernels that you can create.

 

# ---- my input array ----

Input array.......
-shape (1, 10, 10), ndim 3
.... 1 1 1 1 1 2 2 2 2 2 
.... 1 1 1 1 1 2 2 2 2 2 
.... 1 1 1 1 1 2 2 2 2 2 
.... 1 1 1 1 1 2 2 2 2 2 
.... 1 1 1 1 1 2 2 2 2 2 
.... 3 3 3 3 3 4 4 4 4 4 
.... 3 3 3 3 3 4 4 4 4 4 
.... 3 3 3 3 3 4 4 4 4 4 
.... 3 3 3 3 3 4 4 4 4 4 
.... 3 3 3 3 3 4 4 4 4 4

From the above, you can see that I kept it simple... 4 blocks of values, the perfect study area.
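If you want to reproduce that demo array, here is a quick sketch using np.kron (my construction for illustration, not necessarily how the original was built).

import numpy as np

# four 5x5 blocks of 1, 2, 3 and 4, with a leading dimension added
a = np.kron(np.array([[1, 2], [3, 4]]), np.ones((5, 5), dtype=np.int32))
a = a.reshape(1, 10, 10)   # shape (1, 10, 10) to match the printout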

 

Let's see if we can pick up some lines that divide the raster/array.

Sobel horizontal filter...
-shape (1, 3, 3), ndim 3

  . -1 -2 -1 
  .  0  0  0 
  .  1  2  1 


Sobel horizontal result.......
-shape (1, 10, 10), ndim 3, masked array, fill value 999999

.... -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 
.... -1  0  0  0  0  0  0  0  0 -1 
.... -1  0  0  0  0  0  0  0  0 -1 
.... -1  0  0  0  0  0  0  0  0 -1 
.... -1  8  8  8  8  8  8  8  8 -1 
.... -1  8  8  8  8  8  8  8  8 -1 
.... -1  0  0  0  0  0  0  0  0 -1 
.... -1  0  0  0  0  0  0  0  0 -1 
.... -1  0  0  0  0  0  0  0  0 -1 
.... -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 

See those 8s?  Nailed the horizontal divide.
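The arithmetic is easy to verify by hand... take one 3x3 window straddling the 1/3 divide and multiply it element-wise by the kernel, then sum:

# window straddling the divide     kernel
#   1 1 1                          -1 -2 -1
#   1 1 1                           0  0  0
#   3 3 3                           1  2  1
(-1*1 - 2*1 - 1*1) + (0 + 0 + 0) + (1*3 + 2*3 + 1*3)  # -4 + 0 + 12 = 8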

 

Sobel vertical filter...

  . -1  0  1 
  . -2  0  2 
  . -1  0  1 

Sobel vertical result.......
-shape (1, 10, 10), ndim 3, masked array, fill value 999999
.... -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 
.... -1  0  0  0  4  4  0  0  0 -1 
.... -1  0  0  0  4  4  0  0  0 -1 
.... -1  0  0  0  4  4  0  0  0 -1 
.... -1  0  0  0  4  4  0  0  0 -1 
.... -1  0  0  0  4  4  0  0  0 -1 
.... -1  0  0  0  4  4  0  0  0 -1 
.... -1  0  0  0  4  4  0  0  0 -1 
.... -1  0  0  0  4  4  0  0  0 -1 
.... -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 

 

So easy... vertical nailed as well.

 

One more...

Emboss filter...
-shape (1, 3, 3), ndim 3
  . -1 -1  0 
  . -1  0  1 
  .  0  1  1 

Emboss vertical result.......

-shape (1, 10, 10), ndim 3, masked array, fill value 999999
.... -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 
.... -1  0  0  0  2  2  0  0  0 -1 
.... -1  0  0  0  2  2  0  0  0 -1 
.... -1  0  0  0  2  2  0  0  0 -1 
.... -1  4  4  4  6  6  4  4  4 -1 
.... -1  4  4  4  6  6  4  4  4 -1 
.... -1  0  0  0  2  2  0  0  0 -1 
.... -1  0  0  0  2  2  0  0  0 -1 
.... -1  0  0  0  2  2  0  0  0 -1 
.... -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 

 

How about some filters that I have found?

    n = np.nan
    all_f = [1, 1, 1, 1, 1, 1, 1, 1, 1]
    no_cnt = [1, 1, 1, 1, n, 1, 1, 1, 1]
    cross_f = [1, 0, 1, 0, 0, 0, 1, 0, 1]
    plus_f = [0, 1, 0, 1, 0, 1, 0, 1, 0]
    grad_e = [1, 0, -1, 2, 0, -2, 1, 0, -1]
    grad_n = [-1, -2, -1, 0, 0, 0, 1, 2, 1]
    grad_ne = [0, -1, -2, 1, 0, -1, 2, 1, 0]
    grad_nw = [2, -1, 0, -1, 0, 1, 0, 1, 2]
    grad_s = [1, 2, 1, 0, 0, 0, -1, -2, -1]
    grad_w = [-1, 0, 1, -2, 0, 2, -1, 0, 1]
    lap_33 = [0, -1, 0, -1, 4, -1, 0, -1, 0]
    line_h = [-1, -1, -1, 2, 2, 2, -1, -1, -1]
    line_ld = [2, -1, -1, -1, 2, -1, -1, -1, 2]
    line_rd = [-1, -1, 2, -1, 2, -1, 2, -1, -1]
    line_v = [-1, 0, -1, -1, 2, -1, -1, 2, -1]
    high = [-0.7, -1.0, -0.7, -1.0, 6.8, -1.0, -0.7, -1.0, -0.7]
    sob_hor = [-1, -2, -1, 0, 0, 0, 1, 2, 1]   # sobel y
    sob_vert = [-1, 0, 1, -2, 0, 2, -1, 0, 1]  # sobel x
    emboss = [-1, -1, 0, -1, 0, 1, 0, 1, 1]
    sharp1 = [0., -0.25, 0., -0.25, 2.0, -0.25, 0., -0.25, 0.]
    sharp2 = [-0.25, -0.25, -0.25, -0.25, 3.0, -0.25, -0.25, -0.25, -0.25]
    sharp3 = [-1, -1, -1, -1, 9, -1, -1, -1, -1]
    lowpass = [1, 2, 1, 2, 4, 2, 1, 2, 1]

    # ---- assemble the dictionary ----

    d = {1: all_f, 2: no_cnt, 3: cross_f, 4: plus_f, 5: grad_e,
         6: grad_n, 7: grad_ne, 8: grad_nw, 9: grad_s, 10: grad_w,
         11: lap_33, 12: line_h, 13: line_ld, 14: line_rd, 15: line_v,
         16: high, 17: sob_hor, 18: sob_vert, 19: emboss, 20: sharp1,
         21: sharp2, 23: sharp3, 24: lowpass}

    filter_ = np.array(d[mode]).reshape(3, 3)

You call the filter using the mode number (i.e. d[23] is sharp3).

import numpy as np
from numpy.lib.stride_tricks import as_strided

def stride(a, win=(3, 3), stepby=(1, 1)):
    """Provide a 2D sliding/moving view of an array.
    :  There is no edge correction for outputs. Use _pad first.
    """

    err = """Array shape, window and/or step size error.
    Use win=(3,3) with stepby=(1,1) for 2D array
    or win=(1,3,3) with stepby=(1,1,1) for 3D
    ----    a.ndim != len(win) != len(stepby) ----
    """

    assert (a.ndim == len(win)) and (len(win) == len(stepby)), err
    shape = np.array(a.shape)  # array shape (r, c) or (d, r, c)
    win_shp = np.array(win)    # window      (3, 3) or (1, 3, 3)
    ss = np.array(stepby)      # step by     (1, 1) or (1, 1, 1)
    newshape = tuple(((shape - win_shp) // ss) + 1)
    newshape += tuple(win_shp)
    newstrides = tuple(np.array(a.strides) * ss) + a.strides
    a_s = as_strided(a, shape=newshape, strides=newstrides).squeeze()
    return a_s


def a_filter(a, mode=1, ignore_nodata=True, nodata=None):
    """Apply a 3x3 filter to an array.
    :  Paste the filter definitions and the dictionary 'd' from the
    :  listing above, below this line.
    """
    filter_ = np.array(d[mode]).reshape(3, 3)  # pick the filter by mode
    #
    # ---- stride the input array ----
    a_strided = stride(a)
    if ignore_nodata:  # the nan-aware sum skips nodata (NaN) cells
        c = (np.nansum(a_strided * filter_, axis=(2, 3)))
    else:
        c = (np.sum(a_strided * filter_, axis=(2, 3)))
    pad_ = nodata
    if nodata is None:
        if c.dtype.name in ('int', 'int32', 'int64'):
            pad_ = min([0, -1, a.min()-1])
        else:
            pad_ = min([0.0, -1.0, a.min()-1])
    c = np.lib.pad(c, (1, 1), "constant", constant_values=(pad_, pad_))
    m = np.where(c == pad_, 1, 0)  # mask the padded edge cells
    c = np.ma.array(c, mask=m, fill_value=None)
    return filter_, c

# ----------------------------------------------------------------------
# __main__ .... code section
if __name__ == "__main__":
    """Optionally...
    : - print the script source name.
    : - run the _demo
    """

#    you can put your stuff here... use RasterToNumPyArray to get an array
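A hedged usage sketch... assuming the filter dictionary has been pasted into a_filter, and using the demo array from above (squeezed to 2D, since stride expects the array dimensions to match the window):

filter_, result = a_filter(a.squeeze(), mode=17)  # 17 = sob_hor, Sobel y
print(result)  # the masked Sobel horizontal result shown earlier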

 

So that is pretty well it for the filters.

I should point out that the same process can be used for statistical calculations and terrain derivatives (slope, aspect, hillshade, curvature, etc.), and I have reported on these before.

 

Now I should point out that none of the above requires the Spatial Analyst extension; RasterToNumPyArray and NumPyArrayToRaster are provided with the arcpy.da module.  Lots of possibilities, lots of capability is built-in.
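A minimal round-trip sketch, assuming hypothetical raster paths... the georeferencing arguments (lower left corner, cell sizes) are omitted for brevity:

import arcpy

in_ras = r"C:\Your_path\your_raster.tif"       # hypothetical path
a = arcpy.RasterToNumPyArray(in_ras)           # raster -> ndarray, no Spatial Analyst
filter_, c = a_filter(a, mode=17)              # apply the Sobel horizontal filter
out = arcpy.NumPyArrayToRaster(c.filled(-99))  # masked array -> ndarray -> Raster
out.save(r"C:\Your_path\filtered.tif")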

 

More later.

Rolling stats...  The 3x3 moving/sliding/rolling window.  This kernel type focuses on returning statistics like your FocalStatistics functions... but without needing the Spatial Analyst extension.

 

They are used for smoothing, filtering and focal statistics... and there are other uses.

 

First the results.

a      # ---- A grid/raster/array ----

array([[5, 1, 2, 0, 5, 4, 7, 3, 2, 2],
       [2, 5, 9, 9, 6, 9, 2, 1, 3, 0],
       [7, 2, 7, 4, 4, 4, 3, 7, 7, 7],
       [3, 9, 6, 8, 0, 1, 8, 5, 0, 7],
       [8, 4, 6, 8, 1, 9, 6, 9, 5, 5],
       [3, 9, 9, 5, 3, 1, 5, 5, 2, 3],
       [0, 5, 2, 0, 9, 4, 7, 9, 9, 0],
       [2, 5, 5, 4, 3, 7, 7, 8, 3, 5],
       [5, 2, 4, 5, 3, 6, 5, 2, 7, 2],
       [4, 6, 4, 9, 8, 6, 8, 6, 2, 0]])


a_s = stride(a, r_c=(3,3))  # ---- create an array of rolling windows 3x3 ----

# ---- Now calculate all the statistics --------------------------------------

rolling_stats(a_s, no_null=False, prn=True)
Min...
[[1 0 0 0 2 1 1 0]
[2 2 0 0 0 1 0 0]
[2 2 0 0 0 1 0 0]
[3 4 0 0 0 1 0 0]
[0 0 0 0 1 1 2 0]
[0 0 0 0 1 1 2 0]
[0 0 0 0 3 2 2 0]
[2 2 3 3 3 2 2 0]]
Max...
[[9 9 9 9 9 9 7 7]
[9 9 9 9 9 9 8 7]
[9 9 8 9 9 9 9 9]
[9 9 9 9 9 9 9 9]
[9 9 9 9 9 9 9 9]
[9 9 9 9 9 9 9 9]
[5 5 9 9 9 9 9 9]
[6 9 9 9 8 8 8 8]]
Mean...
[[ 4.44  4.33  5.11  5.    4.89  4.44  3.89  3.56]
[ 5.56  6.56  5.89  5.    4.11  4.44  4.    4.11]
[ 5.78  6.    4.89  4.33  4.    5.78  5.56  5.78]
[ 6.33  7.11  5.11  4.    3.78  5.44  5.    4.56]
[ 5.11  5.33  4.78  4.44  5.    6.11  6.33  5.22]
[ 4.44  4.89  4.44  4.    5.11  5.89  6.11  4.89]
[ 3.33  3.56  3.89  4.56  5.67  6.11  6.33  5.  ]
[ 4.11  4.89  5.    5.67  5.89  6.11  5.33  3.89]]
Med...
[[ 5.  4.  5.  4.  4.  4.  3.  3.]
[ 6.  7.  6.  4.  4.  4.  3.  5.]
[ 6.  6.  6.  4.  4.  6.  6.  7.]
[ 6.  8.  6.  3.  3.  5.  5.  5.]
[ 5.  5.  5.  4.  5.  6.  6.  5.]
[ 5.  5.  4.  4.  5.  7.  7.  5.]
[ 4.  4.  4.  4.  6.  7.  7.  5.]
[ 4.  5.  4.  6.  6.  6.  6.  3.]]
Sum...
[[40 39 46 45 44 40 35 32]
[50 59 53 45 37 40 36 37]
[52 54 44 39 36 52 50 52]
[57 64 46 36 34 49 45 41]
[46 48 43 40 45 55 57 47]
[40 44 40 36 46 53 55 44]
[30 32 35 41 51 55 57 45]
[37 44 45 51 53 55 48 35]]
Std...
[[ 2.67  3.2   2.85  2.62  2.02  2.5   2.28  2.59]
[ 2.59  2.36  2.73  3.09  2.88  2.83  2.71  2.96]
[ 2.2   2.16  2.73  3.16  2.98  2.62  2.59  2.39]
[ 2.4   1.79  3.    3.37  3.15  2.83  2.58  2.5 ]
[ 3.    2.91  3.26  3.34  2.87  2.56  2.26  3.08]
[ 2.91  2.73  2.83  2.62  2.42  2.28  2.38  3.03]
[ 1.76  1.71  2.33  2.45  1.94  2.02  2.36  3.2 ]
[ 1.29  1.79  2.    2.    1.79  1.73  2.31  2.56]]
Var...
[[  7.14  10.22   8.1    6.89   4.1    6.25   5.21   6.69]
[  6.69   5.58   7.43   9.56   8.32   8.02   7.33   8.77]
[  4.84   4.67   7.43  10.     8.89   6.84   6.69   5.73]
[  5.78   3.21   8.99  11.33   9.95   8.02   6.67   6.25]
[  8.99   8.44  10.62  11.14   8.22   6.54   5.11   9.51]
[  8.47   7.43   8.02   6.89   5.88   5.21   5.65   9.21]
[  3.11   2.91   5.43   6.02   3.78   4.1    5.56  10.22]
[  1.65   3.21   4.     4.     3.21   2.99   5.33   6.54]]
Range...  (Max - Min, snipped)

# ---- If you have nodata values, they can be accommodated ------------------
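A hedged sketch of one way, assuming a hypothetical nodata value of -99... cast to float, convert nodata to NaN, then run the nan-aware statistics:

a = a.astype('float64')            # NaN requires a float array
a[a == -99] = np.nan               # -99 is a hypothetical nodata value
a_s = stride(a, r_c=(3, 3))
rolling_stats(a_s, no_null=False)  # nanmin, nanmax, nanmean, ...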

 

Now the code functions

 

import numpy as np
from textwrap import indent
from numpy.lib.stride_tricks import as_strided

def _check(a, r_c, subok=False):
    """Performs the array checks necessary for stride and block.
    : a   - Array or list.
    : r_c - tuple/list/array of rows x cols.
    : subok - from numpy 1.12 added, keep for now
    :Returns:
    :------
    :Attempts will be made to ...
    :  produce a shape at least (1*c).  For a scalar, the
    :  minimum shape will be (1*r) for 1D array or (1*c) for 2D
    :  array if r<c.  Be aware
    """

    if isinstance(r_c, (int, float)):
        r_c = (1, int(r_c))
    r, c = r_c
    if a.ndim == 1:
        a = np.atleast_2d(a)
    r, c = r_c = (min(r, a.shape[0]), min(c, a.shape[1]))
    a = np.array(a, copy=False, subok=subok)
    return a, r, c, tuple(r_c)


def _pad(a, nan_edge=False):
    """Pad a sliding array to allow for stats"""
    if nan_edge:
        a = np.pad(a, pad_width=(1, 2), mode="constant",
                   constant_values=(np.NaN, np.NaN))
    else:
        a = np.pad(a, pad_width=(1, 1), mode="reflect")
    return a


def stride(a, r_c=(3, 3)):
    """Provide a 2D sliding/moving view of an array.
    :  There is no edge correction for outputs.
    :
    :Requires:
    :--------
    : _check(a, r_c) ... Runs the checks on the inputs.
    : a - array or list, usually a 2D array.  Assumes rows is >=1,
    :     it is corrected as is the number of columns.
    : r_c - tuple/list/array of rows x cols.  Attempts  to
    :     produce a shape at least (1*c).  For a scalar, the
    :     minimum shape will be (1*r) for 1D array or 2D
    :     array if r<c.  Be aware
    """

    a, r, c, r_c = _check(a, r_c)
    shape = (a.shape[0] - r + 1, a.shape[1] - c + 1) + r_c
    strides = a.strides * 2
    a_s = (as_strided(a, shape=shape, strides=strides)).squeeze()
    return a_s
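
# ---- a quick check (sketch) of what stride returns ----
# a = np.arange(16).reshape(4, 4)
# stride(a).shape  -> (2, 2, 3, 3) ... a 2x2 grid of 3x3 windows
# stride(a)[0, 0]  -> array([[ 0,  1,  2],
#                            [ 4,  5,  6],
#                            [ 8,  9, 10]])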


# ----------------------------------------------------------------------
# (18) rolling stats .... code section
def rolling_stats(a, no_null=True, prn=True):
    """Statistics on the last two dimensions of an array.
    :Requires
    :--------
    : a - 2D array  **Note, use 'stride' above to obtain rolling stats
    : no_null - boolean, whether to use masked values (nan) or not.
    : prn - boolean, to print the results or return the values.
    :
    :Returns
    :-------
    : The results return an array of 4 dimensions representing the original
    : array size and block size
    : eg.  original = 6x6 array   block = 3x3
    :      breaking the array into 4 chunks
    """

    a = np.asarray(a)
    a = np.atleast_2d(a)
    ax = None
    if a.ndim > 1:
        ax = tuple(np.arange(len(a.shape))[-2:])
    if no_null:
        a_min = a.min(axis=ax)
        a_max = a.max(axis=ax)
        a_mean = a.mean(axis=ax)
        a_med = np.median(a, axis=ax)
        a_sum = a.sum(axis=ax)
        a_std = a.std(axis=ax)
        a_var = a.var(axis=ax)
        a_ptp = a_max - a_min
    else:
        a_min = np.nanmin(a, axis=(ax))
        a_max = np.nanmax(a, axis=(ax))
        a_mean = np.nanmean(a, axis=(ax))
        a_med = np.nanmedian(a, axis=(ax))
        a_sum = np.nansum(a, axis=(ax))
        a_std = np.nanstd(a, axis=(ax))
        a_var = np.nanvar(a, axis=(ax))
        a_ptp = a_max - a_min
    if prn:
        s = ['Min', 'Max', 'Mean', 'Med', 'Sum', 'Std', 'Var', 'Range']
        frmt = "...\n{}\n".join([i for i in s])
        v = [a_min, a_max, a_mean, a_med, a_sum, a_std, a_var, a_ptp]
        args = [indent(str(i), '... ') for i in v]
        print(frmt.format(*args))
    else:
        return a_min, a_max, a_mean, a_med, a_sum, a_std, a_var, a_ptp

The interlude... An example of searchcursors and their array roots

Curses at cursors... pretty well a regular occurrence on this site.  People love them and love to hate them.

They try to nest them, and nest them within 'for' loops and 'with' statements, with calls that sound poetic (... for row in rows, with while for arc thou ...).

 

There are old cursors and new cursors (the da-less and the da cursors).  Cursors appear in other guises, such as the new, and cleverly named, 'arcgis' module (digression # 1... really? something else with arcgis in it!  Who is in charge of branding?).

 

Perhaps cursors are cloaked in other arcpy and data access module methods (i.e. blank-to-NumPyArray and NumPyArray-to-blank).  Who knows for sure, since much is locked in arcgisscripting.pyd.

 

Sadly, we deal in a world of mixed data types.  Our tables contain columns of attributes, organized sequentially by rows.  Sometimes the row order has meaning, sometimes not.  Each column contains one data type in a well-formed data structure.  This is why spreadsheets are purely evil for trying to create and maintain data structure, order and form (you can put anything anywhere).

 

An example of searchcursors and their array roots, in code:
in_tbl = r"C:\Git_Dan\arraytools\Data\numpy_demos.gdb\sample_10k"

sc = arcpy.da.SearchCursor(in_tbl, "*")             # the plain searchcursor
a = arcpy.da.SearchCursor(in_tbl, "*")._as_narray() # using one of its magic methods
b = arcpy.da.TableToNumPyArray(in_tbl, "*")         # using the cloak

np.all(a == b)  # True           ..... everything is equal .......

sc._dtype           # .... here are the fields and dtype in the table ....
  dtype([('OBJECTID', '<i4'), ('f0', '<i4'), ('County', '<U4'),
         ('Town', '<U12'), ('Facility', '<U16'), ('Time', '<i4'),
         ('Test', '<U24')])

sc.fields       # .... just the field names, no need for arcpy.ListFields(...) .....
  ('OBJECTID', 'f0', 'County', 'Town', 'Facility', 'Time', 'Test')


# ---- Some timing... 10,000 records ----
%timeit arcpy.da.SearchCursor(in_tbl, "*")
153 ms ± 12.3 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit arcpy.da.SearchCursor(in_tbl, "*")._as_narray()
40.4 ms ± 970 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit arcpy.da.TableToNumPyArray(in_tbl, "*")
52.1 ms ± 9.32 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

 

If someone can explain why the plain searchcursor is slower than its dressed up (or down?) counterparts, I would love to hear about it.

 

 

Back to the main event

Harkening back to the fields of mathematics, arrays are assemblages of data in 1, 2, 3 or more dimensions.  If an array of any dimension has a uniform data type, then life is easier from a structural and usage perspective (this is one reason why Remote Sensing is easier than GIS... bring on the mail).  We need to maintain an index which ties our geometry to our attributes, so what goes where, and where is what, doesn't get mixed up (digression # 2... I am sure this isn't what the branders meant by The Science of Where, but we can only hope).

 

Nerd Stuff 

Enough with the boring stuff... bring on the code.

Some of this has been looked at from a slightly different perspective in

 

Get to the Points... arcpy, numpy, pandas

We need some data to work with so... how about a square.

    in_fc = r"C:\Your_spaceless_path\Your.gdb\square"

The 'Describe' object

 

The 'describe' object does just that: describes an object, in this case a FeatureClass.
desc = arcpy.da.Describe(in_fc)
In the newer arcpy.da module, the values can be accessed from a dictionary.  A quick way to get the sorted dictionary keys is to use a list comprehension.  If you want the values, then you can obtain them in a similar fashion.
sk = sorted([k for k in desc.keys()])  # sorted keys
kv = [(k, desc[k]) for k in sk]  # key/value pairs
kv = "\n".join(["{!s:<20} {}".format(k, desc[k]) for k in sk])
Some useful keys associated with featureclasses are extracted as follows, with appropriate snips in the full list:

[..., 'MExtent', 'OIDFieldName', 'ZExtent', 'aliasName', 'areaFieldName',
  'baseName', ... 'catalogPath', ... 'dataElementType', 'dataType',
  'datasetType', ... 'extent', 'featureType', 'fields', 'file', ... 'hasM',
  'hasOID', 'hasSpatialIndex', 'hasZ', 'indexes', ... 'lengthFieldName',
  ... 'name', 'path', ... 'shapeFieldName', 'shapeType', 'spatialReference',
  ...]
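A hedged sketch of pulling a few of the values, using keys from the list above:

SR = desc['spatialReference']   # used for the cursor in the next section
print(desc['shapeFieldName'])   # e.g. 'Shape'
print(desc['shapeType'])        # e.g. 'Polygon'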

The Cursor object
A cursor gives access to the geometry and attributes of a featureclass.  It is always recommended to use the spatial reference when creating the cursor.  Historically (fixed?), if it was omitted, geometry discrepancies would arise. This information can easily be obtained from the describe object from the previous section.
SR = desc['spatialReference']  # Get the search cursor object.
flds = "*"
args = [in_fc, flds, None, SR, True, (None, None)]
cur = arcpy.da.SearchCursor(*args)
See what it reveals...
dir(cur)

['__class__', '__delattr__', '__dir__', '__doc__', '__enter__', '__eq__',
'__esri_toolinfo__', '__exit__', '__format__', '__ge__', '__getattribute__',
'__getitem__', '__gt__', '__hash__', '__init__', '__iter__', '__le__',
'__lt__', '__ne__', '__new__', '__next__', '__reduce__', '__reduce_ex__',
'__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__',
'_as_narray', '_dtype', 'fields', 'next', 'reset']
 
Individual properties are:
cur.__class__
<class 'da.SearchCursor'>

cur.__class__.__mro__
(<class 'da.SearchCursor'>, <class 'object'>)
The search cursor inherits from 'object', and the extra properties and methods offered by da.SearchCursor can be determined as follows:
mro = cur.__class__.__mro__
s0 = set(dir(mro[0]))
s1 = set(dir(mro[1]))

sorted(list(set.difference(s0, s1)))

 ['__enter__', '__esri_toolinfo__', '__exit__', '__getitem__', '__iter__',
  '__next__', '_as_narray', '_dtype', 'fields', 'next', 'reset']
 
The cursor offers the means to process its objects.
cur.__esri_toolinfo__
 ['FeatureLayer|Table|TableView|Dataset|FeatureDataset::::', 'String::*::',
  'Python::None::', 'CoordinateSystem::::']
 
This handy one returns a numpy structured/recarray
type(cur._as_narray())
 <class 'numpy.ndarray'>
How about information on the attributes of in_fc (more about this later)?
cur._as_narray().__array_interface__

 {'version': 3,
  'strides': None,
  'shape': (0,),
  'typestr': '|V36',
  'descr': [('OBJECTID', '<i4'),
            ('Shape', '<f8', (2,)),
            ('Shape_Length', '<f8'),
            ('Shape_Area', '<f8')],
  'data': (2044504703824, False)}
 
We got a glimpse of the field names and data types from the __array_interface__, but this information can be accessed directly as well.
cur.fields
 ('OBJECTID', 'Shape', 'Shape_Length', 'Shape_Area')

cur._dtype
dtype([('OBJECTID', '<i4'),
       ('Shape', '<f8', (2,)),
       ('Shape_Length', '<f8'),
       ('Shape_Area', '<f8')])    
Now, the gotchas.  We created our search cursor at the beginning and each record was cycled through until it reached the end.  If we attempt to get its properties, we may be in for a surprise, so we need to 'reset' the cursor back to the start.
cur._as_narray()  # try to get its properties, all we get is the dtype
array([],
    dtype=[('OBJECTID', '<i4'), ('Shape', '<f8', (2,)),
     ('Shape_Length', '<f8'), ('Shape_Area', '<f8')])
Once the cursor is reset, the array values for the square are revealed with the appropriate data type.
cur.reset()
cur._as_narray()  # reset to the beginning  

array([(1, [342000.0, 5022000.0], 4000.0, 1000000.0),
       (1, [342000.0, 5023000.0], 4000.0, 1000000.0),
       (1, [343000.0, 5023000.0], 4000.0, 1000000.0),
       (1, [343000.0, 5022000.0], 4000.0, 1000000.0),
       (1, [342000.0, 5022000.0], 4000.0, 1000000.0)],
    dtype=[('OBJECTID', '<i4'), ('Shape', '<f8', (2,)),
           ('Shape_Length', '<f8'), ('Shape_Area', '<f8')])
    
There is no automatic reset, so be careful.  You can print the objects in the array in a couple of ways.
cur.reset()
for row in cur:
   print(("{} "*len(row)).format(*row))  # print individual elements

  1 (342000.0, 5022000.0) 4000.0 1000000.0
  1 (342000.0, 5023000.0) 4000.0 1000000.0
  1 (343000.0, 5023000.0) 4000.0 1000000.0
  1 (343000.0, 5022000.0) 4000.0 1000000.0
  1 (342000.0, 5022000.0) 4000.0 1000000.0
Reset the cursor, and print again.
cur.reset()
for row in cur:
    print(row)  # print the whole row as a tuple

   (1, (342000.0, 5022000.0), 4000.0, 1000000.0)
   (1, (342000.0, 5023000.0), 4000.0, 1000000.0)
   (1, (343000.0, 5023000.0), 4000.0, 1000000.0)
   (1, (343000.0, 5022000.0), 4000.0, 1000000.0)
   (1, (342000.0, 5022000.0), 4000.0, 1000000.0)
Of course since generator-like objects can be converted to a list, that can be done as an alternative, particularly if you have the memory and wish to deal with list objects instead.
cur.reset()
list(cur)
 [(1, (342000.0, 5022000.0), 4000.0, 1000000.0),
  (1, (342000.0, 5023000.0), 4000.0, 1000000.0),
  (1, (343000.0, 5023000.0), 4000.0, 1000000.0),
  (1, (343000.0, 5022000.0), 4000.0, 1000000.0),
  (1, (342000.0, 5022000.0), 4000.0, 1000000.0)]
 
So if you know the data type of the components of the cursor, you can go to the ndarray in an indirect fashion.
cur.reset()
dt = cur._dtype
c_lst = list(cur)

a = np.asarray(c_lst, dtype=dt)
a

array([(1, [342000.0, 5022000.0], 4000.0, 1000000.0),
       (1, [342000.0, 5023000.0], 4000.0, 1000000.0),
       (1, [343000.0, 5023000.0], 4000.0, 1000000.0),
       (1, [343000.0, 5022000.0], 4000.0, 1000000.0),
       (1, [342000.0, 5022000.0], 4000.0, 1000000.0)],
    dtype=[('OBJECTID', '<i4'), ('Shape', '<f8', (2,)),
           ('Shape_Length', '<f8'), ('Shape_Area', '<f8')])
The ndarray can be viewed as a record array.  Since the data type and structure remain the same, we can simply take a 'view' of the array as a record array (recarray).  Record arrays allow the user to slice the array using conventional array slicing or by object dot notation.
a = a.view(np.recarray)

a.Shape == a['Shape']  # check to see if slicing equals dot notation

array([[ True,  True],
       [ True,  True],
       [ True,  True],
       [ True,  True],
       [ True,  True]], dtype=bool)
Or more simply...
np.all(a.Shape == a['Shape'])
True       

a.Shape  # or a['Shape']

array([[  342000.,  5022000.],
       [  342000.,  5023000.],
       [  343000.,  5023000.],
       [  343000.,  5022000.],
       [  342000.,  5022000.]])
 
You can calculate the properties of the objects simply, but in the case of polygons, the duplicate start/end point should be reduced to a singleton.  In the examples, the object's shape is obtained, then the desired property is derived on a column basis.
pnts = a.Shape[:-1]       # get the unique points

cent = pnts.mean(axis=0)  # return the mean by column

cent
array([  342500.,  5022500.])
 
With some fancy work, and calling one of my previously defined array functions in the 'arraytools' module, you can do things like determine interpoint distances.
import arraytools as art

art.e_dist(cent, pnts)

array([ 707.11,  707.11,  707.11,  707.11])
Which is correct given the square polygon's shape.
Another example, to demonstrate array functions.  In the case of polygons, it is useful to have the first and last point (i.e. duplicates) retained to ensure closure of the polygon.
poly = a.Shape

art.e_leng(poly)  # method to return polygon perimeter/length, total, then by segment

 (4000.0, [array([[ 1000.,  1000.,  1000.,  1000.]])])

art.e_area(poly)

 1000000.0
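If you don't have the 'arraytools' module, minimal NumPy stand-ins are easy to sketch... e_dist below is plain Euclidean distance and e_area the shoelace formula (my hypothetical equivalents for illustration, not the arraytools implementations):

import numpy as np

def e_dist(orig, dest):
    """Euclidean distance from an origin to an array of points."""
    diff = np.asarray(dest) - np.asarray(orig)
    return np.sqrt((diff**2).sum(axis=-1))

def e_area(poly):
    """Polygon area by the shoelace formula; first/last points duplicated."""
    x, y = poly[:, 0], poly[:, 1]
    return 0.5 * abs(np.dot(x[:-1], y[1:]) - np.dot(x[1:], y[:-1]))

e_dist(cent, pnts)  # array([ 707.11,  707.11,  707.11,  707.11])
e_area(poly)        # 1000000.0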
 
------------------------------------------------------------------------------ 
Working with cursors
--------------------
Cursors can access the columns in the tabular data in a variety of ways.  One of the easiest to follow is to simply refer to the columns by the order in which they appear when they were retrieved.  This is fine if one writes scripts in the same way.  In the example that follows, the list of fields to be used with the cursor operations is defined as:
in_flds = ['OID@', 'SHAPE@X', 'SHAPE@Y', 'Int_fld', 'Float_fld', 'String_fld']

 

When using the above notation, the position of the fields is used to reference their values.  So you may see code that uses 'for row in cursor' with row[0] being the feature object id (OID@) and row[3] being the value from an integer field (Int_fld).  If you are like me, anything beyond 2 means you are finger counting, remembering that Python counting is zero-based.  I now prefer to spend the extra time assigning variable names rather than using positional notation.  You can see this in the tuple unpacking of the cursor loop below.

in_fc = r'C:\Folder\path_to\A_Geodatabase.gdb\FeatureClass'  # or Table

desc = arcpy.Describe(in_fc)
SR = desc.spatialReference
in_flds = ['OID@', 'SHAPE@X', 'SHAPE@Y', 'Int_fld', 'Float_fld', 'String_fld']
where_clause = None
spatial_reference = SR
explode_to_points = True
sql_clause = (None, None)

results = []
with arcpy.da.SearchCursor(in_fc, in_flds, where_clause,
                           spatial_reference, explode_to_points,
                           sql_clause) as cursor:
    for oid, x, y, i_val, f_val, s_val in cursor:
        if oid > 10:
            pass                       # ... do stuff ...
        else:
            pass                       # ... do other stuff ...
        results.append((oid, x, y))    # ... put the stuff here ...
# results now holds whatever was collected

-----------------------------------------------------------------------------
arcgisscripting
---------------
arcgisscripting can be located in your ArcGIS Pro distribution once everything is imported (check arcgisscripting.__file__).  It is located in the installation path (substitute C:\\ArcPro for your Pro path, everything else is the same):
'C:\\ArcPro\\bin\\Python\\envs\\arcgispro-py3\\lib\\site-packages\\arcgisscripting.pyd'
Importing arcpy also imports parts of arcgisscripting, and it imports the geoprocessor from
C:\ArcPro\Resources\ArcPy\arcpy\geoprocessing\__init__.py
which imports _base.py, which uses the Geoprocessor class as 'gp'.

dir(arcgisscripting)
['ExecuteAbort', 'ExecuteError', 'ExecuteWarning', 'NumPyArrayToRaster', 'Raster', 'RasterToNumPyArray', '__cleanup__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '_addTimeInterval', '_analyzeForSD', '_chart', '_convertWebMapToMapDocument', '_createGISServerConnectionFile', '_createGeocodeSDDraft', '_createMapSDDraft', '_createimageservicesddraft', '_getImageEXIFProperties', '_getUTMFromLocation', '_hasLocalFunctionRasterImplementation', '_listDateTimeStringFormats', '_listStyleItems', '_listTimeZones', '_mapping', '_ss', '_wrapLocalFunctionRaster', '_wrapToolRaster', 'arcgis', 'create', 'da', 'getmytoolboxespath', 'getsystemtoolboxespath', 'getsystemtoolboxespaths', 'na']

dir(arcgisscripting.da)
['Describe', 'Domain', 'Editor', 'ExtendTable', 'FeatureClassToNumPyArray', 'InsertCursor', 'ListDomains', 'ListFieldConflictFilters', 'ListReplicas', 'ListSubtypes', 'ListVersions', 'NumPyArrayToFeatureClass', 'NumPyArrayToTable', 'Replica', 'SearchCursor', 'TableToNumPyArray', 'UpdateCursor', 'Version', 'Walk', '__doc__', '__loader__', '__name__', '__package__', '__spec__', '_internal_eq', '_internal_sd', '_internal_vb']
References
----------

 

Other discussions

----------

https://community.esri.com/docs/DOC-10416-are-searchcursors-brutally-slow-they-need-not-be

 

----------

More later...

Look

 

Code as verbose as possible, based on a generic function that I use.

Rules

  1. Do not have Pro or Map open
  2. Run your code
  3. Open the project when you are done.
  4. Review rules 1-3
  5. You can sort using multiple fields or a single field.  You can reverse sort by just slicing and reversing at the same time... arr_sort = arr_sort[::-1] is way quicker than trying to get all the options on a dialog.  This can be done on a column or multiple column basis (see the sketch after this list).
  6. Still too confusing? .... attach the code to a new script in a toolbox... it doesn't need any parameters, but you have to substitute the paths.  ExtendTable requires a close tie to Pro, even if your external python IDE is set as the default editing IDE when you set up Pro.
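A minimal sketch of the sort, using the County and Town fields from the sample table earlier (substitute your own):

import numpy as np

arr = arcpy.da.TableToNumPyArray(in_tbl, "*")    # table -> structured array
idx = np.argsort(arr, order=('County', 'Town'))  # multi-field sort indices
arr_sort = arr[idx]                              # ascending sort
arr_sort = arr_sort[::-1]                        # ... reversed in one slice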

ExtendTable will do a permanent join if the script is run from a tool within ArcToolbox, making it a nice alternative to the standard join procedure.  
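A hedged sketch of such a join, assuming the sorted array from above and hypothetical field names 'IDs' and 'Sort_rank':

import numpy as np

# an array holding the key and the new column to join on
dt = [('IDs', '<i4'), ('Sort_rank', '<i4')]
out = np.zeros((arr_sort.shape[0],), dtype=dt)
out['IDs'] = arr_sort['OBJECTID']                # key values from the sorted array
out['Sort_rank'] = np.arange(arr_sort.shape[0])  # the rank to append to the table
arcpy.da.ExtendTable(in_tbl, 'OBJECTID', out, 'IDs')  # permanent join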

 

Numpy has functionality within the recfunctions module located at https://github.com/numpy/numpy/blob/master/numpy/lib/recfunctions.py 

These would include

__all__ = [
'append_fields', 'drop_fields', 'find_duplicates',
'get_fieldstructure', 'join_by', 'merge_arrays',
'rec_append_fields', 'rec_drop_fields', 'rec_join',
'recursive_fill_fields', 'rename_fields', 'stack_arrays',
]
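For example, a quick sketch of append_fields, adding a hypothetical 'rank' column to a small structured array entirely outside of Arc*:

import numpy as np
from numpy.lib import recfunctions as rfn

a = np.array([(1, 'a'), (2, 'b')], dtype=[('ID', '<i4'), ('txt', '<U4')])
b = rfn.append_fields(a, 'rank', np.array([10, 20]), usemask=False)
# b.dtype now includes ('rank', '<i8') on a 64-bit build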

So you have options to do your work within or without Arc*, depending on what is easiest in the workflow.

 

Less than 1000 words too.

 
