# Nested means classification

573
2
05-01-2019 09:40 AM by
Regular Contributor

Hello,

Does anyone know of a way of establishing nested means classification using geoprocessing? I wouldn't care if it were implemented to work against points, polygons or a raster - any of the three would be fine. I can work with ArcGIS Desktop, ArcGIS Pro or QGIS. Of course the classification groups would be written to a field.

Thank you for any help.

Tags (2)
2 Replies by MVP Esteemed Contributor

An interesting question finally

Load up your favorite IDE (like Spyder )

Lines 24, 25 and 26....

• the field containing the data your want to determine the nested means on
• The output field, which will be magically created... for the results.  DO NOT create it ahead of time
• shut down ArcGIS Pro, on the off chance that you have 'touched' something to create file locks

``````import numpy as np
import arcpy

def mean_split(a, minSize=3):
"""split at means"""
def slice_array(a):
m = np.mean(a)
yes = a <= m  # check for a less than the overal mean
a_left, a_rght = a[~yes], a[yes]  # slice the arrays
return m, a_left, a_rght
# ----
m, L, R = slice_array(a)
m0, L, _ = slice_array(L)
m1, _, R = slice_array(R)
means = [m0, m, m1]
while (len(L) > minSize) and (len(R) > minSize):
m0, L, _ = slice_array(L)
m1, _, R = slice_array(R)
means.extend([m0, m1])
return sorted(means)

# ----- Inputs here -----------------------------------------------

tbl = r"C:\Arc_projects\Table_tools\Table_tools.gdb\pnts_2K_normal"
in_fld = "Norm"
out_fld = "New_class"

# ---- Do some work -----------------------------------------------

# (1)  Get the field from the table and make it a simple array
arr = arcpy.da.TableToNumPyArray(tbl, ['OID@', in_fld], "", True, None)
a = arr[in_fld]

# (2)  Set up for the results
out_arr = np.copy(arr)
out_arr.dtype.names = ['OID@', out_fld]

# (3)  Run the mean_split script... note minSize!!! set it smartly or...
means = mean_split(a, minSize=100)
classed = np.digitize(a, bins=means)

# (4)  Send the results to the output array and add it back to arcgis pro
out_arr[out_fld] = classed
arcpy.da.ExtendTable(tbl, 'OID@', out_arr, 'OID@')
‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍``````

The work

• Lines 31, 32,  arcpy/numpy play nice and suck in the table using only the objectid and your desired field (see help for more info)
• little 'a' is just the full array without the object id
• Lines 35, 36, set up for the outputs
• Lines 39, 40 ... run mean_split .... note!!!! minSize is the approximate size of each class.  I used 100 because the field had 2000 records.

NOTE

no error checking provided...

Finally

Through arcpy/numpy magic, 'JOIN' the table back to the original

Open Pro

Results example

``````np.mean(a) # ==> 10.56
np.std(a)  # ==>  1.04
means = np.array(means)

means
array([ 8.583,  9.117,  9.736, 10.557, 11.383, 11.978, 12.439])

a.min(), a.max()  # ==>  (6.97, 13.97)

np.histogram(a, bins=means)

(array([100, 256, 587, 577, 249,  92], dtype=int64), array([ 8.583,  9.117,  9.736, 10.557, 11.383, 11.978, 12.439]))‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍``````

Samples drawn from a normal distribution.

Variants of my suggestion can obviously modified to suit by MVP Esteemed Contributor

Added this to  Table tools for ArcGIS Pro

for those that prefer a tool interface 