KD trees
Table of contents: (use browser back to return here)
Distance stuff, lots of tree types. Your distribution of Python comes with scipy which has a couple of implementations.
This is just a quick demo of its potential use. And an homage to Canadian university students, for which KD will always have a special meaning as a primary food group. Kraft Dinner for the non-student
Start with some points and you then want to calculate the closest 2 points to form point origin-destination pairs... because it can be done.
Steps.
So this function just requires a point array of x,y pairs, the number of closest points (N), whether you want to do an x,y sort first and finally, whether you want an output table suitable for use in ArcGIS Pro.
From there, you simply use arcpy.NumPyArrayToTable to produce a gdb featureclass table.
You can them use... XY to Line … to produce line segments, connecting the various origins and destinations as you see fit, or just bask in the creation of an... XY event layer.
Note: lines 32 and 41 can use... cKDTree ...in place of... KDTree ..., if you just need speed for Euclidean calculations.
def nn_kdtree(a, N=3, sorted=True, to_tbl=True):
"""Produce the N closest neighbours array with their distances using
scipy.spatial.KDTree as an alternative to einsum.
Parameters:
-----------
a : array
Assumed to be an array of point objects for which `nearest` is needed.
N : integer
Number of neighbors to return. Note: the point counts as 1, so N=3
returns the closest 2 points, plus itself.
sorted : boolean
A nice option to facilitate things. See `xy_sort`. Its mini-version
is included in this function.
References:
-----------
`<https://stackoverflow.com/questions/52366421/how-to-do-n-d-distance-
and-nearest-neighbor-calculations-on-numpy-arrays/52366706#52366706>`_.
`<https://stackoverflow.com/questions/6931209/difference-between-scipy-
spatial-kdtree-and-scipy-spatial-ckdtree/6931317#6931317>`_.
"""
def _xy_sort_(a):
"""mini xy_sort"""
a_view = a.view(a.dtype.descr * a.shape[1])
idx =np.argsort(a_view, axis=0, order=(a_view.dtype.names)).ravel()
a = np.ascontiguousarray(a[idx])
return a, idx
#
def xy_dist_headers(N):
"""Construct headers for the optional table output"""
vals = np.repeat(np.arange(N), 2)
names = ['X_{}', 'Y_{}']*N + ['d_{}']*(N-1)
vals = (np.repeat(np.arange(N), 2)).tolist() + [i for i in range(1, N)]
n = [names[i].format(vals[i]) for i in range(len(vals))]
f = ['<f8']*N*2 + ['<f8']*(N-1)
return list(zip(n,f))
#
from scipy.spatial import cKDTree
#
idx_orig = []
if sorted:
a, idx_orig = _xy_sort_(a)
# ---- query the tree for the N nearest neighbors and their distance
t = cKDTree(a)
dists, indices = t.query(a, N)
if to_tbl:
dt = xy_dist_headers(N) # --- Format a structured array header
xys = a[indices]
new_shp = (xys.shape[0], np.prod(xys.shape[1:]))
xys = xys.reshape(new_shp)
ds = dists[:, 1:] #[d[1:] for d in dists]
arr = np.concatenate((xys, ds), axis=1)
arr = arr.view(dtype=dt).squeeze()
return arr
else:
return np.array(indices), idx_orig
Just a slightly better formatting that you can get with one of my numpy functions... obviating the need for Pandas for table niceness.
id X_0 Y_0 X_1 Y_1 X_2 Y_2 d_1 d_2
--------------------------------------------------------------
000 3.00 98.00 10.00 94.00 23.00 94.00 8.06 20.40
001 10.00 94.00 3.00 98.00 23.00 94.00 8.06 13.00
002 13.00 18.00 19.00 22.00 34.00 16.00 7.21 21.10
003 19.00 22.00 13.00 18.00 34.00 16.00 7.21 16.16
004 23.00 94.00 10.00 94.00 3.00 98.00 13.00 20.40
005 34.00 16.00 19.00 22.00 43.00 1.00 16.16 17.49
006 37.00 64.00 43.00 89.00 56.00 84.00 25.71 27.59
007 43.00 1.00 34.00 16.00 66.00 6.00 17.49 23.54
008 43.00 89.00 56.00 84.00 61.00 87.00 13.93 18.11
009 56.00 84.00 61.00 87.00 43.00 89.00 5.83 13.93
010 61.00 87.00 56.00 84.00 43.00 89.00 5.83 18.11
011 66.00 6.00 76.00 20.00 43.00 1.00 17.20 23.54
012 67.00 41.00 78.00 50.00 76.00 20.00 14.21 22.85
013 76.00 20.00 66.00 6.00 67.00 41.00 17.20 22.85
014 78.00 50.00 67.00 41.00 80.00 67.00 14.21 17.12
015 80.00 67.00 91.00 66.00 78.00 50.00 11.05 17.12
016 82.00 91.00 94.00 95.00 61.00 87.00 12.65 21.38
017 91.00 66.00 80.00 67.00 78.00 50.00 11.05 20.62
018 94.00 95.00 82.00 91.00 91.00 66.00 12.65 29.15
019 96.00 40.00 78.00 50.00 91.00 66.00 20.59 26.48
Behind the scenes, there should be some 'spatial cleanup'. Specifically, if you look at the image you have point pairs connected by a single segment, that is because they are the closest to one another. Rather than duplicating the segment with opposing directions, you can 'prune' the indices and remove those prior to producing the geometry.
There are lots of tools that you can produce/show geometric relationships. Use them to provide answers to your questions. This implementation will appear soon on the code sharing site. I will provide a link soon.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.