Start with two simple shapes that have common points, but this time, we are just interested in the points and not the segments.
Shape 'a' is the black outline with the red point identifiers, and shape 'b' is the blue outline with the green labels.
Their coordinate values are:
import numpy as np

a = np.array([[0., 0.], [1., 4.], [4., 3.], [5., 0.], [0., 0.]])
b = np.array([[1., 0.], [1., 4.], [5., 0.], [1., 0.]])
Being the quick people we are, it is obvious that they share points [1., 4.] and [5., 0.], which are the index pairs
[1, 1] and [3, 2] (a's index first) using zero-based enumeration.
So how would you find out which points are duplicates, and where they are, using Python and NumPy? Let's break it down.
Python approach
ids_ = []
out = []
for i, a_ in enumerate(b):
    sub = []
    for j, b_ in enumerate(a):
        chk = (a_ == b_)
        sub.append(chk)
        if chk.all():
            ids_.append((i, j))
    out.append(sub)
out = np.array(out)
ids_ = np.array(ids_)
Nothing like 'enumerate', since you can use the loop to get an index as well as a value (note the naming is a little confusing here: a_ iterates over b, and b_ over a). Line 6 compares a point from 'b' to a point from 'a' elementwise and saves the result to a subarray. If both coordinates are equal, a separate list is appended with the indices from the respective arrays. We are left with two output arrays... 'out' and 'ids_'
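As a quick sanity check, the loop above can be condensed and run end to end. Since the outer loop runs over b, the pairs come out with b's index first:

```python
import numpy as np

a = np.array([[0., 0.], [1., 4.], [4., 3.], [5., 0.], [0., 0.]])
b = np.array([[1., 0.], [1., 4.], [5., 0.], [1., 0.]])

ids_ = []
for i, a_ in enumerate(b):        # outer loop over b
    for j, b_ in enumerate(a):    # inner loop over a
        if (a_ == b_).all():      # both coordinates match
            ids_.append((i, j))

print(ids_)  # [(1, 1), (2, 3)]  -- b[1] == a[1] and b[2] == a[3]
```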
Now, on to NumPy, with a little side note. Normally, if you have two arrays of the same shape (i.e., the same number of rows and columns), you can just use a direct comparison.
Arrays of equal shape
# -- take the first 4 points of array a_ and the 4 points of b and compare
chk0 = np.equal(a[:-1], b)
# which is the same as
chk1 = (a[:-1] == b)
# -- both yield (booleans shown here as 0/1)
array([[0, 1],
[1, 1],
[0, 0],
[0, 1]])
#
# now, 'where' both coordinates are equal is simply
np.where((a[:-1] == b).all(-1))
# or, equivalently
np.nonzero((a[:-1] == b).all(-1))
# both yield just one match
(array([1]),)
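Putting the equal-shape comparison together as a runnable sketch, the matched row index can be used to pull the shared point back out:

```python
import numpy as np

a = np.array([[0., 0.], [1., 4.], [4., 3.], [5., 0.], [0., 0.]])
b = np.array([[1., 0.], [1., 4.], [5., 0.], [1., 0.]])

# rows where both coordinates match, comparing row-for-row
rows = np.nonzero((a[:-1] == b).all(-1))[0]
print(rows)     # [1]
print(b[rows])  # [[1. 4.]] -- the shared point at that row
```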
Arrays of unequal shape
# -- since a has 5 points and b has 4, we can't do a direct comparison;
# b has to have another dimension added so the 2D points of a can be
# broadcast against it
compare_ = (a == b[:, None])  # -- b[:, None] adds a new axis for broadcasting
all_chk_ = compare_.all(-1)
whr = np.nonzero(all_chk_)
Both the Python and NumPy approaches yield the same results.
# compare
array([[[0, 1],
[1, 0],
[0, 0],
[0, 1],
[0, 1]],
[[0, 0],
[1, 1],
[0, 0],
[0, 0],
[0, 0]],
[[0, 1],
[0, 0],
[0, 0],
[1, 1],
[0, 1]],
[[0, 1],
[1, 0],
[0, 0],
[0, 1],
[0, 1]]])
# all_chk_
array([[0, 0, 0, 0, 0],
[0, 1, 0, 0, 0],
[0, 0, 0, 1, 0],
[0, 0, 0, 0, 0]])
# whr -- where the values are located in the inputs
(array([1, 2]), array([1, 3]))
#
# and the values obtained by slicing from the original array
b[whr[0]]
array([[ 1.00, 4.00],
[ 5.00, 0.00]])
a[whr[1]]
array([[ 1.00, 4.00],
[ 5.00, 0.00]])
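The whole unequal-shape workflow condenses to a few lines; here is a self-contained version:

```python
import numpy as np

a = np.array([[0., 0.], [1., 4.], [4., 3.], [5., 0.], [0., 0.]])
b = np.array([[1., 0.], [1., 4.], [5., 0.], [1., 0.]])

whr = np.nonzero((a == b[:, None]).all(-1))  # broadcast b against a
b_idx, a_idx = whr                           # first array indexes b, second a
print(b_idx, a_idx)  # [1 2] [1 3]
print(a[a_idx])      # [[1. 4.] [5. 0.]] -- the duplicate points
```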
So the approaches yield the same results, but NumPy simplifies the code notation, and for large arrays it can be substantially faster. Run your own tests.
Create 1000 random points
a1 = np.random.random(size=(1000, 2)) * 10
b1 = np.random.random(size=(1000, 2)) * 10
Timing both with %%timeit, the NumPy approach for unequal sizes is over 100x faster than the Python approach. This factor varies with array size, but it is useful when speed and storage requirements are at a premium.
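Outside a notebook, the same comparison can be made with the standard library's timeit instead of the %%timeit magic; the function names here are just for illustration, and the exact speedup will vary with your machine and array sizes:

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
a1 = rng.random((100, 2)) * 10
b1 = rng.random((100, 2)) * 10

def py_way(a, b):
    # nested-loop version, as in the Python approach
    return [(i, j) for i, a_ in enumerate(b)
                   for j, b_ in enumerate(a) if (a_ == b_).all()]

def np_way(a, b):
    # broadcast version, as in the NumPy approach
    return np.nonzero((a == b[:, None]).all(-1))

t_py = timeit.timeit(lambda: py_way(a1, b1), number=10)
t_np = timeit.timeit(lambda: np_way(a1, b1), number=10)
print(f"python: {t_py:.4f} s   numpy: {t_np:.4f} s")
```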
Add the above to your toolset.
ADDENDUM
When things aren't quite that perfect, there is always a workaround.... see
Really close points - Esri Community
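For floating-point coordinates that are nearly but not exactly equal, exact comparison will miss matches; a common workaround (the subject of the linked post) is np.isclose, sketched here with its default tolerances and made-up sample points:

```python
import numpy as np

a = np.array([[1., 4.], [5., 0.], [4., 3.]])
b = np.array([[1.0000001, 4.], [2., 2.]])

# exact equality misses the near-duplicate ...
exact = np.nonzero((a == b[:, None]).all(-1))
# ... but a tolerance-based check finds it
close = np.nonzero(np.isclose(a, b[:, None]).all(-1))
print(exact[0].size)       # 0 -- no exact matches
print(close[0], close[1])  # [0] [0] -- b[0] is 'really close' to a[0]
```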