
Common points

Thursday
DanPatterson
MVP Esteemed Contributor

Start with two simple shapes that have common points, but this time, we are just interested in the points and not the segments.

[Figure: comparison_1.png] Shape 'a' is the black outline with the red point identifiers, and shape 'b' has the blue outline and the green labels.

Their coordinate values are:

import numpy as np

a = np.array([[0., 0.], [1., 4.], [4., 3.], [5., 0.], [0., 0.]])
b = np.array([[1., 0.], [1., 4.], [5., 0.], [1., 0.]])

Being the quick people we are, it is obvious that they share the points [1., 4.] and [5., 0.], which sit at the index pairs [1, 1] and [3, 2] (index in 'a', index in 'b') using zero-based enumeration.

So how would you find out which points are duplicates, and where they are, using Python and NumPy? Let's break it down.

Python approach

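ids_ = []
out = []
for i, b_pt in enumerate(b):          # each point in 'b'
    sub = []
    for j, a_pt in enumerate(a):      # each point in 'a'
        chk = (b_pt == a_pt)          # element-wise x, y comparison
        sub.append(chk)
        if chk.all():                 # both coordinates match
            ids_.append((i, j))       # (index in 'b', index in 'a')
    out.append(sub)
out = np.array(out)      # boolean array, shape (4, 5, 2)
ids_ = np.array(ids_)    # index pairs of the matching points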

Nothing like 'enumerate', since you can use each pass of the loop to get an index as well as a value. Line 6 compares a point from 'b' with a point from 'a' and saves the element-wise result to a sublist. If both coordinates are equal, then a separate list is appended with the indices from the respective arrays. We are left with two output arrays: 'out' and 'ids_'.

Now, on to NumPy, with a little side note. Normally, if you have two arrays of the same shape (i.e., the same number of rows and columns), you can just use a direct comparison.

Arrays of equal shape

# -- take the first 4 points of 'a' and the 4 points of 'b' and compare
chk0 = np.equal(a[:-1], b)
# which is the same as
chk1 = (a[:-1] == b)
# -- both yield (booleans, shown here as 0/1)
array([[0, 1],
       [1, 1],
       [0, 0],
       [0, 1]])
#
# the rows 'where' both coordinates are equal are simply
np.where((a[:-1] == b).all(-1))
# or, equivalently
np.nonzero((a[:-1] == b).all(-1))
# both yield just one match
(array([1]),)
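
Note that the equal-shape comparison is positional: it only reports a match when the same point sits at the same row in both arrays, which is why only one of the two shared points shows up. A minimal sketch of that limitation, using the 'a' and 'b' arrays from above. The names `same_row` and `anywhere` are illustrative, and the set-of-tuples check is only a quick cross-check, not the approach developed in this post.

```python
import numpy as np

a = np.array([[0., 0.], [1., 4.], [4., 3.], [5., 0.], [0., 0.]])
b = np.array([[1., 0.], [1., 4.], [5., 0.], [1., 0.]])

# Row-by-row comparison: row i of a[:-1] is compared with row i of b only.
same_row = np.nonzero((a[:-1] == b).all(-1))[0]

# Cross-check that ignores position: points present in both arrays.
anywhere = sorted(set(map(tuple, a.tolist())) & set(map(tuple, b.tolist())))

print(same_row)   # row indices where a[:-1] and b hold the same point
print(anywhere)   # shared points regardless of where they sit
```

The point [5., 0.] is at row 3 in 'a' but row 2 in 'b', so the positional check cannot see it.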

 

Arrays of unequal shape

# -- since 'a' has 5 points and 'b' has 4, we can't do a direct comparison
#   'b' has to have another dimension added to it so that each of its points
#   can be compared to every point in the 2D array 'a'

compare_ = (a == b[:, None])  # -- b[:, None] has shape (4, 1, 2), result is (4, 5, 2)
all_chk_ = compare_.all(-1)   # -- True where both x and y match
whr = np.nonzero(all_chk_)    # -- (indices into b, indices into a)
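
To see why the extra axis is needed, it can help to look at the shapes involved. The sketch below is only an illustration and assumes NumPy 1.20 or later, where `np.broadcast_shapes` is available. The names `compatible` and `out_shape` are made up for this example.

```python
import numpy as np

a = np.array([[0., 0.], [1., 4.], [4., 3.], [5., 0.], [0., 0.]])
b = np.array([[1., 0.], [1., 4.], [5., 0.], [1., 0.]])

print(a.shape)            # (5, 2)
print(b.shape)            # (4, 2)
print(b[:, None].shape)   # (4, 1, 2)

# (5, 2) and (4, 2) are not broadcast-compatible: 5 != 4 and neither is 1.
try:
    np.broadcast_shapes(a.shape, b.shape)
    compatible = True
except ValueError:
    compatible = False

# With the new axis, (5, 2) against (4, 1, 2) broadcasts to (4, 5, 2):
# one row of comparisons per point in b, one entry per point in a.
out_shape = np.broadcast_shapes(a.shape, b[:, None].shape)
compare_ = (a == b[:, None])
print(compatible, out_shape, compare_.shape)
```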

 

Both the Python and NumPy approaches yield the same results.

# compare_   (booleans, shown here as 0/1)
array([[[0, 1],
        [1, 0],
        [0, 0],
        [0, 1],
        [0, 1]],

       [[0, 0],
        [1, 1],
        [0, 0],
        [0, 0],
        [0, 0]],

       [[0, 1],
        [0, 0],
        [0, 0],
        [1, 1],
        [0, 1]],

       [[0, 1],
        [1, 0],
        [0, 0],
        [0, 1],
        [0, 1]]])
# all_chk_ 
array([[0, 0, 0, 0, 0],
       [0, 1, 0, 0, 0],
       [0, 0, 0, 1, 0],
       [0, 0, 0, 0, 0]])
# whr   -- where the values are located in the inputs
(array([1, 2]), array([1, 3]))
#
# and the values obtained by slicing from the original array
b[whr[0]]
array([[  1.00,   4.00],
       [  5.00,   0.00]])

a[whr[1]]
array([[  1.00,   4.00],
       [  5.00,   0.00]])

 

So the approaches yield the same results, but the NumPy approach simplifies the code notation and, for large arrays, can be substantially faster. Run your own tests.
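
If you want to confirm the equivalence yourself, one option is to wrap both approaches in small functions and compare what they return. This is a sketch only. The function names `common_point_ids` and `common_point_ids_loop` are hypothetical and not part of any library.

```python
import numpy as np

def common_point_ids(a, b):
    """Return (index in b, index in a) pairs where points match exactly."""
    matches = (a == b[:, None]).all(-1)          # shape (len(b), len(a))
    return np.column_stack(np.nonzero(matches))  # one row per matching pair

def common_point_ids_loop(a, b):
    """Same result as above, using nested Python loops."""
    pairs = [(i, j)
             for i, b_pt in enumerate(b)
             for j, a_pt in enumerate(a)
             if (b_pt == a_pt).all()]
    return np.array(pairs).reshape(-1, 2)        # keeps shape (0, 2) if empty

a = np.array([[0., 0.], [1., 4.], [4., 3.], [5., 0.], [0., 0.]])
b = np.array([[1., 0.], [1., 4.], [5., 0.], [1., 0.]])

ids_np = common_point_ids(a, b)
ids_py = common_point_ids_loop(a, b)
print(ids_np)
print(np.array_equal(ids_np, ids_py))
```

Both functions visit the pairs in the same order (points in 'b' first, then points in 'a'), so the outputs can be compared directly.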

Create 1000 random points

a1 = np.random.random(size=(1000, 2)) * 10
b1 = np.random.random(size=(1000, 2)) * 10

Timing both with %%timeit, the NumPy approach for unequal sizes is over 100x faster than the Python approach. This factor varies with array size, but it is useful if speed and storage requirements are at a premium.
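
Outside a notebook, a rough version of that timing test can be sketched with the standard-library `timeit` module. Several things here are assumptions made for the example: the arrays are smaller than the 1000 points above to keep the loop quick, a few shared points are planted so there is something to find, and the helper names `loop_ids` and `numpy_ids` are made up. The ratio you see will depend on your machine and array sizes.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)      # seeded only so the run is repeatable
n = 200                             # assumption: smaller than the 1000 in the text
a1 = rng.random(size=(n, 2)) * 10
b1 = rng.random(size=(n, 2)) * 10
b1[:5] = a1[10:15]                  # plant 5 shared points

def loop_ids(a, b):
    """Nested-loop search for identical points."""
    pairs = [(i, j)
             for i, b_pt in enumerate(b)
             for j, a_pt in enumerate(a)
             if (b_pt == a_pt).all()]
    return np.array(pairs).reshape(-1, 2)

def numpy_ids(a, b):
    """Broadcast comparison for identical points."""
    return np.column_stack(np.nonzero((a == b[:, None]).all(-1)))

t_loop = timeit.timeit(lambda: loop_ids(a1, b1), number=3)
t_np = timeit.timeit(lambda: numpy_ids(a1, b1), number=3)
same = np.array_equal(loop_ids(a1, b1), numpy_ids(a1, b1))

print(f"loop: {t_loop:.4f}s  numpy: {t_np:.4f}s  ratio: {t_loop / t_np:.0f}x")
print("same result:", same)
```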

Add the above to your toolset.

ADDENDUM

When things aren't quite that perfect, there is always a workaround. See:

Really close points - Esri Community
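
The linked post is not reproduced here. As one possible sketch of the idea, and purely as an assumption on my part rather than the method used in that post, the exact `==` test can be swapped for `np.isclose` with an absolute tolerance. The noise value and the tolerance below are arbitrary choices for the example.

```python
import numpy as np

a = np.array([[0., 0.], [1., 4.], [4., 3.], [5., 0.], [0., 0.]])
b = np.array([[1., 0.], [1., 4.], [5., 0.], [1., 0.]])
b_noisy = b + 1e-9                   # hypothetical: tiny coordinate noise

# Exact comparison no longer finds the shared points.
exact = (a == b_noisy[:, None]).all(-1)

# Tolerance-based comparison: match if both coordinates agree within atol.
close = np.isclose(a, b_noisy[:, None], rtol=0.0, atol=1e-6).all(-1)

print(np.nonzero(exact))   # indices of exact matches, if any
print(np.nonzero(close))   # (indices into b, indices into a) within tolerance
```

Choose the tolerance to suit the units and precision of your own coordinates.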

 

1 Comment
VenkataKondepati
Occasional Contributor

Wow. Great thoughts and very interesting research to save time, storage, and even compute. Thank you for sharing. 

About the Author
Retired Geomatics Instructor (also DanPatterson_Retired). Currently working on geometry projects (various) as they relate to GIS and spatial analysis. I use NumPy, python and kin and interface with ArcGIS Pro.