Point in Polygon ... Geometry Mysteries

DanPatterson_Retired · ‎02-18-2020

P i p

Point in polygon. There are two basic methods with numerous variants. One or the other in some form forms the foundation of spatial queries.

I have opted to present the winding number approach since it is the one I understood the least. Pictures of polygon boundaries flipping around the point wasn't cutting it for me, nor was the fact that most implementations checked one point at a time to see if it fit into one polygon.

So much for the fantasy of the points automagically finding their polygon home in an instance. I started of with cursors and point geometry and polygon objects and soon realized that there has to be an alternate way. Dr Google was useless. The same references kept coming up. A useful history, but the academic literature provided nothing more than variants on a theme.

I was hoping that you might be able to check a whole bunch of points at once to see if they were in a polygon. Nadda, Zilch. Except for one glimmer of hope found buried on GitHub. I had my inspiration and the wheel would once again be reinvented.

Some sample shapes appear to the right. Two polygons were created in ArcGIS Pro. The upper is their array representation as an svg graphic and the lower two are arcpy polygons similarly represented.

The winding number algorithm in its simplest form is show below.

It is pretty simple. Parse the points into those that are obviously beyond the extent of the bounds of the polygon, then get fussy and check every side of a polygon edge to see if it is enclosed in the bounds of the polygon.

The sided-ness of a point relative to a polygon edge is handled by _is_right_side.

Practically, the y value of a point is inspected to see if it is between two points on a segment (line 18-19). If it is, then the sided-ness check is made and if the point's y is on the right side, then the w counter is incremented. Failing that test, it is decrease. If neither condition applies, nothing happens. In the end any nonzero locations are within the polygon, the rest outside. I should note here, that some implementations provide nonzero initial values, so that the algorithm looks for zero values instead. The nice thing about the winding number vs the crossing number approach is that the former only involves only side checks and addition and multiplication, the latter has a division requiring a denominator check for 0.

Of course there are the perennial questions of "what if it is on the boundary?"... "what about holes?"... "what if I walk counterclockwise?" Interesting questions, but none really of interest to me. I couldn't get past the fact that apparently the inclusion test was supposed to be done one point at a time.

So it is heartwarming to see that all the points are returning the same points in those 2 highly irregular polygons. My preference is for the winding number since no segment intersection is required, hence no division. Also, winding number is much easier to vectorize (see np_wn below) since the last thing I want to do is test each point against each line segment.

Winding number for point in polygon inclusion tests.

def winding_num(pnts, poly):
    """Point in polygon using winding numbers.

    Parameters
    ----------
    p : array
        The (x, y) values of the point in question.
    poly : array
        A clockwise oriented Nx2 array of points, 
        with the first and last points being equal.
    """
    def cal_w(p, poly):
        """Do the calculation"""
        w = 0
        y = p[1]
        ys = poly[:, 1]
        for i in range(poly.shape[0]):
            if ys[i-1] <= y:
                if ys[i] > y:
                    if _is_right_side(p, poly[i-1], poly[i]) > 0:
                        w += 1
            elif ys[i] <= y:
                if _is_right_side(p, poly[i-1], poly[i]) < 0:
                    w -= 1
        return w
    w = [cal_w(p, poly) for p in pnts]
    return pnts[np.nonzero(w)]‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

def _is_right_side(p, strt, end):
    """Determine if a point (p) is `inside` a line segment (strt-->end).
    See : line_crosses, in_out_crosses in npg_helpers.
    position = sign((Bx - Ax) * (Y - Ay) - (By - Ay) * (X - Ax))

    negative for right of clockwise line, positive for left. So in essence,
    the reverse of _is_left_side with the outcomes reversed 
    """
    x, y, x0, y0, x1, y1 = *p, *strt, *end
    return (x1 - x0) * (y - y0) - (y1 - y0) * (x - x0)‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

def crossing_num(pnts, poly, line=True):
    """Crossing Number for point(s) in polygon.

    Parameters
    ----------
    pnts : array of points
        Points are an N-2 array of point objects determined to be within the
        extent of the input polygons.
    poly : polygon array
        Polygon is an Nx2 array of point objects that form the clockwise
        boundary of the polygon.
    line : boolean
        True to include points that fall on a line as being inside.
    """
    def _in_ex_(pnts, ext):
        """Return the points within an extent or on the line of the extent."""
        LB, RT = ext
        comp = np.logical_and(LB <= pnts, pnts <= RT)  # using <= and <=
        idx = np.logical_and(comp[..., 0], comp[..., 1])
        return idx, pnts[idx]

    pnts = np.atleast_2d(pnts)
    xs = poly[:, 0]
    ys = poly[:, 1]
    N = len(poly)
    xy_diff = np.diff(poly, axis=0)
    dx = xy_diff[:, 0]  # np.diff(xs)
    dy = xy_diff[:, 1]  # np.diff(ys)
    ext = np.array([poly.min(axis=0), poly.max(axis=0)])
    idx, inside = _in_ex_(pnts, ext)
    is_in = []
    for pnt in inside:
        cn = 0   # the crossing number counter
        x, y = pnt
        for i in range(N - 1):
            if line is True:
                c0 = (ys[i] < y <= ys[i + 1])  # changed to <= <=
                c1 = (ys[i] > y >= ys[i + 1])  # and >= >=
            else:
                c0 = (ys[i] < y < ys[i + 1])
                c1 = (ys[i] > y > ys[i + 1])
            if (c0 or c1):  # or y in (ys, ys[i+1]):
                vt = (y - ys[i]) / dy[i]  # compute x-coordinate
                if line is True:
                    if (x == xs[i]) or (x < (xs[i] + vt * dx[i])):  # include
                        cn += 1
                else:
                    if x < (xs[i] + vt * dx[i]):  # exclude pnts on line
                        cn += 1
        is_in.append(cn % 2)  # either even or odd (0, 1)
    return inside[np.nonzero(is_in)]
‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

The numpy incarnation of winding number simply compares all the points to all sides rather than one point to one side a time.

def np_wn(pnts, poly, return_winding=False):
    """Return points in polygon using a winding number algorithm in numpy.

    Parameters
    ----------
    pnts : Nx2 array
        Points represented as an x,y array.
    poly : Nx2 array
        Polygon consisting of at least 4 points oriented in a clockwise manner.
    return_winding : boolean
        True, returns the winding number pattern for testing purposes.  Keep as
        False to avoid downstream errors.

    Returns
    -------
    The points within or on the boundary of the geometry.

    References
    ----------
    `<https://github.com/congma/polygon-inclusion/blob/master/
    polygon_inclusion.py>`_.  inspiration for this numpy version
    """
    x0, y0 = poly[:-1].T  # polygon `from` coordinates
    x1, y1 = poly[1:].T   # polygon `to` coordinates
    x, y = pnts.T         # point coordinates
    y_y0 = y[:, None] - y0
    x_x0 = x[:, None] - x0
    diff_ = (x1 - x0) * y_y0 - (y1 - y0) * x_x0  # diff => einsum in original
    chk1 = (y_y0 >= 0.0)
    chk2 = np.less(y[:, None], y1)  # pnts[:, 1][:, None], poly[1:, 1])
    chk3 = np.sign(diff_).astype(np.int)
    pos = (chk1 & chk2 & (chk3 > 0)).sum(axis=1, dtype=int)
    neg = (~chk1 & ~chk2 & (chk3 < 0)).sum(axis=1, dtype=int)
    wn = pos - neg
    out_ = pnts[np.nonzero(wn)]
    if return_winding:
        return out_, wn
    return out_
‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

For those that are interested in time testing... read on. To the rest of you... until next time.

Testing section

Creating arcpy Points and Polygons from x,y coordinates

Making the points and polygons from numpy array data is fairly simple. Arcpy's object model forces you to create point objects, then throw them into an Array object, from which you can form a Polygon object. Sadly, there is no shortcut to this even if you don't ever need the properties and methods associated with the objects used in the construction path from numbers to polygon.

# ---- g_uni is an Nx2 array of x,y coordinates
# ---- e0, e1 are polygons.  
# A closed-loop, clockwise oriented sequence of x,y coordinates.

import arcpy
pnts = [arcpy.Point(i[0], i[1]) for i in g_uni]
poly0 = arcpy.Polygon(arcpy.Array(arcpy.Point(i[0], i[1]) for i in e0))
poly1 = arcpy.Polygon(arcpy.Array(arcpy.Point(i[0], i[1]) for i in e1))‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

Simple tests for inclusion

Arcpy

Not bad... 100 points against 73 and 22 edges respectively.

# ---- e0 test  73 edges
in_0 = [p.within(poly0) for p in pnts]
n0 = len(in_0)
[pnts[i] for i in range(n0) if in_0[i]]

[<Point (9.0, 13.0, #, #)>,
 <Point (11.0, 12.0, #, #)>,
 <Point (12.0, 14.0, #, #)>,
 <Point (13.0, 12.0, #, #)>]

# ---- e1 test  22 edges
in_1 = [p.within(poly1) for p in pnts]
n1 = len(in_1)
[pnts[i] for i in range(n1) if in_1[i]]

[<Point (20.0, 1.0, #, #)>]

# ---- timing
e0 : 2.27 ms ± 12.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
e1 : 1.25 ms ± 40.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

Python

A valiant effort, hardly a sip of coffee. I wonder how long it will take for a much larger dataset.

# ---- pure python timing
winding_num(g_uni, e0)
array([[ 9.00, 13.00],
       [ 11.00, 12.00],
       [ 12.00, 14.00],
       [ 13.00, 12.00]])

winding_num(g_uni, e0)
array([[ 20.00, 1.00],
       [ 21.00, 0.00]])

e0 : 3.55 ms ± 239 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
e1 : 1.55 ms ± 39.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

Numpy

Where my interests lie. Can this whole thing be vectorized? A lot faster it appears.

np_wn(g_uni, e0)
array([[ 9.00,  13.00],
       [ 11.00,  12.00],
       [ 12.00,  14.00],
       [ 13.00,  12.00]])

np_wn(g_uni, e1)
array([[ 20.00,  1.00],
       [ 21.00,  0.00]])

e0 : 116 µs ± 4.03 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
e1 : 62.9 µs ± 2.92 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

Arcpy Multipoint

No slouch, especially when you have to consider the extra baggage a multipoint carries.

mp = arcpy.Multipoint(arcpy.Array(pnts))

int_0 = mp.intersect(poly0, 1)  # return points as the output
[p for p in int_0]
[<Point (9.0, 13.0, #, #)>, <Point (11.0, 12.0, #, #)>,
 <Point (12.0, 14.0, #, #)>, <Point (13.0, 12.0, #, #)>]

mp = arcpy.Multipoint(arcpy.Array(pnts))
int_1 = mp.intersect(poly1, 1)
[p for p in int_1]
[<Point (20.0, 1.0, #, #)>, <Point (21.0, 0.0, #, #)>]


e0 : 1.35 ms ± 35.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
e1 : 862 µs ± 42.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍