Geometry ... Reconstructing Poly Features # 4

Blog Post created by Dan_Patterson Champion on Apr 17, 2019


As before, the inputs

----The polygons that I will be using are shown to the right.
  1. A square, 5 points, first and last duplicates
  2. A lake on an island in a lake...
  3. A multipart with a two different shaped donut holes
  4. The letter 'C'
  5. The letter 'V'
Each part is labelled at the labelpoint rather than the centroid, hence each part gets labelled.






---- An Alternate Geometry Reconstructor ----

---- Poly* to arrays revisited ----


These two scripts help blow poly features into their constituent parts.  Initially a null point (np.nan, np.nan) is used to separate the poly features.  The location of these insertions is retained and a fr_to index list is maintained so that the resultant 2D points array can be reconstructed or used in other calculations.


The outputs are: a_2d, ids, fr_to, id_fr_to which represent the 2D array, the id values of where one poly feature ends, a from-to list is constructed and a final list with the ids prepended is also included.

def _pp_(poly):
    """See poly_pnts for details.  This is for single poly feature conversion
        null_pnt = (np.nan, np.nan)

    sub = []
#    for i, arr in enumerate(poly):
    for arr in poly:
        pnts = [[pt.X, pt.Y] if pt else null_pnt for pt in arr]
        sub.append(np.asarray(pnts)) #append(pnts)
    return np.asarray(sub)

def poly_pnts(in_fc, as_obj=False):
    """Poly features to points while maintaining the location of the points in
    the input features. null points are replaced with their numpy equivalent.

    in_fc : text
        Full path to the featureclass.
    as_obj : boolean
        True returns a list of arrays of the shapes.  The array types may be
        mixed.  False, returns a 2D array of points and an array of indices.

    False, returns a 2D array of points and the location to split
    those points should you need to reconstruct the poly feature.  A `nan`
    point separates the parts of the features within the poly feature, so split
    first. If `as_obj` is True, a list of arrays is returned. 

    Use `polys = fc_shapes(in_fc)` to obtain the poly features separately.

    SR = getSR(in_fc)   
    polys = []
    with arcpy.da.SearchCursor(in_fc, 'SHAPE@', spatial_reference=SR) as cur:
        polys = [row[0] for row in cur]
    id_too = []
    a_2d = []
    for i, p in enumerate(polys):
        r = _pp_(p)                              # calls to _pp_
        id_too.extend([(i, len(k)) for k in r])
        a_2d.extend([j for i in r for j in i])
    a_2d = np.asarray(a_2d)
    id_too = np.array(id_too)
    ids = id_too[:, 0]
    too = np.cumsum(id_too[:, 1])
    frum = np.concatenate(([0], too))
    fr_to = np.array(list(zip(frum, too)))
    id_fr_to = np.array(list(zip(ids, frum, too)))
    if as_obj:
        a_obj = [a_2d[f:t] for f,t in fr_to]  # alternate constructor
        return a_obj, ids, fr_to, id_fr_to
    return a_2d, ids, fr_to, id_fr_to



A sample run

  • Line 1 reads the featureclass, produces the 2D point array and the various indices
  • In Line 2 and 3, I usually calculate the mean of the point cloud accounting for the null points separating the individual poly features.
  • A separate object array can be constructed if you have a need to work with the individual poly features at once, otherwise, you can recreate this using the a array and the fr_to indices as shown in line 4. 
  • Line 5 reconstructs the original polygons so that they can be moved back into ArcGIS Pro  ... more later

Normally you would 'do stuff' between line 4 and 5 and send back the resultant array, but this is just for illustration


a, ids, fr_to, id_fr_to = poly_pnts(in_fc)
m = np.nanmin(a, axis=0)
a_s =- m
a1 = np.asarray([a[f:t] for f,t in fr_to])
p_arr = [_arr_poly_(i, SR) for i in a1]  # ** to reverse np.concatenate(a1)
frmt = """
Polygon ids:   {}
From-to pairs:
Id_from_to array

print(dedent(frmt).format(ids, fr_to, id_fr_to))

Polygon ids:   [0 1 1 2 2 3 4]
From-to pairs:
[[ 0  5]
[ 5 16]
[16 26]
[26 36]
[36 48]
[48 57]
[57 65]]
Id_from_to array
[[ 0  0  5]
[ 1  5 16]
[ 1 16 26]
[ 2 26 36]
[ 2 36 48]
[ 3 48 57]
[ 4 57 65]]

Line 15 shows that poly features 1 and 2 have 2 parts.

Lines 16-23 are the from-to pairs of points needed to reconstruct the arrays to make polygons.

Line 24- just combines the two previous lists.

Reconstructing the arrays to poly features

The code below does this.  A helper function, then the code block that uses a search cursor to reassemble the array to something that arcpy can use.  Sadly you have to go from a numpy array, then create points, which are placed in an arcpy Array and from there the arcpy.Arrays are assembled to form polygon or polyline features.  And Finally!!! A single-part featureclass is created, then a multipart featureclass, like the original data that went into the whole process. 


Complicated? Not really, the real bottleneck is the cursors.  You need them going in (regardless of the shroud placed around them) and going out. (ie __geo_interface__,  _as_narray,  FeatureClassToNumPyArray.

I should point out here that Numpyarraytofeatureclass works with simple geometry to get poly features back, but who works with simple features all the time.


def _arr_poly_(pnts, SR):
    """Single array to polygon, array can be multipart with or without interior
    rings.  Outer rings are ordered clockwise, inner rings (holes) are ordered
    counterclockwise.  For polylines, there is no concept of order
    Splitting is modelled after _nan_split_(arr)

    subs = []
    s = np.isnan(pnts[:, 0])
    if np.any(s):
        w = np.where(s)[0]
        ss = np.split(pnts, w)
        subs = [ss[0]]
        subs.extend(i[1:] for i in ss[1:])
    aa = []
    for sub in subs:
        aa.append([arcpy.Point(*pairs) for pairs in sub])
    poly = arcpy.Polygon(arcpy.Array(aa), SR)
    return poly

def arr_poly_fc(a, p_type='POLYGON', gdb=None, fname=None, sr=None, ids=None):
    """Reform poly features from the list of arrays created by poly_pnts

    a : array or list of arrays
        Some can be object arrays, normally created by ``pnts_arr``
    p_type : string
        Uppercase geometry type
    gdb : text
        Geodatabase name
    fname : text
        Featureclass name
    sr : spatial reference
        name or object
    ids : list/array
        Identifies which feature each input belongs to.  This enables one to
        account for multipart shapes.
    ``_arr_poly_`` is required

    if ids is None:
        ids = np.arange(len(a)).tolist()
    polys = []
    for i in a:
        p = _arr_poly_(i, sr)  # ---- use _arr_poly
    out = list(zip(polys, ids))
    name = gdb + "\\" + fname
    wkspace = arcpy.env.workspace = 'in_memory'
    arcpy.management.CreateFeatureclass(wkspace, fname, p_type,
    arcpy.management.AddField(fname, 'ID_arr', 'LONG')
    with arcpy.da.InsertCursor(fname, ['SHAPE@','ID_arr']) as cur:
        for row in out:
    out_fname = fname + "_mp"
    arcpy.management.Dissolve(fname, out_fname, "ID_arr",
    arcpy.management.CopyFeatures(out_fname, name)
    del cur

Some Results

So just to compare the geometries, I will compare the areas calculated using arcpy and those calculated using the function below.

polys = fc_shapes(in_fc)

areas = [p.area for p in polys]

areas1 = poly_area(a, ids, fr_to)

areas   #   [100.0, 78.0, 155.0, 52.0, 36.0]

areas1  #   array([100.,  78., 155.,  52.,  36.])

Looks good.  The helper function employs my favorite numpy function einsum, to implement the shoelace formula.  A tad overkill for these teeny polygons, but it works blazingly fast for huge point arrays representing real world polygons.

def poly_area(a, ids, fr_to=None):
    """Calculate of a 2D array sliced into sections using an indices of the
    bounds.  ``a`` is created from ``poly_pnts``

    def _e_area(a):
        """mini e_area with a twist, shoelace formula using einsum"""
        x0, y1 = (a.T)[:, 1:]
        x1, y0 = (a.T)[:, :-1]
        e0 = np.einsum('...i,...i->...i', x0, y0)
        e1 = np.einsum('...i,...i->...i', x1, y1)
        return np.nansum((e0-e1)*0.5)
    # ----
    subs = [_e_area(a[f:t]) for f,t in fr_to]
    totals = np.bincount(ids, weights=subs)
    return totals

I could calculate the area and length properties as I construct the arrays, like cursors do so that the property is readily available.  I find if I wanted those properties I would calculate them, and I don't need them floating around unused.


poly_area is only one of my helper functions for calculating poly properties.  These will be assembled at some stage and posted on GitHub and the Code Sharing site.



So why the whole exercise of converting the polygons to arrays in the first place.

Remember, the output is a 2D array of points.  Shifting, rotating, scaling, thinning, anything-ing is done on the whole array at once, not one point at a time.

Also note, that the Polygon and Polyline classes have properties and methods like intersect, union etc.  Simple functions entailing geometry are not shown/available directly from within those classes.  How do you shift a polygon by a finite amount using arcpy? Your homework.


Coming soon

The pesky attributes