I've been trying to figure out a way to de-duplicate a list of point geometries so that any coincident points are removed and I end up with a list of just the unique locations (unique meaning more than 1' from any other point on the list). It looks like I can union individual points to each other, but that would only work if they were exactly coincident, no? My initial idea was to construct a multipoint geometry object from the list of point geometries and union it to itself to remove duplicates, but that doesn't seem to work. I think this problem must be deceptively simple and I'm just having tunnel vision as to what else to try.
To put it more simply, I need to detect coincident vertices in a polyline and return the location of anywhere that two or more points are coincident. I can currently do all of this except return only one location for each coincident area; right now I return every instance of a coincident location (3 coincident points results in 3 returned coincident locations).
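For illustration, here is roughly the shape of what I'm after: a hypothetical helper over a plain list of arcpy.PointGeometry objects (the name and tolerance are placeholders):

import arcpy

def dedupe_points(points: list[arcpy.PointGeometry], tolerance: float = 1.0) -> list[arcpy.PointGeometry]:
    # keep a point only if it is farther than the tolerance from every point already kept
    # (distanceTo measures in the units of the geometry's spatial reference)
    unique = []
    for pt in points:
        if all(pt.distanceTo(kept) > tolerance for kept in unique):
            unique.append(pt)
    return unique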
Hi @Glasnoct
You can use the FindIdentical tool with SHAPE as the input field.
Here is a workflow for deleting coincident points and having unique points as the output.
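As a minimal sketch of that workflow (the paths are placeholders, and the 1-foot tolerance is my assumption from the original post):

import arcpy

in_fc = r"C:\data\points.gdb\points"     # placeholder point feature class
out_table = r"C:\data\points.gdb\ident"  # placeholder output table

# Report records whose geometries match within the XY tolerance
arcpy.management.FindIdentical(in_fc, out_table, ["Shape"],
                               xy_tolerance="1 Feet",
                               output_record_option="ONLY_DUPLICATES")

# Or delete the coincident records outright, keeping one point per location
arcpy.management.DeleteIdentical(in_fc, ["Shape"], xy_tolerance="1 Feet")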
This takes a table or feature class. I'm working with a list of bare point geometries, since I'm breaking apart polylines and comparing the vertices on a feature-by-feature basis.
Does the below meet your needs?
import arcpy
from collections import Counter

fc = r"path\to\fc"

## get a list (set) of unique ids
unique_ids = {row[0] for row in arcpy.da.SearchCursor(fc, "OBJECTID")}

## use the unique ids to iterate over each feature
for unique_id in unique_ids:
    ## get all vertices (x, y)
    all_pts = [row[0] for row in arcpy.da.SearchCursor(fc, "SHAPE@XY", f"OBJECTID = {unique_id}", explode_to_points=True)]
    ## count the number of times an (x, y) is present (> 1 is a duplicate)
    count_dict = Counter(all_pts)
    ## we only want the duplicates
    duplicates = {key: value for key, value in count_dict.items() if value > 1}
    ## if duplicates found, print them to screen
    if duplicates:
        print(unique_id, duplicates)
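One caveat: Counter only flags vertices whose (x, y) tuples match exactly. To catch points within the 1-foot tolerance, one option (my suggestion, not part of the snippet above) is to snap coordinates to a grid inside the loop before counting, with the usual caveat that two points straddling a grid boundary can land in different cells:

    ## snap each coordinate to the tolerance grid so near-coincident vertices hash together
    tolerance = 1.0  # assuming a foot-based spatial reference
    snapped = [(round(x / tolerance), round(y / tolerance)) for x, y in all_pts]
    count_dict = Counter(snapped)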
We're currently migrating to the UN (Utility Network) and need to do something similar: delete coincident vertices (or vertices so close to one another that the UN treats them as coincident) in order to avoid having tons of dirty areas through which we can't do tracing. We're investigating several possible solutions, one of which may include a Python script.
I haven't started working on it, but my initial thinking was an algorithm something like:
Obviously, there would be a lot of details to hammer out, but I think this logic could work. We currently have an analyst who is also testing an FME solution with a transformer called Generalizer (safe.com). Not sure if that's available to you, but it might be something to check out if it is.
I won't be able to get back to this thread until Tuesday, but you gave me a good idea on how I might solve this with the segmentation of the cable. I was being tripped up by segments generated from coincident points: the length would be 0 and thus the shape is invalid or something (I cannot call any methods of the geometry object). I got around this by generating a slightly offset temporary point from the first point using pointFromAngleAndDistance and then constructing a polyline object from those two points. I'll come back to this next week and flesh it out, but the order of operations is roughly:
for each polyline:
    segments = create_segments_from_polyline_func(polyline)
    for s in segments:
        if s.length <= coincident_tolerance:  # currently 1' for my needs
            get centroid of segment for inserting a new feature at that location (creating a note feature for client)
            insertCursor on note FC with shape equal to the centroid
If you wanted to reconstruct the geometry without the duplicate vertices:
    for s in segments:
        if s.length > coincidence_tolerance:
            if s is the last segment in the list, append its first and last points to the list
            otherwise append its first point only
    generate a polyline object from the list of appended segment points, run an update cursor on the entry
Ok, this should work (or at least be 90% there since I haven't run it to test)
import itertools
import arcpy

def return_duplicate_vertice_locations(polylist: list[arcpy.Polyline],
                                       coincident_tolerance: int = 1) -> tuple[list[arcpy.PointGeometry], list[arcpy.Polyline]]:
    """
    Takes a list of Polyline objects and returns a list of PointGeometries at locations where multiple polyline
    vertices were detected, as well as a copy of the input list with the duplicate vertices merged
    :param polylist: list of polyline geometries
    :param coincident_tolerance: distance within which two points are considered coincident. uses the polyline's spatial reference units. default of 1
    :return: list of coincident vertex locations, polylist with duplicate vertices cleaned up
    """
    coincident_locations = []
    new_polylist = []
    for polyline in polylist:
        sr = polyline.spatialReference
        deduped_points = []
        segment_list = []
        pairs = itertools.pairwise(polyline[0])  # generates point pairs e.g. (0, 1), (1, 2), (2, 3) etc.
        for point_pair in pairs:
            segment = arcpy.Polyline(arcpy.Array(point_pair), spatial_reference=sr)
            if segment.length == 0.0:
                # create a tiny dummy line smaller than the tolerance distance just so the geometry doesn't return as None
                first_point = arcpy.PointGeometry(point_pair[0], spatial_reference=sr)
                second_point = first_point.pointFromAngleAndDistance(0, coincident_tolerance - 0.1).firstPoint
                new_pair = [point_pair[0], second_point]
                segment = arcpy.Polyline(arcpy.Array(new_pair), spatial_reference=sr)
            segment_list.append(segment)
        for i, segment in enumerate(segment_list):
            if segment.length <= coincident_tolerance:
                location = arcpy.PointGeometry(segment.centroid, sr)
                # only record one location per coincident cluster
                if not any(location.distanceTo(x) <= coincident_tolerance for x in coincident_locations):
                    coincident_locations.append(location)
            else:
                if i == len(segment_list) - 1:  # if last segment, append beginning and end points
                    deduped_points.extend(segment[0])
                else:  # otherwise append first point only
                    deduped_points.append(segment[0][0])
        new_polylist.append(arcpy.Polyline(arcpy.Array(deduped_points), spatial_reference=sr))
    return coincident_locations, new_polylist
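A quick usage sketch (the feature class path is a hypothetical placeholder, not from the post above):

lines = [row[0] for row in arcpy.da.SearchCursor(r"C:\data\lines.gdb\cables", "SHAPE@")]
locations, cleaned = return_duplicate_vertice_locations(lines, coincident_tolerance=1)
print(f"{len(locations)} coincident vertex locations found")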
Try this.
import arcpy

# Set the input point feature class
input_fc = r"your_points"  # Replace with your point feature class

# Spatial reference, adjust based on your data
spatial_ref = arcpy.Describe(input_fc).spatialReference

# Create lists to store unique points and coincident OBJECTIDs
unique_points = []
coincident_objectids = []

# Create a search cursor to iterate over points and track their OBJECTIDs
with arcpy.da.SearchCursor(input_fc, ["OBJECTID", "SHAPE@XY"]) as search_cursor:
    for row in search_cursor:
        objectid = row[0]
        point_geom = arcpy.PointGeometry(arcpy.Point(row[1][0], row[1][1]), spatial_ref)
        is_unique = True
        # Check if the point is within 1 foot of any existing unique points
        for unique_point in unique_points:
            if point_geom.distanceTo(unique_point[1]) <= 1.0:  # 1 foot tolerance
                is_unique = False
                coincident_objectids.append((unique_point[0], objectid))  # Track the coincident OBJECTIDs
                break
        if is_unique:
            unique_points.append((objectid, point_geom))

# Print out all the duplicate/coincident OBJECTIDs
if coincident_objectids:
    print("OBJECTIDs of points that are duplicates/coincident (within 1 foot):")
    for pair in coincident_objectids:
        print(f"OBJECTID 1: {pair[0]}, OBJECTID 2: {pair[1]}")
else:
    print("No duplicates/coincident points found.")
Assuming your duplicates are all sequential, you could use something like this:
import arcpy

def remove_duplicate_points(polyline: arcpy.Polyline, *, tolerance: float = 0.1) -> arcpy.Polyline:
    previous_point = polyline[0][0]
    points = [previous_point]
    for point in polyline[0]:
        # keep the point if it is outside the tolerance on either axis
        if (abs(point.X - previous_point.X) > tolerance) or (abs(point.Y - previous_point.Y) > tolerance):
            points.append(point)
            previous_point = point
    return arcpy.Polyline(arcpy.Array(points), spatial_reference=polyline.spatialReference)
Here's a more verbose version with some tests:
import arcpy
import random
import timeit
from functools import reduce

def remove_duplicate_points(polyline: arcpy.Polyline, *, tolerance: float = 0.1) -> arcpy.Polyline:
    """Remove duplicate points from a polyline.

    Args:
        polyline: A polyline object (single- or multipart).
        tolerance: The distance between points (in feature units) to consider them duplicates. Default is 0.1.

    Returns:
        A polyline object with duplicate points removed.
    """
    if polyline.isMultipart:
        # Recursively call remove_duplicate_points on each part of the polyline
        # (each part is an arcpy.Array, so wrap it back into a single-part Polyline first),
        # then union the parts to prevent segments being added between parts
        return reduce(
            lambda acc, pl: acc.union(pl),
            [remove_duplicate_points(arcpy.Polyline(part, spatial_reference=polyline.spatialReference),
                                     tolerance=tolerance)
             for part in polyline]
        )
    # First point
    previous_point = polyline[0][0]
    # Unique points list
    points = [previous_point]
    for point in polyline[0]:
        # Only append the point if it is further than the tolerance from the previous point on either axis
        if (abs(point.X - previous_point.X) > tolerance) or (abs(point.Y - previous_point.Y) > tolerance):
            points.append(point)
            previous_point = point
    # Return a new polyline object with the unique points
    return arcpy.Polyline(arcpy.Array(points), spatial_reference=polyline.spatialReference)

def generate_randomized_dupe_polyline() -> arcpy.Polyline:
    """Generate a polyline with duplicate points.

    Returns:
        A polyline object with duplicate points.
    """
    # Generate a random number of points
    r_points = [arcpy.Point(random.randint(0, 10), random.randint(0, 10)) for _ in range(random.randint(3, 100))]
    points = []
    for p in r_points:
        # Add the point to the points list
        points.append(p)
        # Add a duplicate point to the points list
        points.append(p)
    return arcpy.Polyline(arcpy.Array(points))

def main():
    runs = 1000
    duration = timeit.timeit(lambda: remove_duplicate_points(generate_randomized_dupe_polyline()), number=runs)
    avg_duration = duration / runs
    print(f"Average time to remove duplicate points from a polyline: {avg_duration:0.5f} seconds per polyline")

if __name__ == "__main__":
    main()
This one runs in about 1 ms per polyline. The trade-off of the optimization is that it only compares against the last unique point, so if a point is duplicated later in the line it will still be added, since the previous point is outside the culling tolerance.
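If non-sequential duplicates do matter for your data, a variant along these lines (my sketch, single-part lines only) compares each vertex against every point kept so far, at the cost of O(n^2) comparisons. Note it will also drop vertices where the line legitimately revisits a location, so it changes the shape of self-intersecting lines:

def remove_duplicate_points_global(polyline: arcpy.Polyline, *, tolerance: float = 0.1) -> arcpy.Polyline:
    # keep a vertex only if it is outside the tolerance of every vertex kept so far
    points = []
    for point in polyline[0]:
        if all((abs(point.X - kept.X) > tolerance) or (abs(point.Y - kept.Y) > tolerance) for kept in points):
            points.append(point)
    return arcpy.Polyline(arcpy.Array(points), spatial_reference=polyline.spatialReference)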