Select rows based on values in other columns

TyceHerrman · ‎01-14-2020

Hi all,

I'm not exactly sure how to ask this question coherently, so I've included some admittedly low-tech dummy data to illustrate what I'm looking to do.

For context, I'm working with a line shapefile and node shapefile representing NYC streets and any point where they intersect, respectively (link to data). The issue I'm trying to overcome is that there is a point in the node shapefile even if there isn't a real intersection in the world due to elevation differences (e.g., an overpass over a highway).

Each line segment has the NODEID the line starts at and ends at as attributes (NODEID_FROM, NODEID_TO). Each line segment also includes the corresponding node elevation at each node (NODELEVEL_FROM, NODELEVEL_TO) as a factor from A-Z with M being ground level. I want to ultimately get rid of any points that don't actually represent an intersection (i.e., the lines cross in 2D, but go over or under one another based on the node level attributes).

To begin, I made buffers around the nodes so that I could then spatially join the intersecting line segments to the buffer with a one to many join. In effect, I tie each nodeID to the street segment approaches. In the newly created feature class, I want to iterate through each NODEID and for every corresponding NODEID_FROM or NODEID_TO, compare the corresponding NODELEVEL_FROM to the NODELEVEL_TO. My assumption is that If there is more than one level factor, it's not a real intersection and represents lines that cross above or below each other. Below is some simplified dummy data of the buffers after the spatial join. In the dummy data, I'm looking for some way to select all the rows where NODEID = 1 because there are multiple node level factors for the corresponding NODEID_FROM and NODEID_TO attributes (both M and Q). It should not select any rows where NODEID = 2 because all the corresponding NODEID_FROM and NODEID_TO values have the same level factor (M).

I've tried thinking this through with using a search cursor or some numpy/pandas solution, but am coming up blank on how to even begin to tackle this. Also open to entirely different ways of solving the issue of throwing out "fake" intersections.

NODEID	NODEID_FROM	NODEID_TO	NODELEVEL_FROM	NODELEVEL_TO
1	1	37	M	M
1	29	1	M	Q
1	85	1	Q	M
1	1	4	Q	M
2	14	2	M	M
2	15	2	Q	M
2	2	33	M	Q
2	2	87	M	M

JoshuaBixby · ‎01-14-2020

Try the following. After it runs, you should have a list (sel_oids) of all OIDs you want to select.

import arcpy
from itertools import groupby

tbl = # path to feature class or table

flds = ["OID@", "NODEID", "NODEID_FROM", "NODEID_TO", "NODELEVEL_FROM", "NODELEVEL_TO"]
sort_fld = "NODEID"
from_flds, to_flds = ["NODEID_FROM", "NODELEVEL_FROM"], ["NODEID_TO", "NODELEVEL_TO"]

sql = "ORDER BY {}".format(sort_fld)
from_idxs = [flds.index(fld) for fld in from_flds]
to_idxs = [flds.index(fld) for fld in to_flds]
sort_idx = flds.index(sort_fld)

sel_oids = list()
with arcpy.da.SearchCursor(tbl, flds, sql_clause=(None, sql)) as cur:
    for key, group in groupby(cur, lambda x: x[flds.index(sort_fld)]):
        lvls = set()
        oids = list()
        for row in group:
            oids.append(row[0])
            lvls.add(
                row[from_idxs[1]]
                if row[sort_idx] == row[from_idxs[0]]
                else row[to_idxs[1]]
            )
        if len(lvls) > 1:
            sel_oids += oids‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

View solution in original post

JoshuaBixby · ‎01-14-2020

Try the following. After it runs, you should have a list (sel_oids) of all OIDs you want to select.

import arcpy
from itertools import groupby

tbl = # path to feature class or table

flds = ["OID@", "NODEID", "NODEID_FROM", "NODEID_TO", "NODELEVEL_FROM", "NODELEVEL_TO"]
sort_fld = "NODEID"
from_flds, to_flds = ["NODEID_FROM", "NODELEVEL_FROM"], ["NODEID_TO", "NODELEVEL_TO"]

sql = "ORDER BY {}".format(sort_fld)
from_idxs = [flds.index(fld) for fld in from_flds]
to_idxs = [flds.index(fld) for fld in to_flds]
sort_idx = flds.index(sort_fld)

sel_oids = list()
with arcpy.da.SearchCursor(tbl, flds, sql_clause=(None, sql)) as cur:
    for key, group in groupby(cur, lambda x: x[flds.index(sort_fld)]):
        lvls = set()
        oids = list()
        for row in group:
            oids.append(row[0])
            lvls.add(
                row[from_idxs[1]]
                if row[sort_idx] == row[from_idxs[0]]
                else row[to_idxs[1]]
            )
        if len(lvls) > 1:
            sel_oids += oids‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍