Efficient K-Order network neighbors?

SethSpielman · ‎07-27-2011

I would like to identify the Kth order neighbors of an edge on a network, specifically the neighbors of a large set of streets. For example, I have a street that I'm interested in looking at, call this the focal street. For each focal street I want to find the streets that share an intersection, these are the first order neighbors. Then for each of those streets that share an intersection with the focal street I would like to find their neighbors (these would be the second order neighbors), and so on...

Calculating the first order neighbors using ArcGIS' geoprocessing library (arcpy) took 6+ hours, second order neighbors has been running for 20+ hours. Needless to say I want to find a more efficient solution. I have created a python dictionary which is keyed on each street and contains the connected streets as values. For example:

st2neighs = {street1: [street2, street3, street5], street2: [street1, street4], ...}.

street 1 is connected to street 2, 3, 5; street 2 is connected to street 1 and 4; etc.. There are around 12000 streets in the study area most have fewer than 7 connected streets. I have tried to do this using python-only and have not succeeded (see here).

SelectLayerByLocation_management seems insanely slow. My read of the forums has turned up little, one person suggests recursively buffering each street and then making an in-memory layer but it is unclear to me how this would add efficiency (it increases the required number SelectLayerByLocation calls, doesn't it?).

A pickled version of the st2neighs used in the code below is here. The intended behavior is:

For each street in the city (focal street):
[INDENT]For each of the focal streets neighbors (focal street neighbor):
[INDENT]Select the neighbors of the focal street neighbor using SelectLayerByLocation
[/INDENT][/INDENT]

Any thoughts on why the code below is taking so long to run? Did I make a mistake? By my estimates the code involves ~ 72000 calls to SelectLayerByLocation. The code has been running about 20 hours so that's about 1 second per SelectLayerByLocation (and it isn't done yet). Is this normal? Can the efficiency be improved somehow? Thanks for your help and code follows...

##This Script Creates and pickles a dictionary of street connectivity

########################
##IMPORT LIBRARIES
########################

import cPickle as pickle
import arcpy

##################
##LOAD DATA
##################
#Set workspace to geodatabase
arcpy.env.workspace = "aGDB.gdb/aLayer"
arcpy.env.overwriteOutput = True

st_file = open("st2neighs.pkl", "rb")
#st2neighs contains first order neighbors
st2neighs = pickle.load(st_file)

#Make layers
arcpy.MakeFeatureLayer_management("SSOStreetsMerged", "strLyr")

#Second order neighbors
for street in st2neighs:
    for astreet in st2neighs[street]:
        arcpy.SelectLayerByAttribute_management("strLyr", "NEW_SELECTION", "\"ID\" = " + str(astreet))
        arcpy.SelectLayerByLocation_management("strLyr", "INTERSECT", "strLyr", "", "NEW_SELECTION")
        neighs = arcpy.SearchCursor("strLyr", "", "", "", "")
        for aneigh in neighs:
            neigh = int(aneigh.getValue("ID"))
            if neigh != street:
                if neigh not in st2neighs[street]:
                    st2neighs[street].append(neigh)
        
pfile = open("st2neighs2.pkl", "wb")
pickle.dump(st2neighs, pfile)
pfile.close()

SethSpielman · ‎07-27-2011

After some messing I solved this with minimal use of arcpy. I used arcpy in the first code block to create the pickled dictionary referenced in the previous post and then the second code block I use to very quickly calculate higher order neighbors. By removing arcpy from the mix computation time goes down from hours to seconds. Designed to work with a TIGER street file.

Hopefully, this will save someone the the days of suffering I just endured!

To create a dictionary of first order street network neighbors:

##This Script Creates and pickles a dictionary of street connectivity
##Designed to work with a TIGER street file

########################
##IMPORT LIBRARIES
########################

import cPickle as pickle
import arcpy

#Set workspace to geodatabase
arcpy.env.workspace = "aGDB.gdb/aFC"
arcpy.env.overwriteOutput = True

##################
##CREATE VARIABLES
##################
st2neighs = {}

#Make layers
arcpy.MakeFeatureLayer_management("SSOStreetsMerged", "strLyr")

#Make cursor of all streets
sts = arcpy.SearchCursor("SSOStreetsMerged", "", "", "", "")

#Create a dictionary containing each street and its neighbors
for ast in sts:
    st = int(ast.getValue("TLID"))
    if st not in st2neighs:
        st2neighs[st] = []
        arcpy.SelectLayerByAttribute_management("strLyr", "NEW_SELECTION", "\"TLID\" = " + str(st))
        arcpy.SelectLayerByLocation_management("strLyr", "INTERSECT", "strLyr", "", "NEW_SELECTION")
        tsts = arcpy.SearchCursor("strLyr", "", "", "", "")
        for tst in tsts:
            if int(tst.getValue("TLID")) != st:
                if int(tst.getValue("TLID")) not in st2neighs[st]:
                    st2neighs[st].append(int(tst.getValue("TLID")))
   

##Pickle seg2st dictionary
pfile = open("st2neighs.pkl", "wb")
pickle.dump(st2neighs, pfile)
pfile.close()

Then to trace the higher order network neighbors without using arcpy (select by location):

##Select K-order neighbors from a set of sampled streets.
##saves in dictionary format such that
##the key is the sampled street and the neighboring streets are the values

##################
##IMPORT LIBRARIES
##################

import random as random
import cPickle as pickle
import copy

#######################
##LOAD PICKLED DATA
#######################

seg_file = open("seg2st.pkl", "rb")
st_file = open("st2neighs.pkl", "rb")
seg2st = pickle.load(seg_file)
st2neighs = pickle.load(st_file)
st_file.close()
seg_file.close()
st2neigh = copy.deepcopy(st2neighs)
##################
##FUNCTIIONS
##################


##Takes in a dict of segments (key) and their streets (values).
##returns the desired number of sampled streets per segment
##returns a dict keyed segment containing tlids.
def selectSample(seg2st, nbirths):
    randSt = {}
    for segK in seg2st.iterkeys():
        ranSamp = [int(random.choice(seg2st[segK])) for i in xrange(nbirths)]
        randSt[segK] = []
        for aSamp in ranSamp:
            z = copy.deepcopy(aSamp)
            randSt[segK].append(z)

    return randSt
    
##Takes in a list of all streets (keys) and their first order neighbors (values)
##Takes in a list of sampled  streets
##returns a dict of all sampled streets and their neighbors.
##Higher order selections should be possible with findMoreNeighbors
##logic is the same but replacing sample (input) with output from
##findFirstNeighbors

def initList(sample):
    compSts = {}
    for samp in sample:
        for rSt in sample[samp]:
            if rSt not in compSts:
                compSts[rSt] = []
    return compSts

def findFirstNeighbors(st2neigh, compSts):
    for x in compSts:
        st = copy.deepcopy(st2neigh)
        for z in st:
            compSts.append(z)
    return compSts

def findMoreNeighbors(st2neigh, compSts):
    for aSt in compSts:
        stx = copy.deepcopy(compSts[aSt])
        for st in stx:
            nStx = copy.deepcopy(st2neigh[st])
            for nSt in nStx:
                if nSt not in compSts[aSt]:
                    compSts[aSt].append(nSt)
    return compSts

#####################
##ANALYSIS
#####################

samp = selectSample(seg2st, 1)
n0 = initList(samp)
n1 = findFirstNeighbors(st2neigh, copy.deepcopy(n0))
n2 = findMoreNeighbors(st2neigh, copy.deepcopy(n1))
n3 = findMoreNeighbors(st2neigh, copy.deepcopy(n2))
n4 = findMoreNeighbors(st2neigh, copy.deepcopy(n3))
n5 = findMoreNeighbors(st2neigh, copy.deepcopy(n4))
n6 = findMoreNeighbors(st2neigh, copy.deepcopy(n5))
n7 = findMoreNeighbors(st2neigh, copy.deepcopy(n6))
n8 = findMoreNeighbors(st2neigh, copy.deepcopy(n7))
n9 = findMoreNeighbors(st2neigh, copy.deepcopy(n8))
n10 = findMoreNeighbors(st2neigh, copy.deepcopy(n9))
            
#####################
##CHECK RESULTS
#####################
def checkResults(neighList):
    cntr = {}
    for c in neighList.iterkeys():
        cntr = 0
        for a in neighList:
            cntr += 1
    return cntr

c1 = checkResults(n1)
c2 = checkResults(n2)
c3 = checkResults(n3)
c4 = checkResults(n4)
c5 = checkResults(n5)
c6 = checkResults(n6)
c7 = checkResults(n7)

EhsanAbshirini1 · ‎04-22-2016

hi

i know it's been a long time ago, but i would appreciate it if you could make a script( defining parameters ans environment) so people like me who don't have a good knowledge of programming can run the script in arcgis.