
Efficient K-Order network neighbors?

Discussion created by seth.spielman on Jul 27, 2011
Latest reply on Apr 22, 2016 by ehsana
I would like to identify the Kth-order neighbors of an edge on a network, specifically the neighbors of a large set of streets. For example, take a street I'm interested in and call it the focal street. The streets that share an intersection with the focal street are its first-order neighbors. The neighbors of each of those streets are the second-order neighbors, and so on...

Calculating the first-order neighbors using ArcGIS's geoprocessing library (arcpy) took 6+ hours, and the second-order computation has been running for 20+ hours. Needless to say, I want to find a more efficient solution. I have created a Python dictionary keyed on each street, with the connected streets as values. For example:

st2neighs = {street1: [street2, street3, street5], street2: [street1, street4], ...}.  


street 1 is connected to streets 2, 3, and 5; street 2 is connected to streets 1 and 4; etc. There are around 12,000 streets in the study area, most with fewer than 7 connected streets. I have tried to do this in pure Python and have not succeeded (see here).

SelectLayerByLocation_management seems insanely slow. My read of the forums has turned up little; one person suggests recursively buffering each street and then making an in-memory layer, but it is unclear to me how this would add efficiency (it increases the required number of SelectLayerByLocation calls, doesn't it?).

A pickled version of the st2neighs dictionary used in the code below is here. The intended behavior is:

For each street in the city (focal street):
    For each of the focal street's neighbors (focal street neighbor):
        Select the neighbors of the focal street neighbor using SelectLayerByLocation

Any thoughts on why the code below is taking so long to run? Did I make a mistake? By my estimates the code involves ~72,000 calls to SelectLayerByLocation. It has been running for about 20 hours, so that's roughly 1 second per SelectLayerByLocation call (and it isn't done yet). Is this normal? Can the efficiency be improved somehow? Thanks for your help; the code follows...

##This Script Creates and pickles a dictionary of street connectivity

########################
##IMPORT LIBRARIES
########################

import cPickle as pickle
import arcpy

##################
##LOAD DATA
##################
#Set workspace to geodatabase
arcpy.env.workspace = "aGDB.gdb/aLayer"
arcpy.env.overwriteOutput = True

st_file = open("st2neighs.pkl", "rb")
#st2neighs contains first order neighbors
st2neighs = pickle.load(st_file)
st_file.close()

#Make layers
arcpy.MakeFeatureLayer_management("SSOStreetsMerged", "strLyr")

#Second order neighbors
for street in st2neighs:
    #Snapshot the first-order list before looping. Appending to
    #st2neighs[street] while iterating over it also walks the newly added
    #streets, pulling in 3rd-, 4th-, ...-order neighbors and multiplying
    #the number of SelectLayerByLocation calls.
    first_order = list(st2neighs[street])
    for astreet in first_order:
        arcpy.SelectLayerByAttribute_management("strLyr", "NEW_SELECTION", "\"ID\" = " + str(astreet))
        arcpy.SelectLayerByLocation_management("strLyr", "INTERSECT", "strLyr", "", "NEW_SELECTION")
        neighs = arcpy.SearchCursor("strLyr")
        for aneigh in neighs:
            neigh = int(aneigh.getValue("ID"))
            if neigh != street and neigh not in st2neighs[street]:
                st2neighs[street].append(neigh)
        del neighs  #release the cursor's lock on the layer
pfile = open("st2neighs2.pkl", "wb")
pickle.dump(st2neighs, pfile)
pfile.close()
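Since the st2neighs dictionary already encodes the network's connectivity, the Kth-order neighbors can be computed entirely in memory with a breadth-first search, with no geoprocessing calls at all. A minimal sketch (assuming the dictionary maps street IDs to lists of first-order neighbor IDs, as above; the function name and toy network are my own for illustration):

```python
def k_order_neighbors(adj, focal, k):
    """Return {order: set of street IDs} for orders 1..k via BFS over adj."""
    seen = {focal}          #streets already assigned to some order
    frontier = [focal]      #streets at the current order
    result = {}
    for order in range(1, k + 1):
        nxt = set()
        for st in frontier:
            for nb in adj.get(st, []):
                if nb not in seen:  #first time reached => this order
                    seen.add(nb)
                    nxt.add(nb)
        result[order] = nxt
        frontier = nxt
    return result

#Toy network: edges 1-2, 1-3, 2-4, 3-4, 4-5
adj = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4]}
print(k_order_neighbors(adj, 1, 2))  # {1: {2, 3}, 2: {4}}
```

Each street and edge is touched at most once per query, so even 12,000 queries over a network this size should finish in seconds rather than hours; arcpy is only needed once, to build the first-order dictionary.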
