Sorting in a dictionary for a newbie

2768
15
06-30-2017 10:55 AM
MattBeyers
Emerging Contributor

I'm pretty much a newbie to Python in ArcGIS. I have a table of origins and destinations and their distance. I want to be able to choose the closest n destinations for each origin and write that to a new table. I had been using cursors with SQL and OrderBy but that was taking way too long. I found Richard Fairhurst's Blog posts about using dictionaries (/blogs/richard_fairhurst/2014/11/08/turbo-charging-data-manipulation-with-python-cursors-and-diction...) but I'm afraid the logic is beyond me right now.

I've put the relevant piece of code below. 'neighborconstraint' is the n closest destinations that the user has specified. 'DistanceSort' is the table that we're writing to. 'Matrix' is the input table. The second table, 'NeighborLimit,' selects the nearest n distances.

Any help?

#Have table of Origin (BID) to Destination (SID) distances (Distance). Want to select a neighbor limit of shortest distances.
DistanceSort = "DistanceSort"
#Create empty output table, where insert cursor will put distances in ascending order, grouped by BID
arcpy.CreateTable_management(outpath, DistanceSort)
arcpy.AddField_management(outpath + DistanceSort, "BID","LONG")
arcpy.AddField_management(outpath + DistanceSort, "SID","LONG")
arcpy.AddField_management(outpath + DistanceSort, "Distance","DOUBLE")
arcpy.AddMessage("Empty distance sort table created")

#Faster to use dictionary that sorts distance in ascending order, then insert???
#Prepare insert cursor to insert BID, SID, and Distance according to to SQL clause. Inserted into the the DistanceSort table 
iCursor = arcpy.da.InsertCursor(outpath + DistanceSort, ["BID","SID","Distance"])           
with arcpy.da.SearchCursor(outpath + Matrix, ["BID","SID","Distance"], sql_clause=(None, 'ORDER BY BID, DISTANCE')) as cursor:
    for row in cursor:
        iCursor.insertRow(row)
del iCursor
arcpy.AddMessage("Insert cursor created to insert sorted distances, SID, and BID")

#Create empty output table, where insert cursor will put distances under the Neighborlimit.
arcpy.AddMessage("Calculating neighbor constraint")
NeighborLimit = "NeighborLimit"                    
arcpy.CreateTable_management(outpath, NeighborLimit)
arcpy.AddField_management(outpath + NeighborLimit, "BID","LONG")
arcpy.AddField_management(outpath + NeighborLimit, "SID","LONG")
arcpy.AddField_management(outpath + NeighborLimit, "Distance","DOUBLE")
arcpy.AddMessage("Empty neighborlimit table created")

BIDList = sorted({row[0] for row in cursor})
arcpy.AddMessage("sorted BID list created")

iCursor = arcpy.da.InsertCursor(outpath + NeighborLimit, ["BID","SID","Distance"])
for BID in BIDList:
    expression = arcpy.AddFieldDelimiters(outpath + DistanceSort, "BID") + ' = ' + str(BID)
    with arcpy.da.SearchCursor(outpath + DistanceSort, ["BID", "SID", "Distance"], where_clause=expression) as cursor:  #or keep fields as  ["BID","SID","Distance"]?
        x = 1
        for row in cursor:
            if x < int(neighborconstraint) + 1:
                iCursor.insertRow(row)
                x = x + 1                        
del iCursor
0 Kudos
15 Replies
BlakeTerhune
MVP Regular Contributor

So you've got what sounds like a Near Table (no matter how it was created) and want to just grab the nearest n destinations for each OID? See if something like this works. Feels a little clunky with the conditionals but it was the easiest solution that came to mind.

num_results = 3  ## set to how many SID matches you want for each OID

# Create output table with no rows
arcpy.TableToTable_conversion(inTable, outLocation, outTable, "1=0")

fields = ["OID","SID","Distance"]
with arcpy.da.SearchCursor(inTable, fields, sql_clause=(None, 'ORDER BY OID, DISTANCE')) as s_cursor:
    with arcpy.da.InsertCursor(outTable, fields) as i_cursor:
        oid_track = None
        oid_count = 1
        for oid, sid, dist in s_cursor:
            if oid == oid_track:
                if oid_count <= num_results:
                    oid_count += 1
                    # write to outTable
                    i_cursor.insertRow((oid, sid, dist))
            else:
                oid_count = 1
                oid_track = oid
                # write to outTable
                i_cursor.insertRow((oid, sid, dist))‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

If that's still too slow, we can look at trying to improve performance with dictionaries.

MattBeyers
Emerging Contributor

Thanks, I will try this particular code. We had gotten it to work with a SearchCursor and an SQL clause before. We have (in our particular dataset though we want it to be generic) 70,000 origins and 1,800 destinations. It was taking a long time so we were trying to do it in a dictionary per rfairhur24's suggestions.

0 Kudos
MattBeyers
Emerging Contributor

That seemed to work and took only four hours with the full dataset. I'd still like to try a dictionary to see if it goes faster, but this will be good enough for now.

0 Kudos
BlakeTerhune
MVP Regular Contributor

If you can work out a way to get an Advanced license, Generate Near Table is really what you need here because you can specify the closest_count parameter to limit the number of results. This will probably improve performance.

Because you're not really doing much "processing" with the data, using dictionaries isn't likely to speed it up too much. However, you could try replacing the insert cursor with something that builds a numpy array and then use NumPyArrayToTable to create the final output table.

0 Kudos
DanPatterson_Retired
MVP Emeritus

Blake.. for a numpy solution generate-near-table-for-points

 and for Pro Point Tools

The numpy algorithm and solution is there-in.  If you are talking planar distances and closeness, I have another blog post in my archive which shows that a pure numpy solution can do about 50,000,000 Orig-Destination calculations in under 1 second, and extracting the closest X is easy from there.  If they are network distances, then that will have to wait.

BlakeTerhune
MVP Regular Contributor

Well, well, look at that. Mr. NumPy himself has already built his own Generate Near Table! Thanks Dan Patterson‌!

0 Kudos