Network Analyst - Centroid of road network

JamiePetersen · ‎07-07-2010

Looking for a nudge in the right direction here.
What I need to do is find the "centroid" of my road network. I actually want to find two different points. One (equidistant), if you started at this point and drove x miles in any direction you would be at the end of the road network. And two (equitemporal), if you started at this point and drove x minutes in any direction you would be at the end of the road network.
Thanks in advance.

MichaelRice · ‎07-08-2010

In order to calculate such a position on a road network (which is represented as a graph), you would ultimately be looking for the position with the smallest graph eccentricity (see http://mathworld.wolfram.com/GraphEccentricity.html for a description of this concept), which defines the radius of the graph. The only out-of-the-box way to do this in Network Analyst would be the following:

Create an OD Cost Matrix analysis layer (set your impedance attribute to either a time- or distance-based cost attribute, depending on whether or not you wish to find the time- or distance-based center, respectively)

Load all of your network junctions as both Origins and Destinations in your analysis layer

Solve the OD instance (this essentially calculates an all-pairs shortest paths problem)

For each origin, take the maximum cost to reach any destination; select the origin with the minimum such value for this maximum destination travel cost; this origin defines the center based on the graph radius

Of course, this methodology will only provide you with the junction which has the smallest eccentricity. If you are looking for a more general position along the graph (e.g., potentially somewhere along an edge), then the process would be more complicated.

Also, this method will not scale very well for very large graphs (with more than approximately 5000 junctions), due primarily to memory limitations for solving such large OD instances. How big is the network you are planning on computing this value for (in terms of the number of junctions and edges)?

JamiePetersen · ‎07-08-2010

My network has about 7,000 junctions and about 5000 edges - My region is mostly rural with a few urban centers and in light of what you've said, I can simplify my urban areas to consist of only major roads and collectors. This should reduce the number of junctions significantly.
I won't have a chance to try the steps you provided (but they do seem logical) for at least a week as I'll be at the User Conference. I'll report back here once I've had a chance to give it a go.
Thanks for the response.

MichaelRice · ‎07-08-2010

I will be at the UC as well, staffing the Spatial Analysis Island on Tuesday through Thursday of next week. Please feel free to stop by and talk with me (or anyone else on the Network Analyst team), if you get a chance. I am interested in better understanding your use case for discovering such graph centers, and maybe figuring out how we can better support this case.

MichaelRice · ‎07-09-2010

Bill,

Our respective formulations of this problem are merely two slightly-different interpretations of the same general property.

The goal (as I understood it) was to determine a single position on a graph, from which you can reach all other positions in the graph in <= x units, where x is the minimum possible value.

When said "positions" are limited to vertices, this is exactly what my formulation provides (based on the well-known concept of graph radius), and it applies to all graphs (even your example triangle graph). In your formulation, it appears that you have merely extended this concept of "position" to apply to both vertices and edges (i.e., a "position" can mean any position along an edge as well).

As we have both pointed out, this latter formulation is a bit trickier. However, as you also suggested, this latter formulation can still be approximated using the original concept of graph radius (which applies only to vertices) by simply introducing new dummy vertices along any "long-distance" edges (and thus, "splitting" the edge into shorter constituent edges).

So, the good news is, at the very least, it appears that we have some general understanding of the various possible formulations of the originally-suggested problem. I will leave it to Jamie (the originator of this thread) to decide which is more suitable for their purposes, and we can go from there.

Jamie, can you perhaps tell us why it is that you wish to determine this "centroid" position in the first place? Depending upon your actual use case, there may be other, simpler ways of achieving your goal.

JamiePetersen · ‎07-09-2010

First, thanks for the replies - a very interesting discussion!

My scenario: a tri-county region is putting together a proposal for a "Farm to Fork" distribution center. Everyone wants it to be in a "fair" location. Maybe centroid wasn't the best word to use to describe what I'm looking for - another way to put it - the junction that best distributes the burden of time and distance between all other junctions. I am ok with limiting this to only junctions and not using any vertex along an edge.

I did have a chance to test michael's method. I've attached an image of the results. I'm pretty happy with how it turned out. I did this for both time and distance and ended up with a slightly different junction for each. Step #4 was the trickiest part for me. I used the sql below to select the top cost for each group of originIDs and from this set of top costs, I selected the originID with the lowest cost.
I will try to stop by the SA display to talk about this.

SELECT     OriginID, DestinationID, Total_feet
FROM         distanceCOSTTABLE
WHERE     (Total_feet =
                          (SELECT     MAX(Total_feet) AS Expr1
                            FROM          distanceCOSTTABLE AS f
                            WHERE      (f.OriginID = distanceCOSTTABLE.originid)))

ORDER BY total_feet

MichaelRice · ‎07-09-2010

Jamie,

This is quite interesting. This could be formulated as a classic location-allocation problem type. That is, given a set of k candidate locations to choose from, decide which j (< k) locations (in your case, j=1) are best suited to "service" a set of demand points, such that each demand point gets allocated to its nearest chosen location.

There are several ways to formulate the objective function for solving such a problem (e.g., minimize the total weighted impedance to the demand locations, maximize the number of demand locations which are within a specified coverage impedance, etc.). So far we have been discussing this problem in the context of finding the single junction (the "service" location) which minimizes the maximum impedance to all other junctions (the "demand" locations). However, we could easily reformulate the problem to choose the junction which minimizes the total (and thus, average) impedance to all other junctions. This helps to eliminate the effects of having one or two outlier demand locations. The point is that there are many available options for defining your objective function for this problem type.

We have developed a new solver in Network Analyst at the 10 release to solve just this problem type: the Location-Allocation solver. Its capabilities are much more advanced than I have even attempted to describe here, with many more options for greater flexibility in defining your objective function for selecting/locating your facilities.

I would imagine if your tri-county region is serious about establishing a proper location for their distribution center, then they have to consider the possibility that they might not be able to simply establish such a facility at any possible junction in the network (e.g., perhaps there is no available real estate in a given area of your network coverage). Therefore, using the Location-Allocation solver, they could effectively specify which candidate locations (even ones located midway along edges) that they are willing (and able) to select from, then specify the number of the distribution centers they are planning to establish (e.g., 1, so far in our discussion), and then specify exactly (and only) those locations which the distribution center would truly be servicing (this may or may not include all junctions of the network; there may be a small number of explicit locations to be serviced by this distribution center).

Please stop by the island at the UC and we can discuss the options in further detail.

MichaelRice · ‎07-09-2010

Well, it's not quite as simple as you seem to suggest...

Bill,

My point was that, if you are willing to accept some approximation to your formulation using my original suggestion (i.e., considering only vertices), then you could eliminate some of the negative impacts of having "long-distance" edges by splitting them up before-hand (and doing this only once, before running the originally-suggested algorithm). This is clearly not an optimal solution, nor was it suggested as such; perhaps it was not what you were originally hinting at, but it is indeed quite simple. This suggestion could be done using our existing out-of-the-box software. Any dynamic algorithm, such as you suggested, while nice, would require the development of a custom solver.