I found something interesting happening with the search tolerance in Network Analyst (the default 'Closest, 5000 metres' setting) and would be grateful if anyone could confirm that what I'm doing is correct.
Basically, I have several thousand postcodes (origins) and a list of shops (destinations), and I want to count the number of shops within 800 metres of each postcode using the Origin-Destination (OD) Cost Matrix method.
First, I ran the analysis with the search tolerance set to 'Closest, 50 metres' and saved the output (lines and distances).
Second, I ran the analysis again on the remaining postcodes that did not 'snap' onto the network the first time, setting the search tolerance to 'First Road Link within 100 metres'.
Third, I used the default 'Closest, 5000 metres' search tolerance to capture any remaining postcodes and shops.
Although this takes a bit longer (three separate runs), the whole process was successful, and it is the preferred methodology where I work: use a low search tolerance first to capture most features, then increase the search tolerance step by step until the remainder of the data is captured.
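For context, once the OD lines are exported, tallying the shops within 800 metres of each postcode is just a group-and-count over the output table. This is only a sketch: the IDs and the plain list of tuples below are stand-ins for whatever your exported OD lines actually contain.

```python
from collections import defaultdict

# Hypothetical OD Cost Matrix rows: (origin_id, destination_id, network_metres).
od_lines = [
    ("PC1", "ShopA", 350.0),
    ("PC1", "ShopB", 790.5),
    ("PC1", "ShopC", 1200.0),
    ("PC2", "ShopA", 810.0),
]

def count_within(od_rows, threshold_m=800.0):
    """Count destinations within threshold_m of each origin.

    Origins with no destination in range simply don't appear in the result.
    """
    counts = defaultdict(int)
    for origin, _dest, dist in od_rows:
        if dist <= threshold_m:
            counts[origin] += 1
    return dict(counts)

print(count_within(od_lines))  # {'PC1': 2}
```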
Well, what was interesting was that when I ran this OD Cost Matrix analysis again using only the default 'Closest, 5000 metres' tolerance in a single run, the results were exactly the same as the 3 separate analyses with different search tolerances. It wasn't just close to the previous results - they were precisely the same, down to the millimetre!
To me this is great news and saves a lot of time, but my manager seems to think that the first approach (starting low, then increasing the search tolerance) is better.
Please could anyone with a bit of Network Analyst experience confirm that it is indeed OK to just use the default 5000-metre search tolerance (in most cases) rather than 3 separate runs with all the usual joining of files etc. associated with it? I know this depends on your data, but I just mean generally.
The fact that the results were exactly the same (and this was thousands of measurements) really makes me believe that the best approach is to use the default search tolerance where possible, because it locates the feature onto its closest (or first) part of the network anyway.
My manager needs convincing even though I showed her that the results were the same, so I would be grateful if anyone else could confirm that they use the same method.
I hope this makes sense and that the information is of some use to some Network Analyst users - it could save time.
Thanks again and any comments are welcome.
Regards
Scottaidh
In order to calculate an OD Cost Matrix (or any of the other Network Analyst analyses), you have to add your point locations to the NA layer. For instance, you have to take your post codes and add them as Origins. This procedure examines the original geographic location of each post code and finds the location on the network where it falls. So, if a point falls a few meters off the side of the road, it assumes that the starting point of any journey originating at that point is just the closest point on the road network. It's like the point where the driveway touches the street, even if you would have to drive 100 meters down the driveway to actually reach the front door.
The Search Tolerance simply determines how far from a road a point is allowed to be before it is treated as too far from the network to be included in the analysis. For instance, if you have a point that accidentally got placed way out in the middle of the ocean, a search tolerance of 5000 meters will generally ensure that the point is excluded from the calculation (and you'll see an error on that point saying it couldn't be located on the network).
The reason you would want to use a very small search tolerance is if you have some compelling reason to insist that all points must be very close to the network. If, for your analysis, you don't care, as long as they're not accidentally placed way out in the middle of the ocean, then the default search tolerance, or any reasonable search tolerance you determine, should be fine.
You got the same results for each OD pair when loading them all together with a large search tolerance and when you loaded them in three tiers of search tolerance. This is completely expected, and correct. The travel time between two points on the network should be exactly the same if those two points snapped to the same point on the network each time you loaded them. Since search tolerance is simply a limit after which the point will be ignored, it should always locate at the same point if it locates at all, unless you adjust the location settings in some other way.
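The point that the tolerance is only a pass/fail filter - it never changes *where* a point snaps - can be shown with a toy example outside ArcGIS. Below, a straight road and three points are pure geometry I made up for illustration; the tiered loading (50 m, then 100 m, then 5000 m) produces exactly the same snap locations as a single 5000 m pass.

```python
import math

def nearest_on_segment(p, a, b):
    """Closest point to p on the segment a-b (plain 2D geometry)."""
    ax, ay = a; bx, by = b; px, py = p
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return a
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return (ax + t * dx, ay + t * dy)

def snap(point, roads, tolerance):
    """Return the nearest network location, or None if it's beyond the tolerance."""
    best, best_d = None, float("inf")
    for a, b in roads:
        q = nearest_on_segment(point, a, b)
        d = math.dist(point, q)
        if d < best_d:
            best, best_d = q, d
    return best if best_d <= tolerance else None

roads = [((0, 0), (1000, 0))]                 # one straight road along the x-axis
points = [(100, 30), (400, 90), (700, 2000)]  # 30 m, 90 m and 2000 m off the road

# Tiered loading: snap at 50 m, retry the leftovers at 100 m, then 5000 m.
tiered, remaining = {}, list(points)
for tol in (50, 100, 5000):
    still = []
    for p in remaining:
        loc = snap(p, roads, tol)
        if loc is None:
            still.append(p)
        else:
            tiered[p] = loc
    remaining = still

# Single pass at the default 5000 m.
single = {p: snap(p, roads, 5000) for p in points}

print(tiered == single)  # True: the tolerance only filters, it never moves the snap
```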
Now, if you're trying to calculate shops within 800 meters of each postcode, and your search tolerance is 5000 meters, it is conceivable that you might have a few problems. The 800 meter limit you can set on your OD Cost Matrix is 800 meters of network distance. The setback from the street isn't factored in. So, if you have a post code that is set back 4000 meters from the street, and a shop that is set back 4000 meters from the street, but they snap to points on the road that are only 500 meters from one another, then they will show up as below your 800 meter threshold, even though if you consider the distance you have to walk along the driveways to get to the actual locations set back from the street, it would be 8500 meters.
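The arithmetic behind that scenario is simple enough to write down. These numbers just mirror the example above, and the setbacks are treated as straight-line distances for illustration:

```python
# Door-to-door distance vs the network distance the OD Cost Matrix reports.
origin_setback = 4000    # metres from the post code point to its snap point
dest_setback = 4000      # metres from the shop to its snap point
network_distance = 500   # metres along the road between the two snap points

door_to_door = origin_setback + network_distance + dest_setback
print(door_to_door)              # 8500
print(network_distance <= 800)   # True  - the OD matrix keeps this pair
print(door_to_door <= 800)       # False - but the real trip is far longer
```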
If you wanted to get really fancy, you could pre-calculate the distance to the network and add that in as an extra, individual amount of impedance for each Origin and Destination (you can use the Attr_ field).
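To make the idea concrete, here is a rough sketch of what loading a pre-computed setback into the added impedance accomplishes: the setback of each origin and destination is added on top of the pure network distance. The dictionaries stand in for the Attr_ field values; all the names and numbers are invented for illustration.

```python
# Pre-computed setbacks in metres (distance from each point to its snap point).
origin_setbacks = {"PC1": 35.0, "PC2": 4000.0}
dest_setbacks = {"ShopA": 10.0, "ShopB": 4000.0}

# Pure network distances between snap points, per OD pair.
od_network = {("PC1", "ShopA"): 600.0, ("PC2", "ShopB"): 500.0}

def adjusted(od, o_extra, d_extra):
    """Network distance plus both setbacks, for every OD pair."""
    return {
        (o, d): dist + o_extra[o] + d_extra[d]
        for (o, d), dist in od.items()
    }

totals = adjusted(od_network, origin_setbacks, dest_setbacks)
print(totals[("PC1", "ShopA")])  # 645.0  - still under an 800 m threshold
print(totals[("PC2", "ShopB")])  # 8500.0 - now correctly over it
```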
That said, what are your post codes, anyway? Are they just centroids of some polygon postal code region? If so, then the actual location of the centroid isn't very meaningful. I mean, a postal code doesn't have a driveway, right? So the road setback may not be worth considering to the level of exactness I just described. It's up to you to figure out what works best for your data, but hopefully the above description will help your boss understand the nuances of search tolerance a bit better. There's no need to do the analysis in three tiers if the only thing you would be changing is the search tolerance.