Error in crash count while spatial joining with road segments

Michigan_StateUniversity · ‎04-20-2018

I am trying to join the crash data along segments and within 150 ft radius of intersections with the nearest segments. I am using Spatial Join, with roads layer, selecting "closest to it" option and "sum" as an answer to the question "How do you want the attributes to be summarized?", because I want the total count of crashes (both midblock and intersection) along each segment. Though the spatial join is successful, I am getting a wrong count of total crashes, and I can validate this by checking the crash data. In the joined data, the total count of crashes combining all segments is more than the total count of crashes in the crash data. This makes me think, probably, some crashes are getting counted more than once during the spatial join. Clearly, this method is not working. Please suggest if you have any other methodology for solving this issue, it does not have to be necessarily a spatial join. Thank you for your help.

DanPatterson_Retired · ‎04-20-2018

verify that by using the second radiobutton option that gives the distance to the line.

alternatives... buffer the line by a small/reasonable amount, then summarize within

Michigan_StateUniversity · ‎04-20-2018

I tried with a buffer, but it did not work, unfortunately, because of two reasons. 1) It is double counting the crashes on segments that are too close, and 2) it is not able to add crashes within 150 ft of the end of the segments (i.e intersections).

DanPatterson_Retired · ‎04-20-2018

You may have a variety of 'corner cases' preventing one solution working for all cases.

Michigan_StateUniversity · ‎04-20-2018

Yes, that could be a case. Do you suggest any other tool that can resolve it? I am wondering what is the most common way to do it? Because getting the crash count for road segments and intersections is a fairly common functionality and every transportation agency will need this.

DanPatterson_Retired · ‎04-20-2018

I have worked with crash data in the past (Canadian though) and partitioning the problem seems to work best.

Most of the accidents (a relative term) will be denoted in close proximity or on the road. Do a select by location and get those (ie within 10m or some threshold) and add a field to the table and give them a class. Switch the selection and select from the remainder those that are within another distance band (ie10- 20m.) and class those. Produce a final class for the real outliers and give those a class.

When it comes to intersections, if you have an intersection point file, it might be worthwhile produce buffers around those points and pulling out the accidents that are within X meters of an intersection, particularly if there is approach and travel direction information associated with the crash.

I have never seen one solution work for all cases. In some instances, the reporting methodology is important. I doubt that exact locations are given in all cases at all times, some data I have seen just records to the closest kilometer post etc. but I am sure that you are aware of these.

If you deal with the vast majority of the cases, then you can deal with the remainder in different ways. You may find that you spend more time trying to find the one size fits all solution that you would a complete analysis as the sum of parts.

Good luck

Michigan_StateUniversity · ‎04-20-2018

Thank you. Will see if this works.