Best hot spot method for analyzing water main breaks?

09-16-2021 11:16 AM
AaronManuel2
New Contributor III

Looking for advice on the best hot spot tool parameters to use for analyzing water main breaks. I export break data from our AMS and dissolve the table on the unique ID for our waterlines, so I get one row per line segment with a column holding the total number of breaks. I then join that table to the waterlines.
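For anyone following along outside of ArcGIS, here is a minimal sketch of that dissolve-and-join step in plain Python. The field names (line_id, material) and the sample records are made up for illustration; they are not from the actual AMS export.

```python
from collections import Counter

# Hypothetical break records exported from the AMS: one row per break.
break_records = [
    {"line_id": "W-1", "break_date": "2019-01-14"},
    {"line_id": "W-1", "break_date": "2020-02-03"},
    {"line_id": "W-2", "break_date": "2018-12-22"},
    {"line_id": "W-3", "break_date": "2017-01-09"},
    {"line_id": "W-3", "break_date": "2019-11-30"},
    {"line_id": "W-3", "break_date": "2021-01-18"},
]

# Hypothetical waterline attribute table: one row per line segment.
waterlines = [
    {"line_id": "W-1", "material": "CI"},
    {"line_id": "W-2", "material": "DI"},
    {"line_id": "W-3", "material": "CI"},
    {"line_id": "W-4", "material": "PVC"},
]

# "Dissolve": collapse the break table to one count per line ID.
break_counts = Counter(record["line_id"] for record in break_records)

# "Join": attach the count to every waterline, using 0 where no breaks were recorded.
lines_with_breaks = [
    {**line, "break_count": break_counts.get(line["line_id"], 0)}
    for line in waterlines
]

for line in lines_with_breaks:
    print(line)
```

Keeping the segments with zero breaks matters for any hot spot analysis afterwards, since the statistic compares each neighborhood to the whole system, not just to the segments that have failed.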

From there I use either the lines, or a point feature created from line centroids for visualization. 

I've used the hot spot tools a little over the years, but to be honest, how they actually work is a mystery to me. Specifically, I'm not sure which spatial relationship method makes the most sense for looking at water main breaks.
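To take some of the mystery out of it, here is a rough sketch of the Getis-Ord Gi* statistic with a fixed distance band, written in plain Python. It is only meant to show what the spatial relationship setting does (it decides which features count as neighbors); it is not a replacement for the ArcGIS tool, and the coordinates, break counts, and band distance are invented for the example.

```python
import math

def gi_star(points, values, band):
    """Return a Gi* z-score per feature using binary fixed-distance-band weights.

    points: list of (x, y) centroids
    values: break count for each centroid
    band:   distance threshold; features within it (including the feature
            itself) get weight 1, everything else gets weight 0
    """
    n = len(values)
    mean = sum(values) / n
    # Global standard deviation term; max() guards against tiny negative rounding.
    s = math.sqrt(max(0.0, sum(v * v for v in values) / n - mean * mean))
    scores = []
    for xi, yi in points:
        weights = [
            1.0 if math.hypot(xi - xj, yi - yj) <= band else 0.0
            for xj, yj in points
        ]
        sum_w = sum(weights)
        sum_w2 = sum(w * w for w in weights)
        local_sum = sum(w * v for w, v in zip(weights, values))
        denom = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        # A zero denominator means no variation or every feature is a neighbor.
        scores.append((local_sum - mean * sum_w) / denom if denom > 0 else 0.0)
    return scores

# Hypothetical centroids: four mains with many breaks, four with few.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10), (11, 10), (10, 11), (11, 11)]
breaks = [8, 9, 7, 10, 0, 1, 0, 1]

z_scores = gi_star(points, breaks, band=2.0)
for point, z in zip(points, z_scores):
    print(point, round(z, 2))
```

In this toy data the first group comes out with positive z-scores (hot) and the second with negative z-scores (cold). A larger band pulls more mains into each neighborhood, which smooths the result but also mixes mains that may be breaking for unrelated reasons.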

The thing about looking at break data is that the breaks on one main may not have any bearing on the breaks on a main that is just one street over. It could be that main A has a lot of breaks because it's old, and main B has a lot of breaks because of its material.

Basically, what I would like to do is identify problem neighborhoods using the total break counts that we have. We do this now based on the line segments, but I'd like results with more aggregation, and I'd like to feel less like I'm just winging it.
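One simple way to get more aggregation is to sum the centroid break counts into square grid cells before mapping or running any statistics. The sketch below assumes projected coordinates and a hypothetical 500-unit cell size; both the sample centroids and the cell size are placeholders, not recommendations.

```python
import math
from collections import defaultdict

def aggregate_to_grid(centroids, cell_size):
    """Sum break counts per square grid cell.

    centroids: iterable of (x, y, break_count) tuples in projected units
    cell_size: edge length of each square cell, in the same units
    Returns a dict keyed by (column, row) cell index.
    """
    totals = defaultdict(int)
    for x, y, break_count in centroids:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        totals[cell] += break_count
    return dict(totals)

# Hypothetical line centroids with their total break counts.
centroids = [
    (120.0, 80.0, 3),
    (450.0, 60.0, 1),
    (490.0, 20.0, 4),
    (510.0, 30.0, 2),
    (1300.0, 900.0, 0),
]

cell_totals = aggregate_to_grid(centroids, cell_size=500.0)
for cell, total in sorted(cell_totals.items()):
    print(cell, total)
```

The cell size is a judgment call: cells that are too small leave most of them empty, and cells that are too large hide the neighborhoods you are trying to find.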

Anyone in the water utilities biz have any advice? Thanks.

3 Replies

I've converted a couple of our organization's existing processes for this into Jupyter notebooks, shared here:

https://github.com/gontek/CMED/blob/main/AssetPrediction.ipynb

https://github.com/gontek/CMED/blob/main/WaterMainAssessment.ipynb

I know of other people in the business doing great research in this area, using AI to dig deeper into what causes breaks and poor service. You are right that there are many factors to consider and that every system is different, and you are very fortunate if you have good break/leak data to work with.

AaronManuel2
New Contributor III

Thanks Kyle, this looks great. 

For your scoring, is this based on an industry standard, or did you develop it yourself? If I'm reading your scripts right, it looks like you sum the individual category scores to get a total point score.

My struggle has been trying to come up with some kind of scoring matrix that doesn't feel arbitrary. 

Regarding the data, our data quality is OK. Not great, but I've seen worse. We have decent data on breaks at least, which is what we try to focus on the most.


I didn't develop the approaches; I only programmed them and tried to document them in the Jupyter notebooks. It's just like you said: it's ages, materials, and any other factors you can identify, then their relationships to breaks. You put things in bins or categories and try to figure out what makes sense. That's where the AI approach would help, by making the scoring less arbitrary and by examining how sensitive your system is to the various factors (it's the scientific method: guess and test). It would also be interesting to compare breaks against weather data.
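As a rough illustration of the bin-and-sum idea, here is a minimal sketch in plain Python. The thresholds, material codes, and point values are invented for the example; they are not taken from the notebooks or from any industry standard.

```python
def bin_score(value, bins):
    """Return the score of the first bin whose upper bound is >= value."""
    for upper_bound, score in bins:
        if value <= upper_bound:
            return score
    return bins[-1][1]

# Hypothetical bins as (upper bound, points). Tune these to your own system.
AGE_BINS = [(25, 0), (50, 1), (75, 2), (float("inf"), 3)]
BREAK_BINS = [(0, 0), (2, 1), (5, 2), (float("inf"), 3)]

# Hypothetical points per material code; unknown materials get a middle score.
MATERIAL_SCORES = {"PVC": 0, "DI": 1, "CI": 3}
UNKNOWN_MATERIAL_SCORE = 1

def total_score(main):
    """Sum the category scores for one main into a single point total."""
    return (
        bin_score(main["age_years"], AGE_BINS)
        + bin_score(main["break_count"], BREAK_BINS)
        + MATERIAL_SCORES.get(main["material"], UNKNOWN_MATERIAL_SCORE)
    )

old_cast_iron = {"age_years": 80, "break_count": 4, "material": "CI"}
new_pvc = {"age_years": 10, "break_count": 0, "material": "PVC"}

print(total_score(old_cast_iron))
print(total_score(new_pvc))
```

Keeping the bins in plain tables like this makes the guess-and-test loop easier: you can change one threshold, rerun the scores, and see how much the ranking of mains moves.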

I have tried looking at emerging hot spot analysis, but the break data isn't dense enough over time to produce statistically useful results.
