Seeking Advice on Applying Time-to-Event Analysis for Pipe Burst Prediction

Cyrus_Chu

Hi everyone,

I'm currently exploring the use of Time-to-Event (Survival) Analysis to predict pipe burst incidents, and I'd love to hear from anyone with experience in this area.

I have two datasets:

Pipe attributes: completion date, material, ground surface type, pipe age, tube diameter, etc.
Historical burst events: joined to the pipe dataset using an attribute join.

I created a new field called Event_Indicator:

1 for pipes that have experienced a burst (event observed)
0 for pipes that haven't (censored)

For the Age Field:

Pipes without a burst: current date - completion date
Pipes with a burst: burst date - completion date
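
To make the setup above concrete, here is a minimal Python sketch of deriving the age and the event indicator for a single pipe. It is an illustration only, not the actual workflow: the function name `survival_record` and its parameters are hypothetical, and it assumes the common survival-analysis convention where 1 marks an observed burst and 0 marks a pipe that has not burst yet (flip the values if your tool expects otherwise).

```python
from datetime import date

DAYS_PER_YEAR = 365.25  # average year length, used to express age in years


def survival_record(completion, burst=None, today=None):
    """Return (age_years, event) for one pipe.

    event = 1: a burst was observed, age runs from completion to burst date.
    event = 0: no burst yet (right-censored), age runs from completion to today.
    """
    today = today or date.today()
    end = burst if burst is not None else today
    if end < completion:
        raise ValueError("end date precedes completion date")
    age_years = (end - completion).days / DAYS_PER_YEAR
    return age_years, int(burst is not None)


# Hypothetical pipes: one that burst, one still in service.
burst_pipe = survival_record(date(2000, 1, 1), burst=date(2010, 1, 1))
intact_pipe = survival_record(date(2000, 1, 1), today=date(2020, 1, 1))
```

Using full dates on both sides of the subtraction keeps the age in consistent units; mixing a date with a bare year would not.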

Challenges

I’ve added explanatory variables like pipe material and ground surface type. However, some categories have no associated burst events, and this seems to make the model fail or produce unstable predictions. I suspect the Time-to-Event model needs at least a few events in every category to estimate that category's effect. I'm considering grouping sparse categories to improve model stability.
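
As a rough sketch of the grouping idea, the snippet below counts burst events per category and folds any category with too few events into a single catch-all label. The function name `group_sparse_categories`, the threshold `min_events`, the label "Other", and the example materials are all assumptions made for illustration, and it again assumes 1 marks an observed burst.

```python
from collections import Counter


def group_sparse_categories(categories, events, min_events=5, other_label="Other"):
    """Replace categories with fewer than `min_events` observed events by `other_label`."""
    # Count only rows where the event (burst) actually occurred.
    event_counts = Counter(c for c, e in zip(categories, events) if e == 1)
    return [c if event_counts[c] >= min_events else other_label for c in categories]


# Hypothetical data: only PVC reaches the threshold of 2 bursts.
materials = ["PVC", "PVC", "PVC", "Cast iron", "Cast iron", "Clay", "Copper"]
bursts = [1, 1, 0, 1, 0, 0, 0]
grouped = group_sparse_categories(materials, bursts, min_events=2)
```

A purely count-based merge can lump together materials that behave very differently, so merging by engineering similarity (for example, combining related pipe materials) may be a better choice where domain knowledge is available.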

Questions

Has anyone applied Time-to-Event modeling for pipe burst prediction or similar infrastructure failure analysis?
What strategies have you used to improve prediction accuracy, especially when dealing with sparse categorical variables?
What types of reports or diagnostics can I generate to evaluate model accuracy and identify the most influential variables driving the event?
Is it normal that the model only outputs Median Time to Event, Percentiles, and Deviance? Does Time-to-Event modeling typically provide a range rather than a specific predicted time?
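
For context on the last two questions, here is a small sketch of two standard survival-analysis building blocks, written from scratch so it does not depend on any particular tool: a Kaplan-Meier survival curve (from which a median time to event is read off) and Harrell's concordance index (a common way to score how well predicted risks rank the observed failure times). The function names and the sample data are hypothetical, 1 is assumed to mark an observed burst, and this is not a description of what any specific tool computes internally.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time."""
    deaths = Counter(t for t, e in zip(durations, events) if e == 1)
    survival = 1.0
    curve = []
    for t in sorted(deaths):
        # Pipes still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        survival *= 1.0 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve


def time_at_survival(curve, level=0.5):
    """First time the survival curve drops to `level` or below (0.5 gives the median)."""
    for t, s in curve:
        if s <= level:
            return t
    return None  # curve never falls that low, e.g. too few observed events


def concordance_index(durations, events, risk_scores):
    """Share of comparable pairs where the earlier failure has the higher risk score."""
    concordant, comparable = 0.0, 0
    for i in range(len(durations)):
        if events[i] != 1:
            continue  # a pair is comparable only if the shorter time is an observed event
        for j in range(len(durations)):
            if durations[i] < durations[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")


# Hypothetical ages in years and event flags (1 = burst, 0 = censored).
ages = [5, 8, 12, 12, 20, 25, 30, 40]
flags = [1, 1, 0, 1, 1, 0, 1, 0]
curve = kaplan_meier(ages, flags)
median_age = time_at_survival(curve, 0.5)
```

Because censored pipes only tell us the burst happens after their current age, the natural output is a survival distribution, which is why summaries such as a median and percentiles are typical rather than a single exact failure time. A concordance index near 0.5 means the risk ranking is no better than chance, and values closer to 1 mean better ranking.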

This is my first time working with this tool and I’d really appreciate any insights, tips, or shared experiences to help me get more familiar with it.

Thanks in advance!