Data Pipelines - scheduled runs timing out

a month ago
Status: Open
JamesDrumm2
New Contributor III

In the past week, a handful of our data pipelines failed to complete: each sat in a pending status for 2 hours before timing out. We were unknowingly charged 2 hours' worth of credits for each pipeline that failed this way. Nothing had been edited or changed in the pipelines, and whatever caused them to time out in the middle of the night resolved itself by the next scheduled run. This leads to a few ideas for improvement:

1. The timeout limit should be much shorter than 2 hours and/or should be flexible, based on a rolling average of previous runs. Ex: Over the past 3 days my pipeline has taken about 2-3 minutes to run on average. The timeout period should be set to 2x or 3x that time, so roughly 4-9 minutes.

2. Event logging for failed scheduled runs. Real-time error logging is great, but why did my pipelines randomly get stuck in this suspended state and then seemingly fix themselves? Without a log, I have no idea.
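
To make idea 1 concrete, here is a rough Python sketch of how an adaptive timeout could be computed from recent run durations. This is only an illustration of the arithmetic, not how Data Pipelines works today: the function name `adaptive_timeout_minutes`, the 3x multiplier, and the 5-minute floor are all my own assumptions, and the 120-minute ceiling simply mirrors the current 2-hour limit.

```python
from statistics import mean


def adaptive_timeout_minutes(recent_durations, multiplier=3.0,
                             floor=5.0, ceiling=120.0):
    """Suggest a timeout (in minutes) from recent run durations.

    Hypothetical helper: all parameter names and defaults are assumptions
    made for illustration, not settings that exist in the product.
    """
    if not recent_durations:
        # No history yet, so fall back to the existing 2-hour limit.
        return ceiling
    proposed = mean(recent_durations) * multiplier
    # Clamp so a very fast pipeline still gets a sensible minimum,
    # and no pipeline ever exceeds the current maximum.
    return max(floor, min(proposed, ceiling))


# Runs from the past 3 days took 2, 3, and 2.5 minutes.
print(adaptive_timeout_minutes([2.0, 3.0, 2.5]))  # 7.5
```

With a rule like this, a pipeline that normally finishes in 2-3 minutes would be stopped after about 7.5 minutes instead of 2 hours, which would also cap the credits charged for a stuck run.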