So on the 24th and 25th the syncs were not completing. They timed out with failures and left behind .crpk files in my AGOL content root. We have 14 distributed collaborations sending content from Portal to AGOL by reference. The syncs run on a schedule every 24 hours, in the window from 12 am to 7 am. They have been in place since 2018.
We use the by-reference method because we overwrite SD files on Portal weekly via the arcpy sharing module in Pro.
I've learned that, because the content is shared by reference and not as copies, once the Portal hosted feature layer data updates, the update shows up on AGOL regardless of sync, since the layer is just pointing back to my hosted REST endpoint.
Like @HenryLindemann says, the primary cause of failure for us is almost always a poor network connection, sometimes ours, but mostly on the AGOL/AWS end.
But I can say that I have never had a collaboration sync fail because of a schema change to an existing collaboration item, or because of adding an additional item to an existing collaboration.
At any rate, it is aggravating when it happens, but I just delete the leftover .crpk files and wait. This time the syncs succeeded by the 26th. I probably wouldn't be so forgiving about this if I were sharing live-edit layers by copy, where a sync failure could lead to data loss.
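If you'd rather script the cleanup than delete the leftover .crpk files by hand, here is a rough sketch using the ArcGIS API for Python. I do the cleanup manually, so treat this as untested against a real org. It assumes the leftovers can be recognized by a title or name ending in ".crpk", which may not hold in your org, so check the dry-run output first. The function names and the `owner` value are made up for illustration; `gis.content.search` and `item.delete()` are the only API calls it relies on.

```python
def find_leftover_crpk(gis, owner, max_items=1000):
    """Return the owner's items whose title or name ends with '.crpk'.

    Assumption: leftover sync packages are identifiable by that suffix.
    """
    items = gis.content.search(query=f"owner:{owner}", max_items=max_items)
    leftovers = []
    for item in items:
        labels = [getattr(item, "title", None), getattr(item, "name", None)]
        if any(label and label.lower().endswith(".crpk") for label in labels):
            leftovers.append(item)
    return leftovers


def delete_leftover_crpk(gis, owner, dry_run=True):
    """List the matching items, and delete them only when dry_run is False."""
    titles = []
    for item in find_leftover_crpk(gis, owner):
        titles.append(item.title)
        if dry_run:
            print(f"Would delete: {item.title}")
        else:
            item.delete()
            print(f"Deleted: {item.title}")
    return titles


# Example usage (hypothetical URL and account):
# from arcgis.gis import GIS
# gis = GIS("https://www.arcgis.com", "my_admin_user")  # prompts for password
# delete_leftover_crpk(gis, owner="my_admin_user")                  # report only
# delete_leftover_crpk(gis, owner="my_admin_user", dry_run=False)   # really delete
```

It defaults to a dry run on purpose, so nothing is removed until you have looked at the list and are happy with it.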