Here's how I handle this:
I have an old legacy geocoding script that takes in batch CSVs every night. It runs well 99.9% of the time but very occasionally fails. When verbose logging is enabled, it writes a verbose log for every file processed, tied into the geoprocessor's messaging feature. The geoprocessing has some error handling that continues and attempts a reprocess whenever feasible; if a file still can't complete, an email is sent with the tasks/log messages for that file.
If you're running into append failures, you might want to consider writing the process messages out to a text file. That was really helpful when I initially developed this task years back.
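
For anyone who wants a starting point, here is a rough sketch of that pattern (per-file text log, retry, then an email built from the log). It is not my actual script: `process_with_log`, `build_failure_email`, the `process_func` callable, and the log naming are all made up for illustration, and it only uses the Python standard library so it runs without any geoprocessing installed.

```python
import logging
from email.message import EmailMessage
from pathlib import Path


def process_with_log(csv_path, process_func, log_dir, max_attempts=2, verbose=True):
    """Run process_func(csv_path, logger), retrying on failure.

    Hypothetical helper: process_func stands in for whatever geoprocessing
    you run on one CSV. Returns (succeeded, path_to_log_file).
    """
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(csv_path).stem
    log_path = log_dir / f"{stem}.log"

    # One logger and one text file per input CSV.
    logger = logging.getLogger(f"geocode.{stem}")
    logger.setLevel(logging.DEBUG if verbose else logging.WARNING)
    logger.propagate = False
    handler = logging.FileHandler(log_path, mode="w", encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)

    succeeded = False
    try:
        for attempt in range(1, max_attempts + 1):
            logger.info("Attempt %d of %d for %s", attempt, max_attempts, csv_path)
            try:
                process_func(csv_path, logger)
                logger.info("Completed %s", csv_path)
                succeeded = True
                break
            except Exception:
                # Writes the message plus the full traceback to the log file.
                logger.exception("Attempt %d failed", attempt)
    finally:
        # Release the file so it can be read, attached, or deleted.
        logger.removeHandler(handler)
        handler.close()
    return succeeded, log_path


def build_failure_email(csv_path, log_path, sender, recipient):
    """Build (but do not send) an email whose body is the file's log."""
    msg = EmailMessage()
    msg["Subject"] = f"Geocoding failed: {Path(csv_path).name}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(Path(log_path).read_text(encoding="utf-8"))
    return msg
```

To actually send the message you could pass it to `smtplib.SMTP(host).send_message(msg)`, with the host being whatever mail relay you have available. In an arcpy script, the function you pass in could write `arcpy.GetMessages()` to the logger after each tool call, so the tool messages end up in the same text file.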