Hi Jonathan,
Great to hear from you.
I guess the reason I do not say Portal is HA is that I lost that argument with the corporate EAs. To them, HA means all components can work actively, not that 'part' of it fails over. Yes, the web adaptors can both be active, and Portal itself can as well. What cannot do HA is the darn PostgreSQL, which makes the Portal software not HA.
The content directory is not a challenge, as one can set it up on a NAS, which inherently has HA built into it: typically three network channels (fibre/cable) with a complex RAID. So no problem there.
Correct, OS patching has different schedules for each of the machines I had set up, including the license server and the FO license server, pairs of the AGS servers, and one of each of the Portal and web servers, all through an NLB, all using VIPs. The corruption occurred when the VM team had patched and rebooted one side and begun patching the other, but the patch was small enough that the VM restart cycle was shorter than the 4 minutes it takes for Portal to replicate over... it was all over at that point. I am aware that newer Portal versions have reduced this time, maybe to under two minutes.
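To make the timing problem concrete, here is a minimal sketch of the kind of guard a patching runbook could apply before touching the second machine. It is only an illustration: the function name, the one-minute safety margin, and the idea of scripting this check are my assumptions, not anything Esri or the VM team provides. The 4-minute figure is the replication time mentioned above.

```python
from datetime import datetime, timedelta

# Assumption: 4 minutes is the replication time observed on my Portal version;
# newer versions may need less, so adjust this value for your environment.
REPLICATION_WINDOW = timedelta(minutes=4)

# Hypothetical safety margin on top of the replication time.
SAFETY_MARGIN = timedelta(minutes=1)


def safe_to_patch_second_node(first_node_back_online: datetime,
                              now: datetime,
                              replication_window: timedelta = REPLICATION_WINDOW,
                              margin: timedelta = SAFETY_MARGIN) -> bool:
    """Return True only if the first node has been back online long enough
    for replication to finish (plus the margin) before the second node is
    taken down for patching."""
    elapsed = now - first_node_back_online
    return elapsed >= replication_window + margin


if __name__ == "__main__":
    back_online = datetime(2020, 1, 1, 12, 0)
    # Two minutes after the first reboot: too early to patch the other side.
    print(safe_to_patch_second_node(back_online, datetime(2020, 1, 1, 12, 2)))
    # Five minutes after the first reboot: replication window plus margin has passed.
    print(safe_to_patch_second_node(back_online, datetime(2020, 1, 1, 12, 5)))
```

In my case the second reboot started well inside that window, which is exactly the situation the check is meant to block.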
(Which brings up another point: Active Directory LDAP cannot be configured against a VIP, only IPs. That is for another conversation.)
I'd love to describe the corruption that happened, but I don't think there is enough room on this forum. There have been a few cases on this (case #02200912), escalated to Redlands without resolution (Esri Case #02377858).
I am confident there is corruption within PostgreSQL, but have never had the time to look at this further.
I can come by and discuss if you are attending the International Developer Summit.