Hello,
We are running a site with three ArcGIS Server machines (10.7.1) sharing the load for our map services, on VMs running Windows Server 2016. We do not use Portal. Each evening we reboot one of the three servers to maintain good performance. However, on a few occasions the ArcGIS Server service did not start again after the reboot. The startup type is correctly set to 'Automatic (Delayed Start)'. The server was marked as "unhealthy" by the web adaptors and all traffic was re-routed to the remaining servers. This led to overloaded servers and additional problems.
Every service running under Windows Server has 'Recovery' options that can be configured to attempt to restart a service that fails to start. Have you experimented with the 'Recovery' options for the ArcGIS Server service? Has it helped? What settings did you use?
Thanks,
Bernie.
10.7.1?
I have read about hiccups like that with the web adaptors, but I don't remember whether Esri patched it as far back as 10.7.1.
We are also experiencing this on an irregular basis with our standalone ArcGIS Servers (11.2), which run the stock options since we use the PowerShell DSC for ArcGIS.
My first thought was a Python microservice to check the Windows service status, but it looks like the GUI options here are comprehensive enough to poke it a few times.
This is hard to test because it happens so infrequently, which is why I wanted a Python thing to log that it had happened and to "know" that our nudge fixed it. We're still in the discussion phase internally, so I'd love to know what you ended up going with.
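For what it's worth, the "Python thing" could be fairly small. Below is a rough sketch, not something we have in production: it asks the built-in `sc query` command for the service state, sends `sc start` if the service is not running, and logs each nudge so there is a record that it happened and whether it worked. The service name "ArcGIS Server", the retry count, and the wait time are assumptions to adjust for your machines, and the parsing assumes the English `sc` output format.

```python
import logging
import re
import subprocess
import time

# Assumption: confirm the real service name in services.msc before using this.
SERVICE_NAME = "ArcGIS Server"

log = logging.getLogger("service_watchdog")


def parse_state(sc_output):
    """Pull the state word (RUNNING, STOPPED, ...) out of `sc query` output."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else "UNKNOWN"


def query_state(name):
    """Ask Windows for the current state of the service."""
    result = subprocess.run(["sc", "query", name], capture_output=True, text=True)
    return parse_state(result.stdout)


def start_service(name):
    """Send a start request; needs an elevated (administrator) context."""
    subprocess.run(["sc", "start", name], capture_output=True, text=True)


def ensure_running(name, query=query_state, start=start_service,
                   attempts=3, wait_seconds=60, sleep=time.sleep):
    """Return True if the service is, or becomes, RUNNING. Logs every nudge."""
    for attempt in range(1, attempts + 1):
        state = query(name)
        if state == "RUNNING":
            log.info("%s is RUNNING (check %d)", name, attempt)
            return True
        if state != "START_PENDING":
            # Only nudge when the service is not already trying to start.
            log.warning("%s is %s; sending start (attempt %d of %d)",
                        name, state, attempt, attempts)
            start(name)
        sleep(wait_seconds)
    final = query(name)
    if final == "RUNNING":
        log.info("%s is RUNNING after %d attempts", name, attempts)
        return True
    log.error("%s is still %s after %d attempts", name, final, attempts)
    return False
```

Run from Task Scheduler a few minutes after the nightly reboot, a call to `ensure_running(SERVICE_NAME)` with `logging.basicConfig(filename=...)` set up first would leave a log line each time the service had to be nudged. The `query`, `start`, and `sleep` parameters exist only so the logic can be tried out without touching a real service.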