Hi Justin, I have the exact same issue and have been trying to resolve it with ESRI for the past 3 months. It began around early January. After starting or stopping any service within Server Manager, I get the same log error as you do. This causes a cascading restart of all my ArcGIS Servers and disrupts the entire enterprise.
Setting the synchronization flag on the machine 'ServerA'. Failed to clear Soap handler cache. Could not connect to the ArcGIS component at URL 'http://ServerA:6080/arcgis/services/esriAdmin/cache/clear'. The ArcGIS component on that machine may not be running or the machine may not be reachable at this time. Error:
My site:
Three 10.6.1 ArcGIS Servers running on Windows Server 2008 R2 VMs
32GB RAM and 4 CPU cores on each
Federated with Portal as the hosting server
Separate Portal server, separate PostgreSQL (managed) server, separate directories/config store/cache file server
Running about 147 server services
IIS 7 outside the firewall
We have tried a variety of troubleshooting workflows:
The "oplocks" on the shares
Symantec Anti-virus scanning exceptions https://support.esri.com/en/technical-article/000012517
Added the SOC account (the local arcgis account) to the Administrators group
Re-applied all user and file permissions and re-credentialed the arcgis user
Verified single cluster mode and reduced cluster to "one" machine
Provided all ArcGIS Server, OS and Tomcat logs/updates to ESRI Development team - no answer
Repaired all 10.6.1 installs
Set system server properties AppServer to "Optimized"
Verified all OS and ArcGIS service packs
Verified SOAP on all machines https://<<machine name>>:6443/arcgis/services?wsdl
Reset the handlers and caches within the Admin tools
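For anyone who wants to repeat the SOAP check across every machine in one pass, here is a minimal sketch in Python. It only requests the same `services?wsdl` URL listed above on each host and reports the HTTP status. The names `ServerB` and `ServerC` are hypothetical placeholders (only `ServerA` appears in the log), and the script assumes self-signed certificates on port 6443, so certificate verification is switched off. It is not an Esri tool, just a quick reachability test.

```python
import ssl
import urllib.error
import urllib.request

# ServerA comes from the log above; ServerB and ServerC are hypothetical
# placeholders. Replace with the real ArcGIS Server machine names.
MACHINES = ["ServerA", "ServerB", "ServerC"]


def wsdl_url(machine, port=6443):
    """Build the SOAP WSDL URL used in the manual check."""
    return f"https://{machine}:{port}/arcgis/services?wsdl"


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or an error string on failure."""
    context = ssl.create_default_context()
    # Assumption: the servers present self-signed certificates, so
    # verification is disabled. Drop the next two lines if they are trusted.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 404 or 500).
        return exc.code
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, refused connection, timeout, and similar.
        return f"unreachable: {exc}"


def check_machines(machines, fetch=fetch_status):
    """Map each machine name to the result of requesting its WSDL URL."""
    return {machine: fetch(wsdl_url(machine)) for machine in machines}
```

Calling `check_machines(MACHINES)` returns a dictionary keyed by machine name. A value of 200 means the SOAP endpoint answered, while an "unreachable" string points at the same kind of connectivity failure the log message describes.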
Any progress or workaround would be appreciated.