Hello - this has puzzled me for a long time.
We have three ArcGIS Server nodes sharing the load for our map services. In ArcGIS Server Manager > Site > Machines I have stopped one of the nodes.
The health check (:6080/arcgis/rest/info/healthcheck) for the stopped machine returns a 404 error.
However, if I look at the ArcGIS Server logs through the Admin interface (:6080/arcgis/admin/logs/query) I can still see log messages from the stopped node. Why doesn't the web adaptor recognize that this node is unhealthy and direct all of the requests to the other two nodes? Looking at the last 1000 log messages, roughly 1/3 of the messages are generated by the stopped node.
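For anyone who wants to reproduce the "roughly 1/3" figure, here is a minimal sketch of the tally. It assumes the log query results have already been exported and parsed into a list of dictionaries, and that each record names its originating node under a "machine" key; that key name and the sample records are assumptions for illustration, not something confirmed in this thread.

```python
from collections import Counter


def share_by_machine(log_messages, key="machine"):
    """Return {machine: fraction of messages} for a list of log records.

    `key` is the (assumed) field that names the node which wrote the record.
    """
    counts = Counter(rec.get(key, "unknown") for rec in log_messages)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {machine: n / total for machine, n in counts.items()}


# Hypothetical sample records; real ones would come from the log query export.
sample = [
    {"machine": "NODE1", "message": "a"},
    {"machine": "NODE2", "message": "b"},
    {"machine": "NODE3", "message": "c"},
    {"machine": "NODE3", "message": "d"},
]

for machine, share in sorted(share_by_machine(sample).items()):
    print(f"{machine}: {share:.0%}")
```

If the stopped node still shows a share close to one third, it is still receiving about as much traffic as the other two nodes.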
Thanks,
Bernie.
In my experience, web adaptors aren't 'smart' load balancers. They simply round-robin requests to all machines in their site regardless of the health status of each machine. Esri documentation seems to state otherwise though.
My testing and recollection may be outdated: I have not used web adaptors for over a year because I have no requirement for web-tier authentication. I have been using internal and external third-party load balancers instead.
Timo
Timo,
Thanks for your reply. I thought this question would have generated more responses. Which third-party load balancers have you been using? What rules did you put in place to control traffic flow?
Our server admins agree with you that the web adaptors are not very 'smart'. They are supposed to monitor the ArcGIS Server Health status (/arcgis/rest/info/healthcheck) to determine which machines are ready to respond to a request. Stopping a machine only stops the appserver within ArcGIS Server. I did get some good info from Esri Canada Support:
I have experimented with the "Under Maintenance" flag and it appears to stop the web adaptors from sending traffic to an ArcGIS Server machine. I was told the web adaptors do a health check about once every minute but I have not verified that. Here is what the healthcheck will display when the "Under Maintenance" flag is set (left) vs. when a machine is stopped (right) on version 10.7.1:
In my opinion, both situations should cause the web adaptors to stop sending traffic to an ArcGIS Server machine.
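To check what each node reports independently of the web adaptors, a small probe like the one below can be run against every machine. It is only a sketch under stated assumptions: the `?f=json` parameter and the `{"success": true}` response body are my assumptions about the health check output, and the host names are hypothetical. Any non-200 status (such as the 404 from a stopped machine) or any other body is treated as unhealthy.

```python
import json
import urllib.error
import urllib.request


def is_healthy(status_code, body_text):
    """Healthy only on HTTP 200 with a JSON object whose 'success' is true (assumed format)."""
    if status_code != 200:
        return False
    try:
        return json.loads(body_text).get("success") is True
    except (ValueError, AttributeError, TypeError):
        # Not JSON, or JSON that is not an object: treat as unhealthy.
        return False


def probe(host, port=6080, timeout=5):
    """Request the health check endpoint on one node and classify the result."""
    url = f"http://{host}:{port}/arcgis/rest/info/healthcheck?f=json"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", "replace")
            return is_healthy(resp.status, body)
    except urllib.error.HTTPError as err:
        # e.g. the 404 returned by a stopped machine
        return is_healthy(err.code, "")
    except OSError:
        # Connection refused, timeout, DNS failure, etc.
        return False


if __name__ == "__main__":
    # Hypothetical host names; replace with the machines in your site.
    for node in ["gisnode1", "gisnode2", "gisnode3"]:
        print(node, "healthy" if probe(node) else "NOT healthy")
```

A third-party load balancer would typically apply the same kind of rule in its own health monitor: take a node out of rotation whenever the probe does not come back as healthy.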
Our network infrastructure includes several hardware network load balancers (NLBs). For our next major upgrade we will replace the web adaptors with the NLBs.
Bernie.