"Stopped" ArcGIS Server node still sending messages to ArcGIS Server logs

03-26-2024 10:20 AM
berniejconnors
Occasional Contributor III

Hello - this has puzzled me for a long time.

We have three ArcGIS Server nodes sharing the load for our map services.  In ArcGIS Server Manager > Site > Machines I have stopped one of the nodes:

stopped_node.png

The health check (:6080/arcgis/rest/info/healthcheck) for the stopped machine returns a 404 error:

health404.png

However, if I look at the ArcGIS Server logs through the Admin interface (:6080/arcgis/admin/logs/query) I can still see log messages from the stopped node.  Why doesn't the web adaptor recognize that this node is unhealthy and direct all of the requests to the other 2 nodes?  Looking at the last 1000 log messages, roughly 1/3 of the messages are generated by the stopped node.
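For anyone who wants to reproduce the "roughly 1/3" figure, here is a minimal Python sketch that tallies messages per machine from a parsed logs/query response. It assumes the JSON response (f=json) has a `logMessages` list whose entries carry a `machine` field; the machine names and messages in `sample_response` are made up for illustration.

```python
from collections import Counter


def count_messages_by_machine(log_response):
    """Count log messages per machine in a parsed logs/query response."""
    messages = log_response.get("logMessages", [])
    # Entries without a "machine" field are grouped under "<unknown>".
    return Counter(msg.get("machine", "<unknown>") for msg in messages)


def share_by_machine(log_response):
    """Return each machine's fraction of the total message count."""
    counts = count_messages_by_machine(log_response)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {machine: n / total for machine, n in counts.items()}


# Hypothetical sample shaped like a logs/query response; not real log data.
sample_response = {
    "logMessages": [
        {"type": "INFO", "machine": "NODE1.EXAMPLE.COM", "message": "request A"},
        {"type": "INFO", "machine": "NODE2.EXAMPLE.COM", "message": "request B"},
        {"type": "INFO", "machine": "NODE3.EXAMPLE.COM", "message": "request C"},
        {"type": "INFO", "machine": "NODE1.EXAMPLE.COM", "message": "request D"},
        {"type": "INFO", "machine": "NODE2.EXAMPLE.COM", "message": "request E"},
        {"type": "INFO", "machine": "NODE3.EXAMPLE.COM", "message": "request F"},
    ]
}

for machine, share in sorted(share_by_machine(sample_response).items()):
    print(f"{machine}: {share:.0%}")
```

If the stopped node keeps showing up with a share close to the other nodes, that would suggest it is still receiving its turn in the rotation.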

Thanks,

Bernie.

2 Replies
TimoT
New Contributor III

In my experience, web adaptors aren't 'smart' load balancers. They simply round-robin requests to all machines in their site, regardless of the health status of each machine. Esri documentation seems to state otherwise, though.

My testing and memories may be outdated: I have not used web adaptors for over a year, since I do not have a requirement for web-tier authentication. I have been using internal and external third-party load balancers instead.

Timo

berniejconnors
Occasional Contributor III

Timo,

        Thanks for your reply.  I thought this question would have generated more responses.  Which third party load balancers have you been using?  What rules did you put in place to control traffic flow?

        Our server admins agree with you that the web adaptors are not very 'smart'.  They are supposed to monitor the ArcGIS Server Health status (/arcgis/rest/info/healthcheck) to determine which machines are ready to respond to a request.  Stopping a machine only stops the appserver within ArcGIS Server.  I did get some good info from Esri Canada Support:

  • When stopping the ArcGIS Server, this operation only starts/stops the appserver on the specific ArcGIS Server machine. Stopping the appserver renders services unavailable via REST, but any publishing event will automatically, and by design, attempt to restart the appserver and all the services on the specific ArcGIS Server machine. In essence, the start/stop machine operation does not provide a reliable means for completely stopping an ArcGIS Server machine. This can be problematic when attempting to stop a machine for routine maintenance or during scenarios where ArcGIS Server should not receive any traffic.

    As part of the 10.7 release, the "Under Maintenance" flag was added to help designate ArcGIS Server machines as being under maintenance and restrict traffic accordingly. This flag is available via the Edit Machine endpoint within the Server Admin API - https://developers.arcgis.com/rest/enterprise-administration/server/editmachine.htm.

 

  • A machine under maintenance will still honor administrative changes and publishing events made to the site through other ArcGIS Server machines. You can implement similar logic in your third-party load balancer or reverse proxy server to avoid forwarding service requests to machines that fail the health check. This allows you to make changes to the machine (such as updating its OS) without causing service requests to fail. When you are done performing maintenance on the machine, change this property back to false.
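The "similar logic" for a third-party load balancer or reverse proxy can be sketched roughly as follows. This is only an illustration of health-aware round-robin selection, not how the web adaptor or any particular load balancer product is implemented; the class name, node names, and the `is_healthy` callback are all hypothetical.

```python
class HealthAwareRoundRobin:
    """Round-robin over machines, skipping any that fail a health probe."""

    def __init__(self, machines, is_healthy):
        self._machines = list(machines)
        self._is_healthy = is_healthy  # callable: machine -> bool
        self._next = 0  # index of the machine whose turn comes next

    def pick(self):
        """Return the next healthy machine, or None if none is healthy."""
        n = len(self._machines)
        for offset in range(n):
            idx = (self._next + offset) % n
            candidate = self._machines[idx]
            if self._is_healthy(candidate):
                # Resume the rotation just after the machine we return.
                self._next = (idx + 1) % n
                return candidate
        return None


# Hypothetical three-node site where node3 is stopped or under maintenance.
unhealthy = {"node3"}
balancer = HealthAwareRoundRobin(
    ["node1", "node2", "node3"],
    is_healthy=lambda machine: machine not in unhealthy,
)
picks = [balancer.pick() for _ in range(4)]
print(picks)  # ['node1', 'node2', 'node1', 'node2']
```

In a real deployment the `is_healthy` callback would be backed by a cached result of the health check rather than a probe on every request.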

 

    I have experimented with the "Under Maintenance" flag and it appears to stop the web adaptors from sending traffic to an ArcGIS Server machine.  I was told the web adaptors do a health check about once every minute but I have not verified that.  Here is what the healthcheck will display when the "Under Maintenance" flag is set (left) vs. when a machine is stopped (right) on version 10.7.1:

under_maintenance.png
stopped_machine.png

    In my opinion, both situations should cause the web adaptors to stop sending traffic to an ArcGIS Server machine.  
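    A probe that treats both situations the same way could look something like the Python sketch below. It counts a machine as healthy only when the health check answers HTTP 200 with a JSON body containing `"success": true`; that body shape is an assumption on my part, and this thread only confirms the 404 for a stopped machine, so check what your version returns in the maintenance case. The function names and the 5-second timeout are hypothetical.

```python
import json
import urllib.error
import urllib.request


def response_is_healthy(status, body_text):
    """Decide health from an HTTP status code and response body text."""
    if status != 200:
        return False
    try:
        body = json.loads(body_text)
    except ValueError:
        # Not JSON (for example an HTML error page): treat as unhealthy.
        return False
    # Assumption: a healthy machine answers {"success": true}.
    return isinstance(body, dict) and body.get("success") is True


def machine_is_healthy(base_url, timeout=5.0):
    """Probe <base_url>/arcgis/rest/info/healthcheck; any failure is unhealthy."""
    url = base_url.rstrip("/") + "/arcgis/rest/info/healthcheck?f=json"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            text = resp.read().decode("utf-8", errors="replace")
            return response_is_healthy(resp.status, text)
    except (urllib.error.URLError, OSError):
        # Covers HTTP errors such as 404, refused connections, and timeouts.
        return False
```

Calling something like `machine_is_healthy("http://gisnode1.example.com:6080")` (a made-up host) on a schedule would give a load balancer or monitoring script one yes/no answer per machine, regardless of why the machine is unavailable.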

    Our network infrastructure includes several hardware network load balancers (NLB).  For our next major upgrade we will replace the web adaptors with the NLB.

 

Bernie.
