I've been seeing an error in our GeoEvent logs that states "ACCESS_REFUSED - Login was refused using authentication mechanism PLAIN". The error is related to our server's message bus platform service. It usually starts showing up after our server reboots, although we are sometimes able to connect after a normal startup procedure. We can fix the issue by stopping the message bus and then restarting it, but this can, and usually does, take five or more attempts. We updated our startup routine so that the Gateway waits to start until two minutes after ArcGIS Server, and GeoEvent waits until two minutes after the Gateway has started. The issue still persists after adjusting the startup procedure. This seems tricky to pinpoint, since restarting the message bus enough times will resolve the issue. The logs are also throwing two errors related to an "unexpected connection driver error" and one from RabbitMQ that states "the trust manager trusts every certificate, effectively disabling peer verification". I've attached a snapshot of the logs. We are running 10.6.1.
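For anyone comparing their own logs against the messages quoted above, here is a minimal Python sketch that counts how often each one appears in a block of log text. The function name `count_signatures` is made up for illustration, and the sketch assumes the log has already been read into a string; it does not know where GeoEvent writes its logs on your machine.

```python
from collections import Counter

# The three messages quoted in the post above (the last one shortened to its
# distinctive first clause). Matching is case-insensitive.
SIGNATURES = [
    "ACCESS_REFUSED - Login was refused using authentication mechanism PLAIN",
    "unexpected connection driver error",
    "the trust manager trusts every certificate",
]


def count_signatures(log_text):
    """Return a Counter mapping each known message to the number of log lines containing it."""
    counts = Counter()
    for line in log_text.splitlines():
        lowered = line.lower()
        for signature in SIGNATURES:
            if signature.lower() in lowered:
                counts[signature] += 1
    return counts
```

Running it against the log before and after a message bus restart gives a quick way to see whether the restart actually cleared the errors.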
Hi James Madden,
Please make sure that you have installed the following patch: ArcGIS Server Security 2019 Update 2 Patch.
This patch addresses the following issue:
BUG-000117633 - In 10.6.1 and prior, the message bus platform service may not be initialized correctly in all environments.
That should normally fix your issue. There have been some timing issues, which would also explain why it sometimes works after restarting it a few times.
Yes, I installed the patch and the system appears to be working properly now. I originally noticed that our services would break after the server was rebooted, and our logs would blow up with RabbitMQ connection errors. I first updated our shutdown and startup routines so that ArcGIS Server starts two minutes after the server reboots, and the Gateway and GeoEvent each start two minutes after their respective preceding component. We were also running a script to stop all services prior to rebooting and then turn everything back on once the reboot completed. These steps appeared to resolve the issue, but then I noticed the services broke after my seventh reboot. I did some additional digging and found the patch Esri released. I installed the patch and reverted the startup routines to the default settings. The message bus was still throwing errors at this point, but my email services were working properly. I made yet another update to the startup routine so that everything waits until two minutes after its preceding component has started. This results in about a six-minute startup process, and I don't see any errors in the logs. I'm planning to run some additional tests this morning, but the patch and delayed startup appear to have resolved the issue.
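The delayed startup described above could be sketched roughly like this in Python. This is not the actual script from this thread: the service names in `STARTUP_ORDER` are assumptions and need to be replaced with whatever names are registered on your machine, `staggered_start` is a made-up helper, and `start_windows_service` assumes a Windows host where `sc start` is available.

```python
import subprocess
import time

# Assumed service names, in dependency order; check your own machine.
STARTUP_ORDER = ["ArcGIS Server", "ArcGISGeoEventGateway", "ArcGISGeoEvent"]
DELAY_SECONDS = 120  # two minutes before each component


def start_windows_service(name):
    """Start a Windows service by name (assumes Windows and sufficient privileges)."""
    subprocess.run(["sc", "start", name], check=True)


def staggered_start(services, start_service, delay_seconds=DELAY_SECONDS, sleep=time.sleep):
    """Wait delay_seconds before starting each service, in order.

    Returns the total number of seconds spent waiting.
    """
    waited = 0
    for name in services:
        sleep(delay_seconds)  # give the reboot or the previous component time to settle
        waited += delay_seconds
        start_service(name)
    return waited
```

Calling `staggered_start(STARTUP_ORDER, start_windows_service)` waits two minutes before each of the three components, which adds up to the roughly six-minute startup mentioned above. The `start_service` and `sleep` arguments are passed in so the ordering can be checked without actually starting anything.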