Hi there,
I am seeing a strange issue with multi node ArcGIS Server sites in both 12.0 and 12.1 using the Azure cloud native pattern.
The test environment is vanilla Azure (no landing zones, management groups or policies). I used ArcGIS Enterprise Cloud Builder 12.0 for Azure to create a 12.0 deployment, did some testing, then upgraded to 12.1. The environment is:
What I see is this:
Full config store is in Cosmos DB as expected
Both servers have a c:\arcgisserver\ local directory with \directories (arcgissystem etc.), \local and \logs (no additional data drives)
The first server has a .site folder in C:\arcgisserver\directories\arcgissystem with the site DAT. The second server does not have this folder (I do not know if this is expected)
When publishing to the server the service is published correctly to both nodes in the site and the ArcSOC processes come up for dedicated services.
When deleting a service, it isn't removed from one of the nodes
Full service definition is still in the local directory
Same behaviour when deleting another service
When looking in Cosmos DB, the service definition is still present
Note the content element is null but the Cosmos DB entry is still there.
Server logs describe the service being removed
No SEVERE or WARNING log entries to indicate the service deletion failed.
This causes issues when services are republished, because one of the nodes now has a stale service definition. We have seen a service with updated symbology from associated font files return missing symbols: the App Gateway load balances between the server nodes, and the second node, where the service wasn't deleted, returns the old service definition.
Note that at 12.0 we have also seen publishing issues where services are NOT present on both nodes and the c:\arcgisserver\directories\arcgissystem\arcgisinput directory differs between the 2 ArcGIS Server nodes.
In this example, node 2 was missing the entire Utilities folder and SampleWorldCities.
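For anyone wanting to check their own site, here is a rough sketch of how the arcgisinput folders on the two nodes could be compared. It only uses the Python standard library and assumes both directories are reachable from one machine (for example over an admin share). The function names and the two-level depth limit (folder, then service) are my own choices for illustration, not anything from ArcGIS.

```python
from pathlib import Path


def relative_dirs(root, max_depth=2):
    """Collect sub-directory paths (relative to root) down to max_depth levels."""
    root = Path(root)
    found = set()

    def walk(current, depth):
        if depth > max_depth:
            return
        for child in current.iterdir():
            if child.is_dir():
                found.add(child.relative_to(root).as_posix())
                walk(child, depth + 1)

    walk(root, 1)
    return found


def compare_nodes(node1_root, node2_root, max_depth=2):
    """Report directories that exist under one root but not the other."""
    a = relative_dirs(node1_root, max_depth)
    b = relative_dirs(node2_root, max_depth)
    return {
        "only_on_node1": sorted(a - b),
        "only_on_node2": sorted(b - a),
    }
```

Calling compare_nodes with the arcgisinput path of each node (the UNC or mapped-drive paths are whatever applies in your environment) returns the folders present on only one side, which is how a missing Utilities folder or a leftover service folder would show up.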
Has anyone come across this behaviour at 12.0 / 12.1? Currently our workaround is to remove the second node from the site. The second nodes were added to support sites that require high throughput.
Thanks!
EDIT: Also note that performing a synchroniseWithSite operation on the second node does not remove the deleted service definitions / folders.
EDIT: When deleting a service via manager I can see the service only being deleted on one node:
EDIT: I have been doing a bit of further testing and even though the service folders are left behind, a subsequent publish does appear to update them at 12.1 - I am confirming this against 12.0. It seems to be inconsistent as to which node actually has the service deleted.
EDIT: On a 12.0 multi node site, when a service is deleted the service folder in \arcgisinput is usually left on the second node (fully deleted from node 1). If I republish the service, *sometimes* the folder is deleted and replaced with the updated service and sometimes it isn't, leading to mismatched service definitions between node 1 and node 2. If this happens you have to delete the service, then manually delete the leftover folder from the second node before republishing.
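To spot the mismatch after a republish without opening both folders by hand, something like the sketch below could be used. It hashes every file in a service folder on each node and compares the results. This is only an illustration: the function names are made up, and it assumes the files in a correctly synchronised service folder are byte-identical on both nodes, which I have not verified (a difference may not always mean a stale definition).

```python
import hashlib
from pathlib import Path


def folder_digest(folder):
    """Return one SHA-256 digest covering the relative paths and contents of all files."""
    folder = Path(folder)
    digest = hashlib.sha256()
    files = sorted(p for p in folder.rglob("*") if p.is_file())
    for path in files:
        # Include the relative path so renamed or moved files change the digest.
        digest.update(path.relative_to(folder).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
    return digest.hexdigest()


def definitions_match(node1_service_dir, node2_service_dir):
    """True when both service folders contain the same files with the same bytes."""
    return folder_digest(node1_service_dir) == folder_digest(node2_service_dir)
```

If definitions_match returns False for the same service folder on node 1 and node 2 straight after a republish, that is a prompt to check whether the second node is still holding the old definition.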