Here are some thoughts I would like to share following our recent upgrade from ArcGIS Server 10.2.2 to ArcGIS Server 10.7.1. But first some background:
Our servers live in a magical realm called "IT Services land". This magical realm is controlled by a powerful wizard, and we are not allowed any direct access to our servers except through the ArcGIS Server Manager and ArcGIS Admin interfaces. Whenever we need changes beyond the capabilities of those two interfaces, we have to write a detailed incantation of magic called an RFC (Request For Change). Fortunately the wizard has a team of geniuses to recite our incantations, and most changes are executed accurately. But the process is onerous and it does impede our ability to make changes. Submitting an RFC to the magical realm is like force-feeding rotten apples to a unicorn and expecting rainbows to come out the other end. Now back to reality...
Here is a graph of CPU activity from 2 pm the other day, when we completed the cut-over to the new servers. It shows our old 10.2.2 servers (pink and green) and our new 10.7.1 servers (blue, red, yellow). Note there are two servers in our old system and three servers in our new system. It's interesting to compare the CPU load between the two systems:
Some points to note:
This graph shows the same stats with a close-up of 40 minutes of activity around midnight:
Some points to note:
I know there have been changes in the software stack for ArcGIS Server and this is likely the cause of the change in performance. With our typical server load, ArcGIS Server 10.7.1 is much less efficient than 10.2.2. We know the labelling engine was changed at version 10.5(?). Labelling is a big part of the difference in performance, but we can also see that map service recycling is less efficient with ArcGIS Server 10.7.1 - it takes more time and requires more CPU activity. I am told we should expect better stability with 10.7.1, but I'll have to wait and see, as we are less than one week into production with ArcGIS Server 10.7.1.
We have not yet invested in monitoring tools for ArcGIS Server. We are currently looking at options. Monitoring is probably the largest gap in our server management practices. The geniuses in the magical realm of IT Services Land have server monitoring tools but they cannot monitor the performance of a map service and we do not have access to their monitoring tools. The graphs displayed above were delivered to me by email! But I was impressed by how much information I could infer just from the CPU activity graph. I am certain the time and effort we put into monitoring tools will give us great benefits. If you have any recommendations for monitoring tools please share them.
Just because midnight is the default service restart time doesn't mean it needs to *stay* the service restart time. I ran a script to assign restart times in round-robin fashion at 15-minute intervals between the hours of 0015 and 0445.
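For anyone wanting to try the same idea, here is a minimal sketch of a round-robin assignment like the one described above. It is not the actual script: the function name `staggered_recycle_times`, its parameters, and the service names are all made up for illustration, and it only computes the schedule. Applying each time to a service's recycle settings (through the Admin interface or however you manage service properties) is deliberately left out.

```python
from datetime import datetime, timedelta


def staggered_recycle_times(service_names, start="00:15", end="04:45", step_minutes=15):
    """Map each service name to an HH:MM recycle start time.

    Slots run from `start` to `end` inclusive, `step_minutes` apart.
    Services are assigned to slots in order; when there are more
    services than slots, assignment wraps around to the first slot.
    """
    fmt = "%H:%M"
    first = datetime.strptime(start, fmt)
    last = datetime.strptime(end, fmt)
    if step_minutes <= 0 or last < first:
        raise ValueError("need a positive step and an end time at or after the start")

    # Build the list of available time slots.
    slots = []
    current = first
    while current <= last:
        slots.append(current.strftime(fmt))
        current += timedelta(minutes=step_minutes)

    # Round-robin: service i gets slot i modulo the number of slots.
    return {name: slots[i % len(slots)] for i, name in enumerate(service_names)}


if __name__ == "__main__":
    # Hypothetical service names, for illustration only.
    schedule = staggered_recycle_times(["Parcels", "Roads", "Imagery"])
    for name, time_of_day in schedule.items():
        print(name, time_of_day)
```

With the default window there are 19 slots, so the 20th service shares 00:15 with the first one. If your servers have a scheduled reboot inside that window, you may want to shorten `end` accordingly.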
That's very true Vince. We should stagger the recycle times, but we also reboot the servers at 4 am, so what is the point of recycling the services? It would probably be better to disable the recycling of map services and let the server reboot refresh our map services.
Daily reboots aren't really part of best practice. Some of my clients have monthly "hardware" restarts, and if you want to be hard-core then weekly restarts could be scheduled, but the server recycle is supposed to obviate the need for *any* reboots.