ArcGIS Server Upgrade - 10.2.2 to 10.7.1: Some thoughts...

04-01-2021 01:24 PM
berniejconnors
Occasional Contributor III

 

Here are some thoughts I would like to share following our recent upgrade from ArcGIS Server 10.2.2 to ArcGIS Server 10.7.1.  But first some background:

Our servers live in a magical realm called "IT Services land".  This magical realm is controlled by a powerful wizard and we are not allowed to have any direct access to our servers except through the ArcGIS Server Manager and ArcGIS Admin interfaces.  Whenever we need changes beyond the capabilities of those two interfaces we have to write a detailed incantation of magic called an RFC - Request For Change.  Fortunately the wizard has a team of geniuses to recite our incantations of magic and most changes are executed accurately.  But the process is onerous and it does impede our ability to make changes.  Submitting an RFC to the magical realm is like force feeding rotten apples to a unicorn and expecting rainbows to come out the other end.  Now back to reality...

Here is a graph of CPU activity around 2 pm the other day, when we completed the cut-over to the new servers.  It shows our old 10.2.2 servers (pink and green) and our new 10.7.1 servers (blue, red, yellow).  Note there are two servers in our old system and three servers in our new system.  It's interesting to compare the CPU load between the two systems:

BernieConnors1_0-1617304381019.jpeg

Some points to note:

  • The list of map services in each system is nearly identical.
  • Pre 2 pm the new servers are idling with CPU activity between 10% and 20% whereas the old servers are running under load between 20% and 30% CPU activity.
  • Post 2 pm the old servers idle at less than 5% CPU.
  • Post 2 pm the 3 new servers require 30% to 50% CPU activity (10% to 20% more per server) to handle the same load previously running on 2 servers - and there are more spikes of high CPU activity.
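
One way to put the before-and-after figures on a common footing is to multiply each per-server CPU range by the number of servers. The sketch below does that arithmetic with the ranges quoted above. It assumes the old and new machines have comparable CPU capacity, which is not stated here, and the function name is made up for illustration.

```python
def aggregate_load(server_count, cpu_low_pct, cpu_high_pct):
    """Return the (low, high) total load expressed in whole-server equivalents.

    Assumption: every server has the same CPU capacity, so percentages
    can simply be summed across machines.
    """
    return (server_count * cpu_low_pct / 100, server_count * cpu_high_pct / 100)


# Ranges read from the bullet points above.
old = aggregate_load(2, 20, 30)  # two 10.2.2 servers under load before cut-over
new = aggregate_load(3, 30, 50)  # three 10.7.1 servers under load after cut-over

print("old total (server equivalents):", old)
print("new total (server equivalents):", new)
print("ratio at low end:  %.2f" % (new[0] / old[0]))
print("ratio at high end: %.2f" % (new[1] / old[1]))
```

Under that assumption the new system is spending roughly 2.25 to 2.5 times as much total CPU on the same load, which is a bigger gap than the per-server percentages suggest at first glance.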

This graph shows the same stats with a close-up of 40 minutes of activity around midnight:

BernieConnors1_1-1617304381034.jpeg

 

Some points to note:

  • Midnight is the default time for the map services to recycle.  Both systems show a spike of CPU activity at midnight:
    • The old servers can handle map service recycling with a peak of 50%.  The new servers peak at 100% when the map services are recycled and require more time to complete the recycling.
  • The three individual peaks in blue, red, and yellow show the activity from our stop / start Python script, which I ran manually at about 12:05 am.  The spacing between the peaks shows that we have a sufficient pause between the restart of one machine and the stopping of the next machine – we always have two machines ready to respond during the stop / start process.
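
For readers who want to try something similar, here is a minimal sketch of the stop / start pattern described above: restart one machine at a time and pause before moving to the next, so the other machines stay available. It is not our actual script. The `stop` and `start` callables, the machine names and the pause length are all placeholders; in practice they might wrap the machine stop and start operations of the ArcGIS Server Administrator API, but that wiring is left out here.

```python
import time


def rolling_restart(machines, stop, start, pause_seconds, sleep=time.sleep):
    """Stop and start each machine in turn, pausing between machines.

    machines      -- iterable of machine names (placeholders)
    stop, start   -- callables taking a machine name; hypothetical hooks
                     that would do the real work
    pause_seconds -- wait after one machine is back before stopping the next
    sleep         -- injectable so the pause can be skipped in a dry run
    """
    machines = list(machines)
    for index, machine in enumerate(machines):
        stop(machine)
        start(machine)
        # No need to wait after the last machine has been restarted.
        if index < len(machines) - 1:
            sleep(pause_seconds)


if __name__ == "__main__":
    # Dry run with print statements instead of real stop/start calls.
    rolling_restart(
        ["server-a", "server-b", "server-c"],
        stop=lambda m: print("stopping", m),
        start=lambda m: print("starting", m),
        pause_seconds=0,
    )
```

The pause is the important part: it has to be long enough for the restarted machine to be serving requests again before the next one goes down.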

I know there have been changes in the software stack for ArcGIS Server and this is likely the cause of the change in performance.  ArcGIS Server 10.7.1 is much less efficient than 10.2.2 with our typical server load.  We know the labelling engine was changed at version 10.5(?).  Labelling is a big part of the difference in performance but we can also see map service recycling is less efficient with ArcGIS Server 10.7.1 - it takes more time and requires more CPU activity.  I am told we should expect better stability with 10.7.1 but I'll have to wait and see as we are less than one week into production with ArcGIS Server 10.7.1.

We have not yet invested in monitoring tools for ArcGIS Server.  We are currently looking at options.  Monitoring is probably the largest gap in our server management practices.  The geniuses in the magical realm of IT Services Land have server monitoring tools but they cannot monitor the performance of a map service and we do not have access to their monitoring tools.  The graphs displayed above were delivered to me by email!  But I was impressed by how much information I could infer just from the CPU activity graph.  I am certain the time and effort we put into monitoring tools will give us great benefits.  If you have any recommendations for monitoring tools please share them.

Thanks,

Bernie.

3 Replies
VinceAngelo
Esri Esteemed Contributor

Just because midnight is the default service restart time doesn't mean it needs to *stay* the service restart time. I ran a script to assign restart times in round-robin fashion, at 15-minute intervals between the hours of 0015 and 0445.
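
A minimal sketch of that kind of round-robin assignment, for illustration only (this is not the script mentioned above). It only computes which time slot each service would get; the service names are invented, and applying the times to each service's recycle start time setting is left out.

```python
from datetime import datetime, timedelta
from itertools import cycle


def recycle_slots(first="00:15", last="04:45", step_minutes=15):
    """Return the list of HH:MM slots from first to last, inclusive."""
    current = datetime.strptime(first, "%H:%M")
    end = datetime.strptime(last, "%H:%M")
    slots = []
    while current <= end:
        slots.append(current.strftime("%H:%M"))
        current += timedelta(minutes=step_minutes)
    return slots


def assign_recycle_times(service_names, slots):
    """Hand out slots in order, wrapping around when services outnumber slots."""
    return dict(zip(service_names, cycle(slots)))


if __name__ == "__main__":
    # Hypothetical service names, purely for demonstration.
    services = ["Basemap", "Parcels", "Roads", "Imagery"]
    for name, slot in assign_recycle_times(services, recycle_slots()).items():
        print(name, slot)
```

With 15-minute steps the 0015-0445 window gives 19 slots, so with more than 19 services some will share a slot, but the load is still spread far more evenly than with everything recycling at midnight.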

- V

berniejconnors
Occasional Contributor III

That's very true, Vince.  We should stagger the recycle times, but we also reboot the servers at 4 am, so what is the point of recycling the services?  It would probably be better to disable the recycling of map services and let the server reboot refresh our map services.

Bernie.

VinceAngelo
Esri Esteemed Contributor

Daily reboots aren't really part of best practice.  Some of my clients have monthly "hardware" restarts, and if you want to be hard-core then weekly restarts could be scheduled, but the server recycle is supposed to obviate the need for *any* reboots.

- V
