Hi ESRI community,
Our ArcGIS Enterprise 11.3 upgrade failed because the hosting server encountered a Tomcat server.xml issue. To understand why it happened, I have some specific questions (see the Questions section below). It would be much appreciated if someone could shed some light.
Background
We were upgrading our existing production ArcGIS Enterprise environment from 10.9.1 to 11.3.
Issue
After the upgrade steps, a browser session was launched to connect to ArcGIS Server Manager and complete the upgrade, but the page could not be loaded. I tried https://localhost:6443/arcgis/manager/ and https://servername.xyzxy.local:6443/arcgis/manager, and both showed the same error in the browser. It looks like ArcGIS Server's Tomcat server is down.
Some facts to help diagnose
1. Windows Services shows the ArcGIS Server Windows service running under Local System.
2. In Task Manager, ArcGIS.exe is running, but no ArcSOC.exe processes are running.
3. Tomcat's server.xml seems to have been corrupted by the upgrade. In the last part of the file, the <Connector> section after the upgrade differs from the one before it.
4. The certificates folder “<ArcGIS Server Root>\Server\framework\etc\certificates” should contain three files, but it had six after the upgrade: the three expected files plus one lock file for each of them.
5. An ESRI technical document, https://support.esri.com/en-us/knowledge-base/how-to-fix-an-arcgis-keystore-or-server-xml-corruption..., provides a ‘workaround’ for the problem. It worked, but I had to copy the DEV environment keystore and certificates to the production environment, which concerns me for future upgrades.
6. MS Defender was working hard during the installation. For about 15 minutes, the installer showed ‘not responding’. Our IT team checked the Defender report and found nothing abnormal recorded for the failed server.
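For anyone wanting to check facts 3 and 4 quickly, here is a minimal Python sketch. It only tests whether server.xml still parses as XML and contains at least one <Connector> element, and lists files in the certificates folder whose names contain "lock". The function names are made up for illustration, and the "lock" naming pattern is an assumption based on what we saw, not something ESRI documents.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def check_server_xml(path):
    """Return (ok, detail): does the file parse as XML and contain <Connector> elements?"""
    try:
        root = ET.parse(path).getroot()
    except (ET.ParseError, OSError) as exc:
        # A truncated or half-written file ends up here.
        return False, f"cannot parse: {exc}"
    connectors = list(root.iter("Connector"))
    if not connectors:
        return False, "no <Connector> elements found"
    return True, f"{len(connectors)} <Connector> element(s) found"


def find_lock_files(cert_dir):
    """Return names of files in cert_dir that look like leftover lock files.

    Assumption: lock files have "lock" somewhere in their file name.
    """
    folder = Path(cert_dir)
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.iterdir()
                  if p.is_file() and "lock" in p.name.lower())
```

A parse failure here only says the file is not well-formed XML. A file that parses can still hold wrong connector settings, so comparing it against a pre-upgrade backup is still worthwhile.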
Questions
1. During the installation of ArcGIS Server, a domain service account was nominated to run it, but after installation the Windows service shows it running under Local System. Is this expected? The 10.9.1 ArcGIS Server Windows service ran under the specified domain account.
2. Are there any known conflicts between MS Defender (our antivirus software) and the ArcGIS Server installation process?
3. If we rule out low disk space on both the server and the configuration store server, what may have caused server.xml to be corrupted during the upgrade?
4. To avoid the corrupted server.xml issue, is there anything extra I should check before the upgrade?
5. If it happens again, what is the recommended course of action? If we use this workaround, what are the implications of copying the DEV keystore and certificates to PROD, given that the certificates are issued to different FQDNs?
I would appreciate any information on any of the above questions.
Thanks in advance.
Hua
Thanks @DavidColey. I did some research and contacted local ESRI support. I can confirm that your comments are well aligned with what I have learned so far. We just upgraded successfully on the 2nd attempt.
To answer my questions above:
Question: During the installation of ArcGIS Server, a domain service account was nominated to run it, but after installation the Windows service shows it running under Local System. Is this expected? The 10.9.1 ArcGIS Server Windows service ran under the specified domain account.
A: As David mentioned, the ArcGIS Server Windows service should run under the service account nominated during the installation. A silent fallback to Local System can indicate a permission issue. Make sure to run the installation file as an administrator, and make sure the nominated service account has full control over the target ArcGIS-related folders, e.g., C:\arcgisserver and C:\Program Files\ArcGIS (these folders can differ in a non-default configuration).
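To confirm which account the service logs on as without opening the Services window, something like the sketch below may help. It shells out to the built-in Windows `sc qc` command and reads the `SERVICE_START_NAME` line. The service name "ArcGIS Server" is an assumption (check the name on your machine), and the helper names are made up for illustration.

```python
import subprocess


def parse_start_name(sc_output):
    """Extract the SERVICE_START_NAME value from `sc qc` output, or None if absent."""
    for line in sc_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "SERVICE_START_NAME":
            return value.strip()
    return None


def service_account(service_name="ArcGIS Server"):
    """Ask Windows which account a service logs on as (Windows only).

    Assumption: the service is registered under the name "ArcGIS Server".
    """
    result = subprocess.run(["sc", "qc", service_name],
                            capture_output=True, text=True, check=True)
    return parse_start_name(result.stdout)


def runs_as_local_system(account):
    """True if the reported account is Local System rather than a named account."""
    return account is not None and account.lower() in {
        "localsystem", "nt authority\\system"}


# Example (run on the server itself):
# account = service_account()
# print(account, "-> Local System" if runs_as_local_system(account) else "-> named account")
```

Running this right after the installer finishes, and before opening Server Manager, would show early whether the service fell back to Local System.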
Question: Are there any known conflicts between MS Defender (our antivirus software) and the ArcGIS Server installation process?
A: ESRI recommends turning off anti-virus software or active scanning during installation. Anti-virus software significantly slows down the installation and tends to resist the file changes the installer makes.
Question: If we rule out the low disk space factor on both the server and the configuration store server, what may have caused a corrupted server.xml while upgrading?
A: Not sure. I suspect a permissions problem, because we did find a permission issue.
Question: To avoid the corrupted server.xml issue, is there anything extra I should check before the upgrade?
A: At a minimum, back up the system, plus key files such as the Tomcat conf folder, the certificates folder, and the config store.
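As a rough illustration of the file-level backup, here is a minimal Python sketch that copies a list of files or folders into a timestamped folder. The `backup_paths` helper is made up for this post. In the commented example, only the certificates path comes from this thread; the Tomcat conf path, the config store path, and the backup destination are assumptions for a default install and should be checked against your own environment.

```python
import shutil
import time
from pathlib import Path


def backup_paths(sources, backup_root):
    """Copy each existing file or folder into a new timestamped folder under backup_root.

    Returns (destination folder, list of names that were copied).
    Missing sources are skipped rather than treated as errors.
    """
    dest = Path(backup_root) / time.strftime("pre-upgrade-%Y%m%d-%H%M%S")
    dest.mkdir(parents=True, exist_ok=False)
    copied = []
    for src in map(Path, sources):
        if src.is_dir():
            shutil.copytree(src, dest / src.name)
        elif src.is_file():
            shutil.copy2(src, dest / src.name)
        else:
            continue  # source does not exist; nothing to copy
        copied.append(src.name)
    return dest, copied


# Example (paths are assumptions for a default install; adjust before use):
# backup_paths(
#     [
#         r"C:\Program Files\ArcGIS\Server\framework\etc\certificates",
#         r"C:\Program Files\ArcGIS\Server\framework\runtime\tomcat\conf",
#         r"C:\arcgisserver\config-store",
#     ],
#     r"D:\backups",
# )
```

Stopping the ArcGIS Server service before copying is probably wise, so that files are not changing mid-copy. This complements a full system backup and does not replace it.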
Question: If it happens again, what is the recommended course of action? If we use this workaround, what are the implications of copying the DEV keystore and certificates to PROD, given that the certificates are issued to different FQDNs?
A: The above workaround is still valid for 11.3.
Hello Hua - I'm not sure that an underscore is allowed in a domain service account name, and that may be why 'Log On As' in the Services window shows the service running under Local System.
- or the directory permissions for the domain account are such that the domain account could not fully access all of Server's directories
- or it did not have full write permissions to create any new directories that 11.3 may need.
If the domain account is set up properly, the ArcGIS Server windows process should be showing that it is running from the account: e.g. domain\arcgisserviceaccount, not Local System or Local Service.
Documentation also exists that recommends suspending any active virus scanning just prior to the upgrade, if possible. This cuts the upgrade and post-upgrade times down considerably, and if you remove the web adaptor prior to the upgrade, the chance of access is minimized.