How to reconnect ArcGIS Server 10.7.1 to existing configuration store following network error?

12-06-2019 12:53 PM
NicolasPfeffer-Taggart1
New Contributor II

While I await technical support's response to my case (it's been a week, pleeeeeease help!!!)... I'm faced with an ArcGIS Server site that seems to have lost all knowledge of, and connection to, its configuration store and server directories, or at least that is my suspicion. The story goes like this: I have a single-machine base deployment (Portal, federated hosting server, Data Store) on a Windows Server 2016 VM (Azure) that I recently upgraded from v10.6.1 to v10.7.1. The upgrade went... slowly... Portal was by far the most cumbersome, but after many attempts I made it out the other side and spent a good deal of time testing my site before opening it back up to my users.

Of course, once my users started logging in, I was quickly informed that layers weren't loading in maps. Investigating, I discovered it started with the layers backed by our workgroup enterprise geodatabase (SQL Express). Next it was the image services; we have a few that were copied to the server (i.e. not referenced). Through the first 5 minutes or so of errors the hosted feature layers seemed okay, but then the entire server site stopped responding to requests, both at the web adaptor (502 errors; my "web adaptor" is just an IIS URL Rewrite proxy) and directly at port 6443...

I logged in to my Windows VM and found no Windows error reports. I should point out that Portal was still functioning just fine at this point... So I restarted the ArcGIS Server service. After a minute or two, the Manager interface (via port 6443) came up giving me the option to create a new site or join an existing site!!!!! Like a fresh install...

I manually dug into the logs and found this at the start of the whole fiasco... seems to me like there was either a permissions issue or a network issue.

<Msg time="2019-11-30T18:52:02,70" type="SEVERE" code="6550" source="Admin" process="10364" thread="1" methodName="" machine="[machine name]" user="" elapsed="" requestID="">Failed to start the server machine '[machine name]'. Configuration store error. File system '\\[azure storage account].file.core.windows.net\[share name]\machines\[machine name].json' put failed. An unexpected network error occurred</Msg>

I navigated to the configuration store and found the [machine name].json file listed in the message. It was there, but it was empty.
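For anyone digging through their own logs the same way, here is a minimal Python sketch that pulls the SEVERE entries out of log text. It assumes the log is a sequence of `<Msg ...>...</Msg>` entries shaped like the one quoted above; the function name `severe_messages` is made up for illustration and is not part of any Esri tooling.

```python
import re

# Matches one <Msg ...>text</Msg> entry; attributes are captured as a raw string.
MSG_RE = re.compile(r"<Msg\s+([^>]*)>(.*?)</Msg>", re.DOTALL)
# Matches name="value" pairs inside the opening tag.
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


def severe_messages(log_text):
    """Return a list of (time, code, text) tuples for every SEVERE entry."""
    results = []
    for attrs_raw, body in MSG_RE.findall(log_text):
        attrs = dict(ATTR_RE.findall(attrs_raw))
        if attrs.get("type") == "SEVERE":
            results.append(
                (attrs.get("time", ""), attrs.get("code", ""), body.strip())
            )
    return results
```

Read a log file into a string, pass it to `severe_messages`, and print the tuples to get a quick timeline of when things started going wrong.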

Soooo, my question is... seeing as how the site had been functioning, I'm reaaaaaally hoping the configuration store isn't corrupt, and that all of this was brought on by some sort of network lapse... Is there any way to manually trick my ArcGIS Server site into picking up where it left off using the configuration store and directories as they are? I know I've done this once before at v10.5.1, but I don't recall exactly how. I think it involved creating a new site, stopping the ArcGIS Server Service, renaming the configuration store directory, then deleting some *.dat files in the old directory and starting the service again. I've seen similar posts by Jonathan Quinn describing this technique, but they referenced older versions so I don't want to proceed with muddling around... Obviously I'm hoping Esri tech support will get back to me asap, but if anyone has any ideas, I'm all ears! Thanks!

P.S. If anyone at Esri is looking for my case number, it is 02450878.


Accepted Solutions
NicolasPfeffer-Taggart1
New Contributor II

So I got an analyst at Tech Support to work through this issue (thanks Igb Akintade, you rock). What I'm sharing here is not an endorsement; if you find yourself in a similar situation, I definitely recommend contacting tech support to get to the bottom of your issues. In my case, it does seem that there had been some sort of network issue that prevented the server from reaching the configuration store, but after that issue was resolved I was still left with a site that couldn't initialize on the existing configuration store. The method for re-establishing the site went as follows:

  1. Stop the ArcGIS Server service.
  2. Manually make a copy of your ArcGIS Server configuration store directory and your server directories in a safe place (just in case).
  3. Browse to the current configuration store in File Explorer (in my case an Azure File Share), then search for and delete the following from within the configuration store:
    • all ".site" directories
    • all "*.rlock" files
    • all "*.wlock" files
  4. Ditto for the server directories (again, mine are on a separate Azure File Share).
  5. Start the ArcGIS Server service.
  6. Browse to the https://MACHINE.DOMAIN.COM:6443/arcgis/manager URL.
  7. Click the "Create New Site" button.
  8. Enter the PSA account details (I happened to re-use my original, even though it had long since been disabled; not sure if it matters, or if you can use a new username/password).
  9. Enter the path to the existing configuration store and directories.
  10. Let it sit and churn for a bit (I had to carefully repeat steps 2 and 3 because I had missed a few .rlock files).
  11. If all goes well, and you don't get any error messages, your site should come back up.
  12. Just a few more steps: go to https://MACHINE.DOMAIN.COM:6443/arcgis/admin, sign in, navigate to Machines > [Machine Name] > sslcertificates and verify your original certificates are listed.
  13. Back up a level to [machine name] and verify that "Web server SSL Certificate" is set to the alias of the certificate you wish to use. If it is not, click edit and provide the appropriate certificate alias. I think I did this too hastily; give the site some time to settle before changing this setting.
  14. Also check your web adaptor (in my case, I have a legacy v10.5.1 ArcGIS Cloud Builder URL rewrite scheme loaded into IIS doing all of the web adapting, which never ceases to confuse things as both my portal and server sit at the same /arcgis context... but oh well, it's somehow still working).
  15. Despite ArcGIS Server being restarted in the process of changing the web server SSL certificate, I still got 502 errors at my web adaptor URLs and had to manually restart the ArcGIS Server service once more to get things working. Again, don't work too quickly: stop and get some coffee, then come back and restart the service...
  16. It's worthwhile to validate your server's hosting data store after all this. Mine was working fine but kept throwing the message "WARNING Server machine 'https://MACHINE.DOMAIN.COM:6443/arcgis/admin/generateToken' returned an error. 'Failed to log in. Invalid username or password specified.'" until I re-ran configuredatastore.bat using the newly provided PSA account info... Then I disabled the PSA account and all is well.
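Since I missed a few .rlock files the first time around, a small script to list the leftovers may be easier than searching in File Explorer. The following is only a rough Python sketch of steps 3 and 4, not something tech support gave me: the function name `find_leftovers` is made up, and it deliberately only lists what it finds rather than deleting anything, so you can review the list against your backup first.

```python
import os

# File suffixes named in step 3 above.
LOCK_SUFFIXES = (".rlock", ".wlock")


def find_leftovers(root):
    """Walk root and return (site_dirs, lock_files) as sorted lists of paths."""
    site_dirs, lock_files = [], []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in list(dirnames):
            if name == ".site":
                site_dirs.append(os.path.join(dirpath, name))
                # Do not descend into a directory that will be removed whole.
                dirnames.remove(name)
        for name in filenames:
            if name.endswith(LOCK_SUFFIXES):
                lock_files.append(os.path.join(dirpath, name))
    return sorted(site_dirs), sorted(lock_files)
```

Run it once against the configuration store path and once against the server directories path (with the ArcGIS Server service stopped), then delete the listed items by hand once you are happy with the list.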


7 Replies
WillBooth1
New Contributor III

Hi Nicolas,

I have not had a lot to do with ArcGIS Server running on Azure, but I assume that the configuration store location will also be stored in this file:

C:\Program Files\ArcGIS\Server\framework\etc\config-store-connection.xml

Does this file still exist, and if you open it in Notepad, does it look valid (it should not be empty)?

If the file is missing, empty or invalid, try restoring this from your most recent backup and restart the ArcGIS Server service.

If that file is there with valid content, then it will more likely be an issue with ArcGIS Server not being able to access the configuration store.

Do you have a working site on Azure that is OK? If so take a look at this xml file and see what a working site should be configured like and where the configuration store is normally accessed from.
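If you would rather script that first check than eyeball the file in Notepad, something like the Python sketch below would do. It only tests that the file exists, is not empty, and parses as well-formed XML; it does not know what the contents should be, and the function name `check_connection_file` is just for illustration.

```python
import os
import xml.etree.ElementTree as ET


def check_connection_file(path):
    """Return 'missing', 'empty', 'invalid' or 'ok' for the given XML file."""
    if not os.path.isfile(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    try:
        ET.parse(path)  # raises ParseError if the XML is not well formed
    except ET.ParseError:
        return "invalid"
    return "ok"
```

Call it with the path to config-store-connection.xml shown above; anything other than 'ok' suggests restoring the file from backup.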

Cheers

Will

NicolasPfeffer-Taggart1
New Contributor II

Thanks Will,

Oddly enough, the config-store-connection.xml file was present and appeared valid. I've since compared it to what I had prior to the fix I implemented and they are identical. Nonetheless, I got things working and posted the details in the accepted solution. Thanks for looking into this one!

JTessier
Occasional Contributor II

Thank you so MUCH for recording this; it solved our problem as well, with a multi-machine AGS site (note that, as an additional step, we had to remove any references to the other machines in the clusters and machines folders). And Esri tech support was not available at this early morning hour, but these notes were! Would add several more kudos if I could!

AlessandroValra
Occasional Contributor III

Nicolas,

Have you recently installed a Windows update, .NET Framework 4.8 in particular?
I'm in the same situation, with ArcGIS 10.6 on Windows Server 2012 R2.

Unlike your case, my config-store-connection.xml file is missing, and once restored it disappears again.

Any suggestions are welcome.

WillBooth1
New Contributor III

Hi Alessandro,

The config-store-connection.xml file will be deleted if ArcGIS Server can no longer connect to the configuration store.

So double-check the connection parameters, that the service account still has the required access permissions to this resource, and that there is no network issue in the way.

Running the Configure ArcGIS Server Account utility is another thing to try on the server, depending on where the config store is.
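One quick way to test the permissions side is a write probe against the config store location. The Python sketch below is only an illustration (the function name `can_write` is made up), and the result only means something if you run it as the ArcGIS Server service account, since that is the account whose access matters.

```python
import os
import uuid


def can_write(directory):
    """Try to create and delete a small probe file; return (ok, detail)."""
    # A random name avoids clashing with anything already in the directory.
    probe = os.path.join(directory, "probe-" + uuid.uuid4().hex + ".tmp")
    try:
        with open(probe, "w") as fh:
            fh.write("probe")
        os.remove(probe)
    except OSError as exc:
        return False, str(exc)
    return True, "ok"
```

If it returns False, the detail string is the operating system's error text, which should help tell a permissions problem from a network one.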

Cheers

Will

AlessandroValra
Occasional Contributor III

Hi Will,

thanks for your response.

I first tried to recreate the config-store-connection.xml file, but at service restart it disappeared again.

Running Configure ArcGIS Server Account did not resolve the problem either.

I solved it using Solution B from this post: "Error: Failed to create the site. Failed to create the service 'System/CachingTools.GPServer'".

That is, I created a new site after backing up the original folders, then replaced the new folders with the backup folders, as described in items 11 and 12 of that list.

I couldn't figure out whether the problem originated from the 4.8 framework, but I did find that the web adaptor was no longer configured.

I hope this will help other users.

Thanks!

 
