Failed to get the configuration of the server machine...

05-22-2014 06:58 AM
RoyceSimpson
Regular Contributor
Hi All,
For the past couple weeks we've been getting some AGS 10.2.2 errors that cripple our map services until I reboot the servers.  Have a look at the attached log screenshot.

We have two AGS servers that share their config-store and other directories on a network appliance.  I've configured both servers to use UNC pathing as in:  \\netapp\gisserverdata\arcgisserver\config-store.  This has been working great with no issues for well over a year.  I upgraded from 10.2 to 10.2.2 in early May and for the past couple weeks, we've been sporadically getting this "can't connect to the config-store" error, which in turn totally borks our map services.  The only remedy I've found is to reboot the two AGS servers.  The problem is resolved and everything works great, until it happens again... usually within a day or two.

I've reinstalled ArcGIS Server on both servers to no avail.  I've also checked numerous times with our sys admins to see whether they are having connectivity issues with our network appliance, and they report none.

Thanks for helping out.
17 Replies
GISDev1
Occasional Contributor III
I know you said you checked with your admins, but the first thing I would still check is the network activity logs for the time when this message starts to appear. If it is really occurring on both ArcGIS Server boxes at the exact same time, then it is almost guaranteed to be some kind of network issue with connecting to that network appliance. Maybe they recently started testing IPv6, or the DNS server was updated, or something along those lines. I would try an nslookup of that server (arete.lc.gov) from each ArcGIS Server box when it says it can't connect to the config-store.

You could probably test this hypothesis by switching to a local config-store for 1 week on both servers and seeing whether the problem stops.
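
To catch the failure while it is happening, the nslookup idea could be left running as a small script on each ArcGIS Server box. This is only a rough sketch: the host name "netapp" and the UNC path are taken from the original post, the function names are made up for illustration, and you may want to swap in arete.lc.gov or whatever name your servers actually resolve.

```python
import os
import socket
import time
from datetime import datetime

# Assumptions: both values come from the UNC path quoted in this thread.
HOST = "netapp"
CONFIG_STORE = r"\\netapp\gisserverdata\arcgisserver\config-store"


def check_once(host, path, resolve=socket.gethostbyname, exists=os.path.isdir):
    """Run one probe and return (dns_ok, path_ok, detail)."""
    try:
        detail = resolve(host)  # resolved IPv4 address on success
        dns_ok = True
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        detail = str(exc)
        dns_ok = False
    try:
        path_ok = bool(exists(path))
    except OSError:
        path_ok = False
    return dns_ok, path_ok, detail


def monitor(rounds, interval_seconds=60):
    """Print one timestamped line per probe so it can be lined up with the AGS logs."""
    for _ in range(rounds):
        dns_ok, path_ok, detail = check_once(HOST, CONFIG_STORE)
        stamp = datetime.now().isoformat(timespec="seconds")
        print(f"{stamp} dns_ok={dns_ok} path_ok={path_ok} detail={detail}", flush=True)
        time.sleep(interval_seconds)
```

Calling something like `monitor(rounds=1440)` under the ArcGIS Server service account would cover a day at one probe per minute. Running it under that account matters, since a permissions problem would only show up for the account that actually reads the share.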
RoyceSimpson
Regular Contributor
I think they have checked the network activity logs, but I'll confirm that.  Also, the issue only seems to happen to one server or the other.  As you can see in that log screenshot, the affected server is "Arete".  The errors for that particular instance started at 8:00 AM, and not until 8:14 does the other server, "Col", start throwing errors.

I've thought about doing the "local config-store" thing but am playing the waiting game for now.  If I do choose to put the config store locally, what is the procedure for that?  Just point to the new location in Manager and make sure the other server can see that location?  Or do I need to do something more drastic like copy the config-store folder from the netapp to the local space first, then point to that in Manager?
GISDev1
Occasional Contributor III
I'd also check the Windows Server Event Logs for the time of the communication errors. It might be some kind of permissions issue.

For changing the config-store location, I should have been clearer: I was referring to having a local copy of the whole directory on each server. In that case I would just copy the whole config-store directory from the network location to a local drive on each server.
RoyceSimpson
Regular Contributor
I've checked the Windows Server event logs as well, with nothing unusual to report.

Regarding the config-store thing... wouldn't what you mentioned above break the site?  Don't both servers need to be able to interact with a shared config-store location?  If I copy that folder to each server locally and each server only sees the copy on their respective machines... wouldn't that break how "the site" works in terms of the two machines managing the shared services?
GISDev1
Occasional Contributor III
I've never actually tried the local thing myself; it was just an idea to check for network issues. I was thinking you wouldn't change any configurations or publish/change any services during the testing, in which case the config-store data would all stay the same, I think? If so, would the 2 ArcGIS Servers still need to be able to modify the config-store directory? It could certainly break the site for a few minutes when you first test it, but I would think you would know immediately.
You are correct, though, that this kind of thing is not supported or recommended according to this documentation:
http://resources.arcgis.com/en/help/main/10.1/index.html#/Expanding_from_one_GIS_server_to_multiple_...
RoyceSimpson
Regular Contributor
Along that line of reasoning... if the config-store is not being actively modified by map service changes or what have you, then the servers' communication with it shouldn't be an issue at a time like the other morning at 8 AM, when nothing "change oriented" was going on.  However, I think that ArcGIS Server actively communicates with the config-store on a regular basis, and one of the consequences of the servers not being able to communicate normally is this:  On every occasion that this has occurred, if I try to go to our map service REST endpoints such as "http://maps.larimer.org/arcgis/rest/services", I'm presented with an ArcGIS Server login screen.  Nothing I put in there for login/password works.  So, for some reason, when communications with the config-store go down, the REST services don't know who is or isn't allowed to see them and respond by putting up a login screen.
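
Since the login screen seems to be the first visible symptom, a scheduled probe of the REST endpoint could record the moment it appears. The following is a minimal sketch, not a tested monitor. The URL is the one quoted above; the `classify` and `probe` names are hypothetical, and I'm assuming that a request with `f=json` comes back as a JSON object with an "error" member when the site is demanding a login, which should be verified against your version.

```python
import json
import urllib.error
import urllib.request

REST_URL = "http://maps.larimer.org/arcgis/rest/services"  # endpoint quoted in this thread


def classify(body_text):
    """Classify the body of a response that was requested with f=json."""
    try:
        payload = json.loads(body_text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return "not-json"  # e.g. an HTML login page came back instead
    if isinstance(payload, dict) and "error" in payload:
        return "error"  # assumption: secured or broken sites answer with an error object
    return "ok"


def probe(url=REST_URL, timeout=10):
    """Fetch the services directory as JSON and classify what came back."""
    try:
        with urllib.request.urlopen(url + "?f=json", timeout=timeout) as response:
            return classify(response.read().decode("utf-8", errors="replace"))
    except (urllib.error.URLError, OSError) as exc:
        return f"unreachable: {exc}"
```

Logging the result of `probe()` with a timestamp every minute or so would show whether the login prompt starts before, after, or together with the config-store errors in the server log.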

In any event, I do think that for testing purposes, if need be, I could copy the config-store folder to the c drive of one of the two servers, then repoint the config folder setting in Manager to that, making sure the other server can see the first server's C drive.  If all goes well with that setup, then I can assume there is some issue with communicating with the netapp.  If the same thing happens, then I can assume the issue is with ArcGIS Server itself.
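
The copy step itself could be scripted so that it can be verified before anything is repointed in Manager. This is just a sketch under assumptions: the function names are made up, and the destination in the usage note is a hypothetical local folder, not a path from our setup.

```python
import filecmp
import shutil
from pathlib import Path


def copy_config_store(source, destination):
    """Copy the whole tree; refuse to overwrite an existing destination."""
    source, destination = Path(source), Path(destination)
    if destination.exists():
        raise FileExistsError(f"{destination} already exists")
    shutil.copytree(source, destination)
    return destination


def trees_match(left, right):
    """Recursively compare names and file contents of two directory trees."""
    comparison = filecmp.dircmp(left, right)
    if comparison.left_only or comparison.right_only or comparison.funny_files:
        return False
    # shallow=False compares file bytes instead of trusting size and mtime.
    _, mismatch, errors = filecmp.cmpfiles(
        left, right, comparison.common_files, shallow=False
    )
    if mismatch or errors:
        return False
    return all(
        trees_match(Path(left) / name, Path(right) / name)
        for name in comparison.common_dirs
    )
```

For example, `copy_config_store(r"\\netapp\gisserverdata\arcgisserver\config-store", r"C:\arcgisserver_local\config-store")` followed by `trees_match(...)` on the same two paths. This only stages and checks the files; repointing the config-store in Manager and sharing the folder with the other server are still separate steps.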
GISDev1
Occasional Contributor III
That's a good idea. Putting it on one server's local drive and pointing the other server at that first server's drive will definitely narrow it down considerably. If only the 2nd server goes down, then you'll know it's a network issue, but if both the 1st and 2nd still have issues, you'll know it is some kind of problem local to both machines. And if it works well, then you can narrow it down to the netapp connection like you mentioned.
MichaelVolz
Esteemed Contributor
Do you have a development environment that can mimic the setup of your production environment with multiple machines?  This is what my organization did so we could try to flush out multi-server environment issues in the development environment so it would not impact production users.  I would suggest getting EDN licenses for your development environment to save cost.
RoyceSimpson
Regular Contributor
Good point.  Yes, we have a dev/test environment set up with "identical" servers and AGS configurations, except that their config-store is \\netapp\gisserverdata\arcgisserver_test\.  We have seen no issues with that "test" AGS site.