ArcGIS Portal : Federated server - "failed to validate the server managed database"

10-31-2018 04:50 AM
JustinClowes
New Contributor II

Suddenly we are having trouble with our Relational DataStore on ArcGIS Enterprise 10.5.

  • If you validate the Hosting server in Portal you receive "failed to validate the server managed database".
  • If you validate the Relational DataStore in ArcGIS Server it fails.
  • If you go to validate the DataStore in arcgis/admin then it all seems to be valid and ready for requests. 
  • If we restart the servers (ArcGIS and Portal) then the issue is temporarily resolved - the DataStore and Hosting server are valid. You can successfully publish hosted layers. After a while it goes wrong again.

Looking in the ArcGIS Server Manager logs, we can see messages such as:

Service containing process crashed for 'MyService.MapServer'. Please see if an error report was generated in 'C:\arcgisserver\logs\SERVERNAME\errorreports'. To send an error report to Esri, compose an e-mail to ArcGISErrorReport@esri.com and attach the error report file.

Any ideas? Could it be a hosted layer in the DataStore which is causing the problem?

Any help would be great.

9 Replies
JakeSkinner
Esri Esteemed Contributor

Hey Justin,

I ran into this issue after updating my licensing for ArcGIS Server.  Running the ArcGIS Data Store updatelicense command resolved this issue:

ArcGIS Data Store command utility reference—ArcGIS Data Store (Windows) Installation Guide | ArcGIS ... 

Steps:

1.  Open command prompt as an Administrator on the ArcGIS Data Store server

2.  Change the working directory to <ArcGIS Data Store installation directory>\datastore\tools.  Ex:

cd C:\Program Files\ArcGIS\datastore\tools

3.  Type updatelicense and press Enter to run the command
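
If these steps have to be repeated on several machines, they can be scripted. Below is a minimal Python sketch, assuming the directory layout from step 2. The helper names `tools_dir` and `run_updatelicense` are hypothetical (they are not part of ArcGIS Data Store), and the script still needs to be run from an elevated (Administrator) prompt on the ArcGIS Data Store machine.

```python
import subprocess
from pathlib import PureWindowsPath


def tools_dir(install_dir):
    """Return <install_dir>\\datastore\\tools, the folder named in step 2."""
    return str(PureWindowsPath(install_dir) / "datastore" / "tools")


def run_updatelicense(install_dir):
    """Run updatelicense from the tools folder and return its exit code.

    shell=True lets cmd.exe resolve the script in the working directory,
    the same way it does when the command is typed by hand.
    Windows only; call this from an elevated prompt.
    """
    result = subprocess.run("updatelicense", cwd=tools_dir(install_dir), shell=True)
    return result.returncode
```

For the example path above, `run_updatelicense(r"C:\Program Files\ArcGIS")` would run the command in `C:\Program Files\ArcGIS\datastore\tools`.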

JustinClowes
New Contributor II

Thanks Jake,

My licence has not changed, but I ran updatelicense anyway and got the message "Update license failed for managed geodatabase 'db_71scs'".

I can't understand how the DataStore appears valid in the admin page (see below) but cannot be validated in ArcGIS Server Manager or Portal. Any ideas?

  {
  "datastore.release": "10.5.0.6491",
  "datastore.name": "ds_ztnjb5og",
  "datastore.replmethod": "ASYNC",
  "datastore.isConfigured": "true",
  "machines": [{
    "machine.overallhealth": "Healthy",
    "datastore.release": "10.5.0.6491",
    "datastore.release.configstore": "1.2",
    "platform": "Windows",
    "machine.isReachable": "true",
    "hostip": "10.144.8.184",
    "name": "GBMNC0-APP025.EUROPE.JACOBS.COM",
    "role": "PRIMARY",
    "dbport": 9876,
    "healthcheck.enable": "true",
    "status": "Started",
    "adminurl": "https://MYSERVER:2443/arcgis/datastoreadmin/",
    "db.isactive": "true",
    "db.isAccepting": "true",
    "db.isInRecovery": "false",
    "db.ActiveReplMethod": "NONE",
    "db.isManagedUserConnValid": "true",
    "datastore.release.pg": "9.3.12",
    "datastore.release.sde": "105000",
    "datastore.release.geometry": "1.12.1",
    "datastore.release.geometrylib": "1.12.1",
    "db.isSiteConnValid": "true"
  }],
  "datastore.release.configstore": "1.2",
  "datastore.release.geometry": "1.12.1",
  "datastore.release.geometrylib": "1.12.1",
  "datastore.release.sde": "105000",
  "datastore.release.pg": "9.3.12",
  "datastore.status": "Started",
  "datastore.isActiveHA": "false",
  "datastore.overallhealth": "Healthy",
  "datastore.lastfailover": -1,
  "datastore.lastbackup": 1540981652086,
  "datastore.isRegistered": "true",
  "datastore.hasValidServerConnection": "true",
  "datastore.validServerMachinesList": [{
    "machineName": "MYSERVER",
    "adminURL": "https://MYSERVER:6443/arcgis/admin"
  }],
  "owningSystemUrl": "https://mydomain/arcgis"
}     
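
A response like the one above is easier to scan if the health-related flags are checked programmatically. Below is a minimal Python sketch, assuming the response has been saved as JSON. The function name `unexpected_flags` and the choice of which keys to check are my own assumptions, based only on the keys visible in this response, not on any documented list.

```python
import json

# Keys and "healthy" values are taken from the response posted above;
# treating exactly these as the ones that matter is an assumption.
EXPECTED_TOP = {
    "datastore.status": "Started",
    "datastore.overallhealth": "Healthy",
    "datastore.isConfigured": "true",
    "datastore.isRegistered": "true",
    "datastore.hasValidServerConnection": "true",
}
EXPECTED_MACHINE = {
    "machine.overallhealth": "Healthy",
    "machine.isReachable": "true",
    "status": "Started",
    "db.isactive": "true",
    "db.isAccepting": "true",
    "db.isManagedUserConnValid": "true",
    "db.isSiteConnValid": "true",
}


def unexpected_flags(report):
    """Return a list of messages for every flag that differs from the expected value."""
    problems = []
    for key, want in EXPECTED_TOP.items():
        got = report.get(key)
        if got != want:
            problems.append(f"{key}: expected {want!r}, got {got!r}")
    for machine in report.get("machines", []):
        name = machine.get("name", "<unnamed>")
        for key, want in EXPECTED_MACHINE.items():
            got = machine.get(key)
            if got != want:
                problems.append(f"{name} {key}: expected {want!r}, got {got!r}")
    return problems


# Trimmed copy of the response above, for illustration only.
sample = json.loads("""
{
  "datastore.isConfigured": "true",
  "machines": [{
    "machine.overallhealth": "Healthy",
    "machine.isReachable": "true",
    "name": "GBMNC0-APP025.EUROPE.JACOBS.COM",
    "status": "Started",
    "db.isactive": "true",
    "db.isAccepting": "true",
    "db.isManagedUserConnValid": "true",
    "db.isSiteConnValid": "true"
  }],
  "datastore.status": "Started",
  "datastore.overallhealth": "Healthy",
  "datastore.isRegistered": "true",
  "datastore.hasValidServerConnection": "true"
}
""")

print(unexpected_flags(sample))
```

An empty list only confirms what the admin page already reports: the Data Store considers itself healthy. It says nothing about whether validation from ArcGIS Server Manager or Portal will succeed.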
JonathanQuinn
Esri Notable Contributor

There are two ways to validate the ArcGIS Data Store. In the first, Server validates it (validating the Data Store through Manager or the Admin API). In the second, the Data Store validates itself, which is what you did to get the response above. I'm not exactly sure what the Data Store checks when it validates itself, but the "trust" between Server and Data Store seems to be broken somehow. It's strange that it works for a while before failing. Do the Server logs give any more detail about why it failed?

AdamCottrell
Occasional Contributor II

Thanks Jake. This worked for me. I don't know why, but it did; our licenses should have been good.

GangWang
New Contributor III

Thanks! The solution worked for me.

CodyMarsh
Occasional Contributor

Hello Justin,

Are your ArcGIS Server and ArcGIS Data Store on the same machine?

JustinClowes
New Contributor II

Hi Cody, overnight the datastore seems to have sorted itself out and now appears valid in ArcGIS Server Manager. I'm still testing to see if everything is working but I'd like to establish what might have happened. ArcGIS Server (also the Hosting Server) and the Relational DataStore are both on the same machine. Is this causing problems? Would it be a resource issue?

thanks

CodyMarsh
Occasional Contributor

Hello Justin,

I am glad to hear that it is now working. Most likely this was either a network issue (since it just started working again) or a resource issue, as you suggested. Having ArcGIS Server and Data Store on the same machine is normally not a problem, but if they are ever competing for resources, that would explain the behavior we are seeing.

It is not uncommon to see behavior like this if the machine was being pushed updates, or if maintenance was being done on the server without your knowledge. I would monitor the machine for a few more days to see if the issue comes back, or speak with your system admins to find out whether anything was pushed out during the downtime.

BillBott
Occasional Contributor

Hi Justin, 

Might be entirely unrelated but thought I'd share the unusual behavior I've seen which is strikingly similar to yours. 

I was running a full stack on a single instance (Portal/WA/Server/DS/DB). All would be good one second, and the next everything would be in tatters. Server/Portal would turn unresponsive or start throwing cert errors - everything just tanked. Signing in with Pro would sometimes get stuck in an endless loop with Portal, or Pro would simply crash.

I narrowed down part of the cause to the "Internet Connection Sharing (ICS)" Windows service. ICS was changing the FQDN suffix of the machine from something like "Machine.MYDOMAIN.COM" to "Machine.MSHOME.NET". The enterprise components would pick it up and reconfigure - Data Store squirrels the bogus FQDN away in one of its Postgres databases to revisit at another inopportune time, oh yeah - and it backs it up!

Googling "turn off ICS" will lead you to an array of sites that direct you to shut off network sharing on each network connection, but that's no good. Others will suggest stopping and setting the ICS service to "Disabled". Nope - it comes back alive.  

The real root cause is the Windows Update service, which conveniently enables a disabled ICS service, then starts it, causing a cascading series of errors - none of which play out exactly the same way twice, just to keep you guessing. I finally disabled the blasted thing (Windows Update) using the Group Policy Editor. Things started stabilizing once I rebooted, reinstalled the Web Adaptors, and made sure no trace of the rogue machine FQDN remained.

The moral of the story is ... if it seems to happen at "freak" times, check whether you coincidentally have some new Windows updates ready to install. You might.
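
One cheap way to catch the FQDN change described above is to compare the machine's current FQDN with the suffix you expect. Below is a minimal Python sketch. `has_expected_suffix` and `check_local_fqdn` are hypothetical helper names, and "mydomain.com" stands in for whatever your real domain suffix is.

```python
import socket


def has_expected_suffix(fqdn, expected_suffix):
    """Return True if fqdn ends with the expected domain suffix (case-insensitive)."""
    fqdn = fqdn.rstrip(".").lower()
    suffix = expected_suffix.strip(".").lower()
    return fqdn == suffix or fqdn.endswith("." + suffix)


def check_local_fqdn(expected_suffix):
    """Look up this machine's FQDN and report whether the suffix still matches."""
    current = socket.getfqdn()
    ok = has_expected_suffix(current, expected_suffix)
    print(f"{current}: {'OK' if ok else 'unexpected suffix, expected ' + expected_suffix}")
    return ok
```

Running `check_local_fqdn("mydomain.com")` from a scheduled task would flag the machine as soon as its name changes to something like "Machine.MSHOME.NET". It only detects the symptom; it does not stop ICS or Windows Update from causing it.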

Good luck!
