WebGISDR failing for Full recovery

07-05-2018 09:51 AM
SzymonPiskula1
New Contributor III

Hello,

I have a fairly complex 10.6 setup on multiple Windows 2016 machines, with all elements HA/duplicated:

-Portal

-Hosting AGS Site

--Separate Relational DS cluster

--Separate TileCache DS cluster

-Federated AGS Site

Everything works fine on it: content gets published, users can work with it, hosted layers get created, and map services work.

Yesterday I successfully created a full webgisdr backup (including tile caches). There were no errors during backup creation and everything looked good. Today I wanted to test the webgisdr tool and verify my DR procedure. Sadly, when I call the webgisdr tool with the --import option, the recovery fails with this message:

Failed to validate the ArcGIS Data Store for the web GIS. The ArcGIS Data Store in the web GIS backup does not match the ArcGIS Data Store in the current web GIS.

Nothing has changed in terms of machines in any of the sites/nodes. All machine names are the same and the load balancer URLs are the same. Yet the message seems to indicate a mismatch?

Here is the console output when I run the tool:

==================================================
Starting the WebGIS DR utility.
==================================================

The configuration and base backup time in the current Web GIS
-------------------------------------------------------------
Portal: https://portal.<site>/portal
|
|-- Federated Server: https://mapping.<site>/server
|
|-- Hosting Server: https://hosted.<site>/server
| |
| |-- Relational Data Store: https://ec2amaz-<relational-id>.mydomain.local:2443/arcgis
| |
| |-- TileCache Data Store: https://ec2amaz-<tilecache-id>.mydomain.local:2443/arcgis

Unzipping the backup file:
\\<backups>\July-4-2018-11-29-39-AM-EDT-FULL.webgissite

The backup file has been unzipped in 00hr:12min:15sec.

The backup file was created at July 4, 2018 11:29:39 AM EDT.

The configuration and base backup time in the incoming Web GIS
--------------------------------------------------------------
Portal: https://portal.<site>/portal at 7/4/18 11:24 AM
|
|-- Federated Server: https://mapping.<site>/server at 7/4/18 11:24 AM
|
|-- Hosting Server: https://hosted.<site>/server at 7/4/18 11:24 AM
| |
| |-- TileCache Data Store: https://ec2amaz-<tilecache-id>.mydomain.local:2443/arcgis
| |
| |-- Relational Data Store: https://ec2amaz-<relational-id>.mydomain.local:2443/arcgis


Starting the restore process with the WebGIS DR utility.

Failed to validate the ArcGIS Data Store for the web GIS. The ArcGIS Data Store in the web GIS backup does not match the ArcGIS Data Store in the current web GIS.

Exiting the WebGIS DR utility.
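
One way to double-check the two trees above without relying on eyeballing them is to compare the data store entries while ignoring the order in which they are listed. The following is only a minimal sketch: the `data_stores` helper and the regular expression are hypothetical and based purely on the line format shown in this console output, not on how webgisdr performs its own validation. The URLs are the redacted placeholders from this post.

```python
import re

# Matches lines such as "| |-- Relational Data Store: https://host:2443/arcgis"
# (pattern is an assumption derived from the console output format above).
DS_LINE = re.compile(r"--\s*(\w+) Data Store:\s*(\S+)")

def data_stores(tree_text):
    """Return {data store type: set of URLs} found in a webgisdr tree listing."""
    found = {}
    for line in tree_text.splitlines():
        match = DS_LINE.search(line)
        if match:
            found.setdefault(match.group(1), set()).add(match.group(2))
    return found

# Data store lines as listed for the current Web GIS (relational first).
current = """
| |-- Relational Data Store: https://ec2amaz-<relational-id>.mydomain.local:2443/arcgis
| |
| |-- TileCache Data Store: https://ec2amaz-<tilecache-id>.mydomain.local:2443/arcgis
"""

# Data store lines as listed for the incoming Web GIS (tile cache first).
incoming = """
| |-- TileCache Data Store: https://ec2amaz-<tilecache-id>.mydomain.local:2443/arcgis
| |
| |-- Relational Data Store: https://ec2amaz-<relational-id>.mydomain.local:2443/arcgis
"""

same = data_stores(current) == data_stores(incoming)
print("Data stores match (order ignored):", same)
```

With the real, unredacted output pasted in, this confirms whether the same machines appear under the same data store type in both listings, regardless of listing order.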

Here is the 'crucial' section of the additional log file in DEBUG mode:

2018-07-05 08:36:55 DEBUG [main] org.apache.http.impl.execchain.MainClientExec - Connection can be kept alive indefinitely
2018-07-05 08:36:55 DEBUG [main] org.apache.http.impl.conn.PoolingHttpClientConnectionManager - Connection [id: 144][route: {s}->https://hosted.<site>:443] can be kept alive indefinitely
2018-07-05 08:36:55 DEBUG [main] org.apache.http.impl.conn.PoolingHttpClientConnectionManager - Connection released: [id: 144][route: {s}->https://hosted.<site>:443][total kept alive: 1; route allocated: 1 of 2; total allocated: 1 of 20]
2018-07-05 08:36:55 DEBUG [main] org.apache.http.impl.conn.PoolingHttpClientConnectionManager - Connection manager is shutting down
2018-07-05 08:36:55 DEBUG [main] org.apache.http.impl.conn.DefaultManagedHttpClientConnection - http-outgoing-144: Close connection
2018-07-05 08:36:55 DEBUG [main] org.apache.http.impl.conn.DefaultManagedHttpClientConnection - http-outgoing-144: Close connection
2018-07-05 08:36:55 DEBUG [main] org.apache.http.impl.conn.PoolingHttpClientConnectionManager - Connection manager shut down
2018-07-05 08:36:55 DEBUG [main] com.esri.arcgis.webgis.util.WebGISUtil - {"incrementalBackupTimeStamp":1530770427265,"backupMode":"FULL","incrementalRestoreTimeStamp":0,"fullBackupTimeStamp":1530789177251,"fullRestoreTimeStamp":0}
2018-07-05 08:36:55 INFO [main] com.esri.arcgis.webgis.storageservice.file.FileStorageService - Unzipping the backup file:
\\<backups>\July-4-2018-11-29-39-AM-EDT-FULL.webgissite
2018-07-05 08:49:10 INFO [main] com.esri.arcgis.webgis.util.WebGISUtil - The backup file has been unzipped in 00hr:12min:15sec.
2018-07-05 08:49:10 INFO [main] com.esri.arcgis.webgis.service.impl.WebGISDRFrontController - The backup file was created at July 4, 2018 11:29:39 AM EDT.
2018-07-05 08:49:10 DEBUG [main] com.esri.arcgis.webgis.service.impl.WebGISDRFrontController - Failed to validate the current Web GIS.
com.esri.arcgis.webgis.WebGISException: Failed to validate the ArcGIS Data Store for the web GIS. The ArcGIS Data Store in the web GIS backup does not match the ArcGIS Data Store in the current web GIS.
at com.esri.arcgis.webgis.service.impl.WebGISDRFrontController.a(WebGISDRFrontController.java:818)
at com.esri.arcgis.webgis.service.impl.WebGISDRFrontController.a(WebGISDRFrontController.java:221)
at com.esri.arcgis.webgis.service.impl.WebGISDRFrontController.service(WebGISDRFrontController.java:103)
at com.esri.arcgis.webgis.service.impl.WebGISDRManager.c(WebGISDRManager.java:142)
at com.esri.arcgis.webgis.service.impl.WebGISDRManager.importWebGIS(WebGISDRManager.java:125)
at com.esri.arcgis.webgis.client.WebGISDR.main(WebGISDR.java:103)
2018-07-05 08:49:10 DEBUG [main] com.esri.arcgis.webgis.service.impl.WebGISDRFrontController - Deleting the temp directory \\<share>\tempbackups\WebGISSite1530794203646.
2018-07-05 08:49:12 DEBUG [main] com.esri.arcgis.webgis.client.WebGISDR - Exiting the WebGIS DR utility.
com.esri.arcgis.webgis.WebGISException: Failed to validate the ArcGIS Data Store for the web GIS. The ArcGIS Data Store in the web GIS backup does not match the ArcGIS Data Store in the current web GIS.
at com.esri.arcgis.webgis.service.impl.WebGISDRFrontController.a(WebGISDRFrontController.java:224)
at com.esri.arcgis.webgis.service.impl.WebGISDRFrontController.service(WebGISDRFrontController.java:103)
at com.esri.arcgis.webgis.service.impl.WebGISDRManager.c(WebGISDRManager.java:142)
at com.esri.arcgis.webgis.service.impl.WebGISDRManager.importWebGIS(WebGISDRManager.java:125)
at com.esri.arcgis.webgis.client.WebGISDR.main(WebGISDR.java:103)
Caused by: com.esri.arcgis.webgis.WebGISException: Failed to validate the ArcGIS Data Store for the web GIS. The ArcGIS Data Store in the web GIS backup does not match the ArcGIS Data Store in the current web GIS.
at com.esri.arcgis.webgis.service.impl.WebGISDRFrontController.a(WebGISDRFrontController.java:818)
at com.esri.arcgis.webgis.service.impl.WebGISDRFrontController.a(WebGISDRFrontController.java:221)
... 4 common frames omitted
2018-07-05 08:49:12 ERROR [main] com.esri.arcgis.webgis.client.WebGISDR - Failed to validate the ArcGIS Data Store for the web GIS. The ArcGIS Data Store in the web GIS backup does not match the ArcGIS Data Store in the current web GIS.
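
For anyone reading the DEBUG log, the JSON object logged by WebGISUtil appears to hold timestamps in epoch milliseconds. That interpretation is an assumption based on the size of the values, not something the log states. Under that assumption, a small sketch like this converts them to readable UTC times (the `to_utc` helper is hypothetical):

```python
import json
from datetime import datetime, timezone

# JSON payload copied from the WebGISUtil DEBUG line in the log above.
payload = (
    '{"incrementalBackupTimeStamp":1530770427265,"backupMode":"FULL",'
    '"incrementalRestoreTimeStamp":0,"fullBackupTimeStamp":1530789177251,'
    '"fullRestoreTimeStamp":0}'
)

def to_utc(ms):
    """Convert epoch milliseconds to an aware UTC datetime (None when 0)."""
    if not ms:
        return None
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)

state = json.loads(payload)

# Decode only the fields whose names end in "TimeStamp"; backupMode is a string.
decoded = {k: to_utc(v) for k, v in state.items() if k.endswith("TimeStamp")}

for name, when in decoded.items():
    print(name, when.isoformat() if when else "not set")
```

A value of 0 is treated here as "not set", which is also an assumption.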

Regards,

Szymon

24 Replies
JonathanQuinn
Esri Frequent Contributor

When you say you have separate relational and tile cache clusters, are they on separate machines? It doesn't appear so from your output. Let's say you have ds1 and ds2 in your primary site: does each machine have both the relational and tile cache data stores registered? Or do you have 4 Data Store machines, two for the relational and two for the tile cache? I assume this is duplicated in your DR environment? Which one is listed as primary for relational and which one is listed as primary for tile cache? Do the roles match between the environments? For example, DS1 has both relational and tile cache and is the primary, DS2 has both relational and tile cache and is the standby, and the same applies in DR.

SzymonPiskula1
New Contributor III

Hi Jonathan,

I took the DR backup of my Test environment and wanted to play it back onto the same Test environment. That's when I hit the problem. Because the backup was taken and played back on the same environment and the same machines, I can't tell where the mismatch would come from. I was not testing a playback to my DR environment; it all took place on the same environment with exactly the same machines.

There are 4 DS machines in total: 2x Relational (Primary + Standby) and 2x TileCache (Primary + Standby). Each machine has only _one_ type of data store. So on a machine with a Relational store there is no TileCache, and the other way around too.

This output, quoted from above:

| |-- TileCache Data Store: https://ec2amaz-<tilecache-id>.mydomain.local:2443/arcgis
| | 
| |-- Relational Data Store: https://ec2amaz-<relational-id>.mydomain.local:2443/arcgis

illustrates the distribution of the DS machines. These are the machine names of the primary nodes. These are all separate machines.

 

Regards

JonathanQuinn
Esri Frequent Contributor

I can repro this on my end. Looks like there's some invalid logic somewhere when we are checking the Data Stores in the backup and target site. Have you contacted Support about this? I'd have them log a bug.

A workaround could be to move the tile cache Data Stores to the same machine as the relational Data Stores.

SzymonPiskula1
New Contributor III

Thanks, this has been raised with Support. They are aware of this thread too.

JonathanQuinn
Esri Frequent Contributor

This is already logged:

BUG-000112342 The webgisdr incremental restore fails when Geo Analytics Server is federated and registered with Portal as the Geo Analytics Server.

The synopsis isn't exactly what you have configured, but through that bug the DS validation has been updated, which fixes your issue as well.

QaiserHassan_Mohammad
New Contributor II

Hi Jonathan,

Is there any solution available for this error? We have a simple HA setup (2 Portal, 2 Server federated + hosted, 2 DS, 2 WA) which is working perfectly. We don't have a GeoAnalytics Server and we are not backing up incrementally.

The webgisdr utility backed up the site (full); however, recovery is failing without any error.

Kindly advise.

Regards,

Qaiser

JonathanQuinn
Esri Frequent Contributor

What do you mean it fails without an error? Can you provide a screenshot of the output of the tool?

Harald_ØysteinLund1
Esri Contributor

I believe the bug isn't the same as what we are experiencing here with an HA deployment. We have exactly the same problem as Szymon describes.

Harald_ØysteinLund1
Esri Contributor

Hi,

We have the exact same problem. Our setup is an ArcGIS Enterprise 10.6 High Availability deployment (2 Portal, 2 ArcGIS Host, 2 ArcGIS Server, 2 GeoEvent, 2 Data Store, and SQL Server as the RDBMS for the ArcGIS geodatabase). Everything is set up with a load balancer for the federated servers. Configurations and other common files are stored on a highly available NAS.

It is really critical for us that this works, so my question is: would snapshots of the servers be a good workaround instead of using WebGISDR, since all configurations are on a highly available NAS?
