Thanks for your reply @ChristopherPawlyszyn!
Well, that's not what I am experiencing so far: the tileCache datastore does not behave the same as the relational one regarding failover. Even if the "failover_on_primary_stop" property is set, only the relational datastore fails over; the tileCache does not. It's not very surprising, because in the file "datastore.properties" the property "failover_on_primary_stop" is in the section "Settings for relational data store".
Here is a summary of what I am consistently observing (tested at least two or three times):
- On a brand-new deployment, I can promote either the standby "tileCache" or the standby "relational" datastore to primary. In that case, shutting down the VM hosting the standby datastores is not an issue: ArcGIS Server can still connect to them on the primary, and datastore validation from ArcGIS Server works.
=> Perfect when the intervention is planned
- But if the primary "tileCache" datastore is taken offline, the standby is not promoted, and validating the datastore from ArcGIS Server Manager fails. It does work with the relational datastore.
And for the tileCache, at that point nothing more can be done:
=> Trying to validate it returns the following error:
```
Could not connect to the ArcGIS component at URL 'https://PORTAL01.COMPANY.COM:2443/arcgis/datastoreadmin/machines/PORTAL02.COMPANY.COM/validate'. The ArcGIS component on that machine may not be running or the machine may not be reachable at this time.Error: connect timed out
```
Trying to promote it to primary fails with the same error, because it also tries to communicate with the primary.
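Since the error is a connect timeout against port 2443, a plain TCP check from each machine can at least confirm which node is unreachable. Here is a minimal sketch in Python, under a few assumptions: the function names `is_reachable` and `check_nodes` are made up for illustration, port 2443 is simply taken from the URL in the error message, and an open port only shows that something is listening, not that the datastore itself is healthy.

```python
import socket


def is_reachable(host: str, port: int = 2443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and DNS resolution failures.
        return False


def check_nodes(hosts, port: int = 2443, timeout: float = 5.0) -> dict:
    """Map each host name to its reachability on the given port."""
    return {host: is_reachable(host, port, timeout) for host in hosts}


# Example call (host names taken from the error message above):
# check_nodes(["PORTAL01.COMPANY.COM", "PORTAL02.COMPANY.COM"])
```

Running this from the surviving node should show the offline primary as unreachable, which matches the "connect timed out" in the validation error.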
So what do we do if there is a serious issue with the node that was hosting the primary "tileCache" datastore at the time, and it cannot be brought back online?
Regarding the configuration, it is a basic HA setup composed of 2 nodes:
Portal01: Windows Server 2019 + Portal for ArcGIS 10.8.1 + ArcGIS Server 10.8.1 + Datastore 10.8.1 (relational + tileCache)
Portal02: same as Portal01
Thanks!