I have recently installed and configured ArcGIS Enterprise 10.6 with the intention of using it to replace the current Flex for ArcGIS applications (I know, I inherited it). At the same time we are performing a database audit to cut down the number of out-of-date or redundant datasets held in the Enterprise geodatabase.
As a local authority we store a lot of third-party data from the Environment Agency, Thames Water and other local authorities. This data is for viewing or analysis and is not updated by us. We also have a number of datasets which were generated here and are updated on a regular or semi-regular basis.
As part of this revamp of our GIS infrastructure and data, I am considering loading all the third-party data into Data Store and keeping our own layers in the eGDB. Is this a valid option? What are the benefits and drawbacks of Data Store vs eGDB? The main benefit I can see is that it would make it easier to organise and update third-party data.
I'm looking at the same scenario as you, but I'm not certain how easy it will be to update the data within the Data Store. What method are you looking at for the data update (e.g. the REST API)?
It would depend on the data source. If the third party provided the data via a web service (Feature Service, WMS, WFS) we would simply plug that into Portal. For data provided in file formats, we would either download it and overwrite the feature service using the Portal GUI, or we would use FME.
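For the web-service case, plugging a provider's service into Portal can also be scripted. The sketch below assumes the ArcGIS API for Python is installed and that gis is a signed-in arcgis.gis.GIS object; the function names, the default tag and any URLs passed in are illustrative and not part of anyone's setup in this thread.

```python
# Item types Portal uses for services that are referenced by URL
# rather than copied into the Data Store.
SERVICE_ITEM_TYPES = {"Feature Service", "Map Service", "WMS", "WFS"}


def build_service_item(title, url, service_type, tags=("third-party",)):
    """Build item properties for a service that stays hosted by its provider."""
    if service_type not in SERVICE_ITEM_TYPES:
        raise ValueError(f"Unsupported service type: {service_type!r}")
    return {
        "title": title,
        "type": service_type,
        "url": url,
        "tags": ",".join(tags),
    }


def register_service(gis, title, url, service_type):
    """Add the service to Portal as an item pointing at the provider's URL.

    gis is assumed to be an arcgis.gis.GIS object created elsewhere, e.g.
    GIS("https://portal.example.org/portal", "publisher", "password").
    """
    return gis.content.add(
        item_properties=build_service_item(title, url, service_type)
    )
```

Nothing is copied in this case: the item only references the provider's endpoint, so there is no update step on our side.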
The relational Data Store in Enterprise is used to store the data of the hosted feature services published to your portal (What is ArcGIS Data Store?—Portal for ArcGIS (10.6) | ArcGIS Enterprise). You can overwrite the data simply by overwriting the service on publication. To automate processes you could consider Scripting with ArcGIS Python API—Portal for ArcGIS (10.6) | ArcGIS Enterprise .
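As a rough illustration of the scripted route, the sketch below overwrites a hosted feature layer from a refreshed file using the ArcGIS API for Python. It assumes the API is installed and that the layer was originally published from a file with the same name and schema; the portal URL, credentials, item ID and file path are all placeholders.

```python
from pathlib import Path


def overwrite_hosted_layer(portal_url, username, password, item_id, data_path):
    """Overwrite a hosted feature layer with the contents of data_path."""
    data_path = Path(data_path)
    if not data_path.is_file():
        raise FileNotFoundError(f"Source data not found: {data_path}")

    # Imported here so the file check above works without the API installed.
    from arcgis.gis import GIS
    from arcgis.features import FeatureLayerCollection

    gis = GIS(portal_url, username, password)
    item = gis.content.get(item_id)
    if item is None:
        raise ValueError(f"No item with id {item_id!r} on {portal_url}")

    # The manager exposes admin operations on the hosted service,
    # including overwrite.
    collection = FeatureLayerCollection.fromitem(item)
    return collection.manager.overwrite(str(data_path))
```

A scheduled task could call this after each download from the data provider, for example overwrite_hosted_layer("https://portal.example.org/portal", "publisher", "password", "<item id>", "downloads/flood_zones.zip").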
The simple answer is that you should, at a minimum, install ArcGIS Data Store with ANY ArcGIS Enterprise deployment.
First and foremost, I'll echo Jacob's answer that this piece of software should be installed, but I'll give a bit more information from my experience and perspective.
SO! This is a fairly nuanced question that requires a bit more information about your deployment than I have, but I'll give it a go.
So the ArcGIS Data Store (ADS) is a PostgreSQL database that is designed to work 'on the backend' behind the Enterprise deployment to complete Enterprise jobs, store analysis output, allow for uploads to Portal, sync replicas, allow for an optimized option to 'upload data to server when publishing', etc. It is not intended to accept direct connections or ETLs, or to serve as a replacement for an enterprise GDB.
Now, with that said, things can be done to circumvent this and to use the ADS as an extension of your geodata enterprise deployment. Starting at 10.4.1, a lot of the safeguards were removed from connecting directly to your postgres deployment. You no longer have to edit some of the core postgres files to establish connections, you can access the postgres executables pretty easily, etc.
So, with a little modification, you can establish connections to the postgres database directly using the credentials of the built-in power user of the ADS. You can also launch pgAdmin to run administrative tasks, but doing so is risky because, again, the software was intended as a black box rather than a "hands on" database. Once you establish a trusted connection (through the master account) you can then create a reader user who can only read the database contents (and not edit... exposing editing to the masses... bad idea...). You can then make an .sde connection file to the ADS, like any other, with the credentials of the reader account and use the data in your desktop clients.
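To make the reader-user step concrete, here is a sketch of the standard PostgreSQL statements such a read-only account would need, generated from Python so the names can be checked first. The role, database and schema names are hypothetical; the real ones depend on your ADS install, and running anything like this against the ADS carries the risks described above.

```python
import re

# Only plain lower-case identifiers are accepted, so the names can be
# placed in the SQL text without quoting.
_IDENTIFIER = re.compile(r"^[a-z_][a-z0-9_]*$")


def read_only_role_sql(role, database, schema):
    """Return the SQL statements that create a login role with SELECT only."""
    for name in (role, database, schema):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"Not a plain lower-case identifier: {name!r}")
    return [
        f"CREATE ROLE {role} LOGIN;",
        f"GRANT CONNECT ON DATABASE {database} TO {role};",
        f"GRANT USAGE ON SCHEMA {schema} TO {role};",
        f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO {role};",
    ]


# Hypothetical names, for illustration only.
for statement in read_only_role_sql("gis_reader", "db_example", "schema_example"):
    print(statement)
```

The password is deliberately left out of the generated text; it can be set interactively afterwards (for example with \password in psql) so that it never ends up in a script.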
I'd STRONGLY RECOMMEND NOT BUILDING ETLs into the ADS, especially not if they require deploying triggers, packages, etc into the native database.
Depending on the frequency of updates to these external datasets, ADS could be a solution to resolve the clutter in your enterprise database. However, another option is to deploy another enterprise geodatabase within your database instance that houses your two 'sets' of data (internally derived & externally derived). You then can build Search Services to properly catalog and discover content.
Sorry for the lack of links -- currently in the passenger seat of a van
Thank you to everyone who has replied so far. I feel it necessary to add more context and answer a few of the comments posted so far.
Jacob Boyle My question wasn't about whether I should install ADS, more how best to utilize it as part of the deployment. The crux of my question is, are there any advantages/disadvantages of storing data in the ADS instead of an eGDB? In the first instance it would be third-party data that we consume but do not edit.
Andrew Valenski (IT) Thank you for your detailed reply. Personally, I wasn't considering accessing the PostgreSQL database behind ADS directly. I was going to update the data using the web GUI, the REST API or FME depending on the data source. As you say, by using ADS and Portal as a data catalog it would declutter the eGDB. The main advantage I can see here is that it would be easier to organize and maintain the data. My main concern would be how this may affect performance, as each layer (or group of layers) in Data Store would have to be published as a feature service.
djmcdermott I would think of ADS as a user cache. I would continue to maintain your eGDBs, even ones that are read-only. ADS is limited by Feature Service limits (1000 features returned per query) and you'll also be limited in the symbology that you can use when compared to a map service.
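On the 1000-feature limit: clients that need every record can page through a feature service with the REST API's resultOffset and resultRecordCount query parameters. A minimal sketch, assuming the layer supports pagination; the layer URL and token are placeholders, and fetch_page is a helper introduced here so the paging loop can be tried without a live service.

```python
import json
import urllib.parse
import urllib.request


def make_rest_fetcher(layer_url, token=None, where="1=1"):
    """Return a function that requests one page from <layer_url>/query."""
    def fetch_page(offset, count):
        params = {
            "where": where,
            "outFields": "*",
            "resultOffset": offset,
            "resultRecordCount": count,
            "f": "json",
        }
        if token:
            params["token"] = token
        url = f"{layer_url}/query?{urllib.parse.urlencode(params)}"
        with urllib.request.urlopen(url) as response:
            return json.load(response)
    return fetch_page


def query_all_features(fetch_page, page_size=1000):
    """Collect every feature by requesting pages until the service has no more."""
    features = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        batch = page.get("features", [])
        features.extend(batch)
        # The service sets exceededTransferLimit when more records remain.
        if not batch or not page.get("exceededTransferLimit"):
            break
        offset += len(batch)
    return features
```

Usage would look like query_all_features(make_rest_fetcher("https://server.example.org/arcgis/rest/services/Hosted/Example/FeatureServer/0")). This works around the record limit for scripted access, but not the symbology limitation.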