How should we best organise data for an Enterprise solution?

813
6
Jump to solution
08-21-2014 12:59 AM
Highlighted
Occasional Contributor

Hi all,

I am in the process of designing an enterprise solution for our GIS. Previously we were doing everything in file geodatabases and it was hard to get the right data out to the right people at the right time and make our GIS data available for the whole business to use in support of their daily activities.

Speaking with our GIS manager i have come up with a design, which includes multiple databases and several different schemas within these databases. What i want to do is make sure i design a solution that is in-line with current best practices. I know the multiple database model is now unsupported in ESRI post 10.1 / 10.2. How are people organising a really large GIS now? Are there any good best practice guidelines anywhere? Should i be looking at multiple schemas and multiple databases or is it ok to just put everything in one big pot?

I thought i had this sussed, then started to read some of the guidance in the ESRI help and began to doubt myself. I reliase this question is quite open ended and there are potentially lots of options, but any guidance would be gratefully recieved!!

Reply
0 Kudos
1 Solution

Accepted Solutions
Highlighted
Occasional Contributor

thanks all for the input into this. I raised a call with ESRI Uk support and spoke to Simon who was extremely helpful.

He explained that the multiple database model that was no longer supported actually realetd to having multiple databases on one instance but ONLY ONE of thos databases had the SDE schema loaded into it. That is not the design i was looking at we are looking at 4 SQL SDE Databases and also a flat file store for Raster imagery with a view to creating mosaics from that iaw best practice (thanks @Vince Angleo, you were right and we did have that covered, just didnt explain in my original post).

When i explained the design to Simon he felt it was workable so we are going with it.

thanks for all your responses.

View solution in original post

Reply
0 Kudos
6 Replies
Highlighted
Regular Contributor

Hi Graham,

There is no one correct answer for this. I'm sure there are going to be multiple views, I'll put my view here.

Definitely I would't recommend using multiple databases, particularly if you are building an enterprise system. If you have multiple databases you are multiplying your maintenance cost. It's going to be difficult to integrate your spatial data across database like creating spatial views.

Having multiple schema within same database is a good option, particularly if you want to segregate your data logically and have access control at DB tier.

One important thing to consider when you are building an enterprise geodatabase is making wise use of dbtune file, never use the default dbtune file, create your own dbtune file with configuration keywords specific for feature classes, particularly helpful in storing feature classes in separate table spaces. This way you can provide a bigger buffer pool for frequently used feature classes.

Follow the best practices recommended by the database vendor as you would build a normal database, putting it simply geodatabase is nothing but a database with a spatial column and object id managed by ArcSDE.

Highlighted
Occasional Contributor

Hi Riyas,

Thanks for replying. Very useful info. I did go on an ESRI course for Data management in the multi-user database and we covered off DbTune, although i had forgotten about that until i just dug out my notes. The course was on 9.3 (i know! it's taken us along time to get from planning to do this to actually doing it! ) but i guess the principles havent changed too much.

The multiple database thing was my biggest concern after i read the ESRI docs today. We have planned on having 4 dbs on 1 sql server instance. I am assuming this is a massive no-no. Should we be mixing Raster and Vector in the same Db? Does that matter? We are planning on using Raster files and mosaicking for Raster imagery but we will have other raster that we wont manage in this way.

Reply
0 Kudos
Highlighted
Regular Contributor

Hi Graham,

My knowledge on SQL Server is limited, so I'll try to answer your question in oracle and DB2 context.

Oracle (non-RAC):

In the case of oracle one instance can have one and only database at a time. When you create an oracle DB instance you specify SGA and PGA, essentially these are memory reservations for the database. These reserved memory are available only to this instance. You create another instance on the same server you need to reserve memory for that instance as well. Not all of this reserved memory is used by an instance all the time. Instance will use all of this memory at peak load, when you are most likely to see performance issues.

If you have multiple instances on same server you will end up reserving memory for these instance overheads, which can be used for buffer pools which is one of the most important factor for DB performance.

DB2:

In the case of DB2 you can have one instance and multiple databases. You have memory reservation for the instance and memory reservation for each database. If you have multiple database you'll end up reserving some memory for each database which potentially goes unused (unless there is peak load) which can be used for buffer pool.

When you store your vector and raster datasets in the same database, they are going to sit in different table spaces of their own. Table space and buffer pool are the most important factors for a DB performance.

With above information in mind, If you have funds for one DB server (hardware + software + hosting + support), you are better off having one database for storing both your vector and raster datasets. As long as you tune your database as good as possible, which is essentially maximum utilisation of your disk IO and system memory.

If you have sufficient fund storing them in two physical server is ideal.

I have discussed some advanced DBA topic above, so it is important for you to know that I'm primarily a developer (with exposure to database management) and not a professional DBA.

Highlighted
Esri Esteemed Contributor

It's been a long time since 9.x days. Best practice for rasters is now to use image files with a mosaic dataset in a FGDB, not clog up the database (and backup space) with only nominally accessible pages.

It only makes sense to use different databases if you have no intention to use the tables together (e.g, production, test, and development).

- V

Highlighted
Regular Contributor

I would agree with Riyas that multiple schemas within same database is a good option.

From a practical perspective, I have worked for big agencies that put everything into a single schema and it becomes a real pain for users to locate data - even when you know what layer you are after. Imaging the ArcMap Add Data dialog with 500+ data sets listed. This also slows things down when just listing the data for remote network users.

The ability to search your data using metadata becomes critical with a large number of data sets so make sure metadata gets factored into your design.

Highlighted
Occasional Contributor

thanks all for the input into this. I raised a call with ESRI Uk support and spoke to Simon who was extremely helpful.

He explained that the multiple database model that was no longer supported actually realetd to having multiple databases on one instance but ONLY ONE of thos databases had the SDE schema loaded into it. That is not the design i was looking at we are looking at 4 SQL SDE Databases and also a flat file store for Raster imagery with a view to creating mosaics from that iaw best practice (thanks @Vince Angleo, you were right and we did have that covered, just didnt explain in my original post).

When i explained the design to Simon he felt it was workable so we are going with it.

thanks for all your responses.

View solution in original post

Reply
0 Kudos