Choosing between a Single or Many Geodatabases

ShaharLevenson · ‎04-29-2013

Hello all,

I've been thinking about the "right" way to work with geodatabases:

I'm a GIS analyst and my everyday work consists of delivering different GIS products (Maps, Geographic calculations, etc.) for different projects. The data for the products also comes from different sources.

I am currently making a file geodatabase for each product that contains the results and the temporary layers. That FGDB resides within a project-specific folder that also contains the MXD file and other files needed for the product (Excel files, Images, etc.)

Wouldn't it be more efficient to put all the products for all the projects in one large FGDB, divided into Feature Datasets?
How big can an FGDB get before performance starts to decline?
I also have access to an Oracle geodatabase (using ArcSDE). Would that be a better choice?

I'm curious to know what you think.

Thanks,
Shahar.

VinceAngelo · ‎04-29-2013

If you're going to use file geodatabase, then you'd be better off sticking with the
one FGDB per project paradigm. File geodatabases get corrupted from time to
time, so isolating the FGDBs isolates the risk as well.

Shifting to an ArcSDE geodatabase would increase both capability and complexity.
If you don't have a need for versioned editing, then it's unlikely that the increase
in complexity (database management overhead, backups, etc) will be worth it.

Feature datasets exist for cooperative editing operations, not as a container
for grouping project data. It is always a bad idea to lump data that needn't
be edited together in feature datasets.

- V

VinceAngelo · ‎04-30-2013

Different situations may produce different answers. I don't have a clear
enough understanding of your requirements to make a suggestion.

- V

ZacharyHart · ‎05-06-2013

In my estimation, this is a very common scenario and dilemma, and very closely resembles a workflow/data structure for our business. I'd be very curious to know a bit more about what each project represents.

What Vince said is true and applicable to how you move forward. However, I don't think we've gotten to the core of addressing how you'd like to improve your data structure.

If you are operating a particular business, you are logically providing a similar service to many clients. There should therefore be overlapping or repeatable workflows across projects, and a common database schema (attributes) should be easy to derive. I say 'easily', but in reality the process does take time because it involves objectifying the attributes of each project. What do I mean by that? You need to look for commonality from project to project. For example, you notice that for several projects you've recorded the client's property name, but stored it differently each time. These attributes could easily fall under one attribute field.
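A minimal sketch of that "one attribute field" idea, in plain Python with no GIS library involved. The field names, the alias table, and the sample records are all hypothetical and only illustrate mapping per-project names onto a shared schema:

```python
# Hypothetical per-project field names that all describe the same attribute.
FIELD_ALIASES = {
    "prop_name": "property_name",
    "PropertyNm": "property_name",
    "site": "property_name",
    "client": "client_name",
    "ClientName": "client_name",
}


def to_common_schema(record):
    """Return a copy of record with known aliases renamed to the common field names."""
    return {FIELD_ALIASES.get(key, key): value for key, value in record.items()}


# Two made-up project records that store the same information differently.
project_a = {"prop_name": "North Ridge", "client": "Acme"}
project_b = {"PropertyNm": "Mill Creek", "ClientName": "Birch LLC"}

unified = [to_common_schema(r) for r in (project_a, project_b)]
for row in unified:
    print(row)
```

Fields that are not in the alias table pass through unchanged, so the mapping can be grown one project at a time.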

The reason I bring this up is that I am a firm believer that this kind of client-based data should be stored in one central database. The exception is a business with completely ad hoc workflows, like modeling hydrology for one project and then doing viewshed analysis for another; that I can understand. But if you do have repeatable workflows as I suggested, I would highly recommend you begin storing this type of data in a central location, because a single repository helps you understand the scope of your business better. For example, how quickly could you query, say, all of the projects you did of a certain type within a given region for a given time period? How quickly could you generate a list of similar past projects when developing a cost quote for a new project proposal? In short: folder-based project storage severely limits your ability to analyze your data from a business perspective.
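To make the "type, region, time period" question concrete, here is a rough sketch using Python's built-in sqlite3 module as a stand-in for a central repository. It is not a geodatabase: the table, the column names, and the rows are invented for illustration, and "region" is treated as a plain text attribute rather than a spatial query:

```python
import sqlite3

# In-memory database standing in for a central project repository (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE projects (name TEXT, project_type TEXT, region TEXT, completed TEXT)"
)
conn.executemany(
    "INSERT INTO projects VALUES (?, ?, ?, ?)",
    [
        ("North Ridge", "timber inventory", "north", "2012-06-15"),
        ("Mill Creek", "timber inventory", "north", "2010-03-02"),
        ("Elm Street", "viewshed", "south", "2012-09-30"),
    ],
)


def find_projects(conn, project_type, region, start, end):
    """Return names of projects of one type in one region completed between two ISO dates."""
    rows = conn.execute(
        "SELECT name FROM projects "
        "WHERE project_type = ? AND region = ? AND completed BETWEEN ? AND ? "
        "ORDER BY completed",
        (project_type, region, start, end),
    )
    return [row[0] for row in rows]


matches = find_projects(conn, "timber inventory", "north", "2012-01-01", "2012-12-31")
print(matches)
```

With one FGDB per project folder, answering the same question means opening every project in turn, which is the limitation described above.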

FGDBs are excellent in terms of speed (especially if close to the server in a publication-type environment). We use FGDBs for heavy 'base layer' datasets which do not require ongoing (and multi-user) editing. I do not entirely agree that the feature dataset is not intended as a means of organizing data, mostly because it provides an opportunity to group similar data of different spatial references under one geodatabase (each FDS having a specific spatial reference). In fact, we have been so satisfied with FGDBs for this type of data storage that we are in the process of consolidating this type of base data even further into FGDBs.

No one has really addressed your question regarding performance and database size for FGDBs, and I'm afraid that with our FGDBs being quite small (<10GB) I don't have a good answer for you. I'd suggest you post that question (or even your original one) on the GIS Stack Exchange, as I believe you'll find a better 'real world' response over there.

Good luck!