Shapefile vs. Geodatabase

10-16-2014 09:37 AM
ThomasDowling
New Contributor

I am considering building a geodatabase to hold a series of polygon shapefiles. What are the advantages of storing polygon data in a geodatabase rather than in shapefiles? And where would I begin in building this database? Keep in mind that all the shapefiles are constantly being edited individually, and some may not relate to the others or may have different parameters. Any thoughts on this process, and is it worth doing?

26 Replies
ChrisDonohue__GISP
MVP Alum

I personally agree that geodatabases seem faster than shapefiles for geoprocessing, but ESRI (see above link) and others in the past have stated that shapefiles are faster.  Maybe that's no longer the case?

While we're at it, a few more points on geodatabases vs. shapefiles to add based on my experience:

  1. Geodatabases (File and SDE) seem far more stable for editing than shapefiles (but that may just be a function of the types of data I deal with).
  2. Geodatabases help avoid some human-caused errors when one works in an environment where many people have access to the data:
     • With geodatabases, there is a much higher chance that the creator will assign a coordinate system when the data is created.  (Shapefiles don't require one at all, so oftentimes none is assigned.)
     • I can't tell you how many times I've had folks who don't really know GIS come to me and say they helped me out by opening the project shapefiles in Excel and editing the attributes there.... (anguished scream as hours of work go down the drain: "Nooooooooooo!!!!!!").  Then the helper inevitably says, "But I was able to view and edit them in Excel, why is that a problem?"  Arrrrrrgggghhhh!!!!   If you do decide to go the shapefile route, do what you can to restrict casual users' access to the files, and make sure everyone who has access knows not to edit them in Excel.
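
On the missing coordinate system point, one quick way to audit a folder is to look for shapefiles that have no `.prj` sidecar file, since that is where a shapefile's coordinate system is stored. The sketch below is only an illustration: the function name is made up, and it assumes lowercase `.shp` and `.prj` extensions in a single flat folder.

```python
from pathlib import Path


def shapefiles_missing_prj(folder):
    """Return the names of .shp files in `folder` that have no sibling .prj file."""
    folder = Path(folder)
    missing = []
    for shp in sorted(folder.glob("*.shp")):
        # A shapefile's coordinate system lives in an optional sidecar file
        # with the same base name and a .prj extension.
        if not shp.with_suffix(".prj").exists():
            missing.append(shp.name)
    return missing
```

Calling `shapefiles_missing_prj` on a project folder gives a list of files to follow up on. It only checks that the `.prj` exists, not that its contents are correct.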

Chris Donohue, GISP

RichardFairhurst
MVP Honored Contributor

A huge reason to use a geodatabase is that Attribute Assistant does not work with shapefiles.  Attribute Assistant should be part of every editing experience, especially if you run a lot of calculations and things like attribute and spatial joins as part of your data maintenance routines.  Those can be eliminated for the most part.

DanaNolan
Occasional Contributor III

I don't get the "easier to set up" point. It takes 15 seconds to set up an FGDB, although you do have to think about what to call it.

However, I don't believe shapefiles support templates, and templates make setting up a new table with standard field names and defaults much easier. I also really hate that shapefiles never seem to have geoprocessing history in their metadata, something I often rely on when investigating old files I did not create.

ChrisDonohue__GISP
MVP Alum

I'm not saying it takes super long to set up, just that a geodatabase offers more options, so one usually does spend longer on it.  For example, since topology is an option (unlike with shapefiles), one will be setting up a Feature Dataset in a File Geodatabase to house the topology and its related feature classes.  To be realistic, the setup of the Feature Dataset will need to include a coordinate reference system.  If other feature classes are a component of your topological check, you will need to load them into that Feature Dataset.  Then there are Domains, Relationship Classes, etc.  That's not to say this is an onerous task, but it does take a bit of time compared to a shapefile, where one just right-clicks in ArcCatalog, picks New, then Shapefile, and is done apart from assigning a coordinate system (which is optional).
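
For anyone who would rather script that front-end setup, here is a rough arcpy sketch of the first few steps: create the File Geodatabase, create a Feature Dataset with a coordinate system, and load existing shapefiles into it. The function name, parameter names, and the default WKID of 4326 are placeholders of mine, not anything from this thread. Topology, Domains, and Relationship Classes are left out, and the shapefile base names are assumed to be valid feature class names.

```python
import os


def setup_geodatabase(folder, gdb_name, dataset_name, shapefiles, wkid=4326):
    """Create a file geodatabase with one feature dataset and load shapefiles into it."""
    # Imported here so this file still loads on machines without ArcGIS installed.
    import arcpy

    # Step 1: create the file geodatabase and get its full path.
    gdb_path = arcpy.management.CreateFileGDB(folder, gdb_name).getOutput(0)

    # Step 2: create the feature dataset with an explicit coordinate system.
    spatial_ref = arcpy.SpatialReference(wkid)  # wkid is a placeholder choice
    dataset_path = arcpy.management.CreateFeatureDataset(
        gdb_path, dataset_name, spatial_ref
    ).getOutput(0)

    # Step 3: load each shapefile as a feature class in the feature dataset.
    for shp in shapefiles:
        out_name = os.path.splitext(os.path.basename(shp))[0]
        arcpy.conversion.FeatureClassToFeatureClass(shp, dataset_path, out_name)

    return dataset_path
```

A call might look like `setup_geodatabase(r"C:\data", "Parcels.gdb", "Boundaries", [r"C:\data\parcels.shp"])`, where every path and name is hypothetical.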

FGDBs are awesome, but they do take a bit of effort on the front end to effectively harness that power.

Chris Donohue, GISP

TheoFaull
Occasional Contributor III

Example.

Client: "Hey can you send over the property boundary polygons for X-location?"

Me: "Yeah, sure." I select the relevant polygons, right-click the layer > Export Data > Shapefile, zip the exported files, and email them to the client.

A common process, easily accomplished using shapefiles as the final export data.
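
The zipping step can be scripted too. The sketch below bundles one shapefile's sidecar files into a zip archive using only the Python standard library. The function name and the list of extensions are my own assumptions, and the list is not exhaustive.

```python
from pathlib import Path
import zipfile

# Common shapefile sidecar extensions (an assumed, non-exhaustive list).
SIDECAR_EXTENSIONS = {".shp", ".shx", ".dbf", ".prj", ".cpg", ".sbn", ".sbx"}


def zip_shapefile(shp_path, zip_path):
    """Zip the files that share the shapefile's base name; return the names added."""
    shp_path = Path(shp_path)
    added = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # Collect every file with the same base name, then keep known sidecars only.
        for part in sorted(shp_path.parent.glob(shp_path.stem + ".*")):
            if part.suffix.lower() in SIDECAR_EXTENSIONS:
                # Store files flat in the archive, without the folder path.
                archive.write(part, arcname=part.name)
                added.append(part.name)
    return added
```

The returned list makes it easy to confirm that the `.shx`, `.dbf`, and `.prj` files went along with the `.shp` before emailing the archive.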

I don't want to send a whole file geodatabase over that contains not only the polygons I'm concerned with, but also all the other feature classes that sit in the same geodatabase...

RichardFairhurst
MVP Honored Contributor

There is only one more step if the target is a file geodatabase, and that is to create the file geodatabase.  Otherwise there are no extra steps, especially if selected features are being exported.  The benefits are that field names are not truncated to useless 10-character names, and domains can be preserved.
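
To see why the 10-character limit hurts, the sketch below finds field names that become identical once cut to 10 characters. It uses plain left-truncation, which is a simplification: export tools typically rename clashing fields rather than failing, and the exact renaming rules are not modeled here. The function name and the sample field names are made up.

```python
from collections import defaultdict

# Shapefile attributes are stored in dBASE format, which limits field names to 10 characters.
SHAPEFILE_FIELD_NAME_LIMIT = 10


def truncation_collisions(field_names, limit=SHAPEFILE_FIELD_NAME_LIMIT):
    """Group field names that share the same first `limit` characters."""
    groups = defaultdict(list)
    for name in field_names:
        groups[name[:limit]].append(name)
    # Keep only the truncated names that more than one original field maps to.
    return {short: names for short, names in groups.items() if len(names) > 1}


fields = ["OwnerMailingAddress", "OwnerMailingCity", "APN"]
print(truncation_collisions(fields))
# {'OwnerMaili': ['OwnerMailingAddress', 'OwnerMailingCity']}
```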

I also have never been forced to send out an entire geodatabase or feature dataset if I don't want to, since I can use either Export or Feature Class to Feature Class to avoid that.  In any case, I personally would only export to a shapefile if the other party was not using ArcGIS software, or was using a version of ArcGIS older than 10.1.  I would not do so for a person using ArcGIS 10.1 or higher.  Anyway, this particular example as described would not lead me to create a shapefile.

TedKowal
Occasional Contributor III

All interesting reasons.... Let me give you a different point of view that has not been expressed here so far.

Geometry is only one aspect of GIS. Attached to that geometry (in my shop) is a lot of non-spatial data that must be shared throughout my enterprise.  I also have numerous non-GIS users who use this data in day-to-day business.   Plenty have spoken on the spatial side... the other side of GIS is the data (information).

-Shapefiles.  Sharing the data is prohibitive (not impossible) short of exporting it into Excel spreadsheets.  This causes major data currency problems.

-ESRI File GDB.  Much like using a shapefile: you cannot share data beyond ESRI products without exporting it.  There are only buggy beta ODBC drivers for this data set.  Hooking up other products (MS Office, reporting software, etc.) is not yet possible.  (Critical drawback)

-Hybrid Database (PDB), MS Access.  This allows for the sharing of data outside of ESRI products.  Under certain conditions, multiple users can access the data simultaneously.  A cheap version of an enterprise GDB.

-GDB (SQL Server, Oracle, etc.).  Expensive, maintenance-intensive, and more complex.  However, it allows for scaling, concurrent editing, and sharing of data outside of any given software product.

My shop is planning to implement an Enterprise Database in the future, as needs and scale dictate.  Cost/Benefit/Risk does not allow me to migrate to an enterprise solution at present.  However, I have designed a hybrid PDB system to maximize the concurrent sharing of data throughout our agency.  The data and spatial data are shared through front-end MS Access DBs that contain links to the various PDBs (ESRI MS Access) and shapefiles (yes, we still have a few roaming around).  Esri's PDB (MS Access) uses an old version of Access. Our front ends are the latest MS Access versions, so we are able to share the data that is attached to the geometry throughout the enterprise.  Almost all third-party products can link with MS Access, either directly or indirectly through ODBC.  Lastly, I hope the difficulty of my final jump to enterprise has been mitigated, because I have already defined the rules for sharing the data, and the data is already in table format and has been normalized at least to the fourth degree.

Hope this gives another perspective on your choices.....
