I'm just beginning research into PostgreSQL, with more questions that answers at the moment.
Our history is that we are a long-term ESRI product user, and have been using SDE since version 3.2, always in Oracle.
My direction now is to investigate migrating spatial data to PostgreSQL. I'd like to do this on Linux.
For those who have gone before me:
- how big is the market share of GIS users in PostgreSQL for production use?
- What flavor of Linux is most commonly used? I see that RedHat and SUSE are supported.
- I see this latest version of PostgreSQL noted:
PostgreSQL 9.3.5 (64 bit) with PostGIS 2.1
But I also see may flavors of PostgreSQL advertised, Advanced Server from EnterpriseDB, PostgreSQL Plus.
What flavor do you use?
What has been your experience?
Where have you trained?
PostGIS - is this required?
Any insights or comments are appreciated, I apologize in advance for the "newbie-ness" of the question.
I've been working with PostgreSQL since 8.1, but not in a production environment, so I'll pass on most of your questions. PostgreSQL is a "grown up" database, not a toy.
My pet peeve is with the frequency of the release schedule -- if you upgrade with every minor release, you'll spend your days doing nothing but upgrades. This update frequency also manifests in a compatibility issue with ArcGIS -- Esri supports three major revs (e.g. 9.1, 9.2, and 9.3 at ArcGIS 10.3.1), but PG has moved on to 9.5. For the most part, it doesn't matter that Esri ships a 9.3.5 client when the production release is at 9.3.11, but it's always wise to test for compatibility, since there was an incident way back at 8.3.something where the underlying structure of date types changed in a minor rev, making all older binaries obsolete. Following industry-standard best practices (dev, test, and prod environments) should be enough to keep you safe.
PostGIS is not required, but it's not incompatible with ArcGIS. I think you can use any PostGIS with Query Layers, but if you plan to enable a geodatabase then you should use the supported PostGIS implementation for the release of PostgreSQL in use. As with other native geometry implementations, it's probably advisable to review client support for SDE.ST_GEOMETRY, and if that isn't an issue, evaluate performance using both storage options with your own data before making a final decision on which geometry storage to use.
We'll stick with ST_GEOMETRY as that's what we're doing now.
My first task is to prototype and do a feasibility study. Our Oracle environment is mature with many Enterprise applications hitting it. In some cases, the interaction of spatial data with SDE is back and forth, where SDE uses the tabular application data as attributes, or the application uses spatial data and maps. It's a huge migration that we'll try to break down into smaller pieces and come up with a plan.
A little daunting, but it will be fun.
In our Oracle environment we use partitioning and I non-active Data Guard for disaster recovery.
I'm finding different flavors of PostgreSQL, so I'll need to research what functionality is available where. Do you know if I start with PostgreSQL Red Hat binaries, and we later decide that we need EnterpriseDB, is that a migration, or something less painful.
Do you have a feel for the ESRI user base for PostgreSQL. We looked at it years ago, and at that time there were two users. I see more questions about it and feel it's becoming more popular and accepted.
In the end, I believe this will be a management decision on cost versus risk. Our Oracle environment is solid, it provides good performance with much functionality, we have Oracle support. I know that I can make Postgres work, there will be a migration period, and there may be issues because we are not as skilled (yet). Much of our experience will move over and be useful, but there is a learning curve to get through.
We had no issues if we stick with st_geometry, except for interoperability with other software (non-esri). pg_geometry (native postgis format) support is a bit sketchy sometimes. For instance, you cannot rename a feature class that uses the native spatial type. Which is really odd and annoying. The docs say there isn't a way to rename a spatial table in postgis so arcgis doesn't support it - this is a really awkward statement. We rename spatial tables all the time, so don't know what came over esri on this one...
Also st_geometry does not support the same range of spatial aggregation operators of the native postgis format. But this is small use-case I guess.
EDB Postgresql Plus is the same basic Postgresql plus some extra functionality but it's 100% compatible. So you can start with vanilla pgsql and then move to edb's if you want. I think it's an in-place replacement.
Coming from Oracle, my biggest surprise was memory management. Postgresql relies on the OS cache management, so you will have to account for that when sizing/monitoring your server. Even if your connections take X GBs, there will have to be Y GBs free for the OS cache or your performance will suffer. Another thing I found different was the forthcoming of postgresql community.
Any updates to OP regarding lessons learned for migrating large Oracle SDE datasets to PostgresSQL? Interested in which versions are supported and if any performance / stability improvements?
Thanks - C