Real-time Geodatabase Replication? Part 1

ThomasColson · ‎01-01-2015

I have tried, and failed, many times to get geodatabase replication using ESRI tools to work, work reliably, or work at all. Perhaps it's all the un-authorized SQL tinkering I do. Even when I do get it working, I'm not a big fan of having to fire up Arc Toolbox and push buttons to make replication happen. Not a big fan of writing Python code, either.

With SQL server, there IS a way to replicate an entire database, or parts of of a database. The part of a database is handled with Merge Replication, in which you define what tables to replicate, and what criteria under which data is replicated. I like that option! Consider this environment:

I have a water quality geodatabase (in SQL), which is big, complex, has lots of legacy stuff in it that I just don't feel like dealing with right now. Replicating or mirroring the entire database is not an option....because.....in a brilliant management decision, the "datacenter" was placed in a building very far from any modern telco hubs. On a good day I can actually download email attachments. The problem is, the users of this database are every where else but in this building.

More drama: We're drinking the Portal for ArcGIS Cool Aid and high on Collector for ArcGIS. Can't really connect my database to either of those technologies using carrier pigeons.

But I do have some very limited in capacity database and application servers in the "cloud". Certainly not the kind that can house a 7GB water quality database, but here's an idea. What if I could replicate the feature classes that are most needed operationally to these cloud platforms? Field users could get to their data, and I could consume it in my remote datacenter in other applications. All they need access to is the "dots on the map" for their mobile devices, desktop applications, and management access to web-based maps (what stream did that oil spill impact?).

Here's what we'll need:

An obnoxiously massive and complex SDE/SQL database called "FISH";
A feature class that's been around forever and has a ton of data called "TEST";
A remote SDE/SQL database called "FISHREPL" which is the same SQL/SDE version as "FISH";
A horrible internet connection.

This is a multi-part blog post, here are the steps:

Real-time Geodatabase Replication? Part 1

Real-time Geodatabase Replication? Part 2

Real-time Geodatabase Replication? Part 3

Real-time Geodatabase Replication? Part 4

Real-time Geodatabase Replication? Part 5

Real-time Geodatabase Replication? Part 6

And of course a few caveats: This only works with unversioned data which places quite a limit on things. In some cases the cons of having unversioned feature classes are not as great as the pros of having a real-time replication service.

Please bear in mind that this method of replication is extremely unsupported. I just did it because I like tinkering under the hood of SDE and wanted to see if it would work. It "sort of" does! However in contrast to the out-of-the-box ESRI replication tools, this process sure resolves a lot of headaches for me. I can attach user-editors to a local SDE instance where "bulk" editing occurs and SQL Merge Replication will fire those changes to the "cloud" SDE instance where the rest of my organization can access (and edit) the data through Portal for ArcGIS, and my mobile clients can perform edits and updates using Collector for ArcGIS. And it works continuously. Full-time. Even with file attachments-enabled and 2 mb photos being attached to a point those changes sync within 60 seconds. SQL replication in 2012 is very efficient in compressing data and ONLY sending deltas. When I generate a "snapshot" of a feature class with file attachments, SQL sees 50 mb of data that needs to be replicated. Performing a versioned-checkout of the same data using the Distributed Geodatabase tools results in a 220 mb file geodatabase.

But emphasis on "experimental". There is a lot of overhead to managing SQL Merge Replication and you can really screw up some operational data. If you get it working after following this blog series I suggest you spend some time deliberately breaking things and see if you can recover your data integrity! Otherwise stick to the ESRI replication tools. They work. When you push the button.....

This is a personal blog and does not recommend, endorse, or support the methods described above. Alteration of data using SQL outside of the ESRI software stack, of course, is not supported and should not be applied to a production database without a thorough understanding and disaster recovery plan.