Hello y'all,
I am in tough spot. I have four databases (3 enterprise, 1 file geodatabase) that I need to reconcile into a single enterprise database.
The issue with this that I am struggling with is how to reconcile feature classes with attach tables while keeping the relationships intact.
Any help would be appriciated.
Thanks
Solved! Go to Solution.
If you want to start synchronizing information between all of these database, the best thing you can do is to create a single unique identifier that is shared between the system that uniquely identifies each feature. You could add a GUID column (not a global ID column) and use that. If you then have a unique GUID for each object in your database, detecting changes to classes and relationships becomes much easier. You can detect new features, missing features, modified features. You can also audit relationships by looking to see where the parent/child features aren't related or are related to different objects.
If you have to do a binary comparison, you could look at using a utility to calculate a checksum or hash using the contents of a file, then compare the results of the hash. You can find several examples of ways to do this using python or powershell online.
How are you currently planning on reconciling the rest of the data? If your features have unique identifiers that is a good place to start. If all the data originated from a single data source you may be able to rely on a global id on the attachments (if you used a global id based relationship), your next best bet would be to use an content type/attachment name. A binary comparison of the attachments is also possible but should only be done as a last resort because of how expensive it would be.
Hello Robert,
Thank you for your comment.
While some have UIDs of most do not have a UID that is not the system managed objectid. May end up comparing geometry to generate matches. But that still leaves the inconsistent schema and attribution. These are more workable issues though.
The issue I'm grappling with is the mess of relationship classes and relate tables. Any guidance on that or how you would script this process would be greatly appreciated.
I am intrigued by your idea to compare the binaries of the attached tables. We are on-prem, so we do not have cost constraints related to cloud processing. What methods would you use to do this?
If you want to start synchronizing information between all of these database, the best thing you can do is to create a single unique identifier that is shared between the system that uniquely identifies each feature. You could add a GUID column (not a global ID column) and use that. If you then have a unique GUID for each object in your database, detecting changes to classes and relationships becomes much easier. You can detect new features, missing features, modified features. You can also audit relationships by looking to see where the parent/child features aren't related or are related to different objects.
If you have to do a binary comparison, you could look at using a utility to calculate a checksum or hash using the contents of a file, then compare the results of the hash. You can find several examples of ways to do this using python or powershell online.
Robert,
Thank you.
How can I ensure all these GUIDs are unique across 4 databases?
It seems the relationship information is stored in the attached table How would I write the related feature GUID to that an attached table entry?
Thank you for methods to utilize the binary comparison to identify uniqueness and parody/duplication.
You don't want unique GUIDs across all the databases. Create a unique GUID in one of the databases, then put that same GUID on the 'same' feature in all the other databases (using your current comparison logic). So the same 'thing' in different databases always has the same GUID. If you find something in a database, it gets its own unique identifier. That will allow you to uniquely identify every feature, row, and relationship in your databases.
For the attachments, I would recommend you use the attachment name to uniquely identify attachments. So if you see two attachments with the same name, attached to the same feature, you assume they're they same. If you get suspicious that they are actually different, that's when you do the binary comparison.
Let me make sure I am understanding this.
So, the strategy is creating 'localized' non-system managed keys, append the content into the new, consolidated database, then reconstruct the relationship class on the foreign key?
I thought you were just attempting to audit/reconcile existing databases, and not creating a new database. But if you want to use this process to create a new database you can do that as well.
Reconciling four databases is no fun, especially if you have to do this on a regular basis, but establishing and maintaining these keys means you will only have to investigate newly created features in each system.
I am reconciling (and auditing I suppose) existing databases. But I though it prudent to create my reconciliation in a new file geodatabase. Then retire the old ones and place the new one in a weekend operation. I am lucky in that no-one is adding data to any of the databases at this time.
I really do appreciate your help.