branch versioning vs. mass updates

5123
17
Jump to solution
02-02-2023 07:25 AM
Labels (3)
RiccardoKlinger
Frequent Contributor

Hi folks, 

we are switching from traditional versioning to branch versioning as we are also switching from direct db access to a servcie based architecture. 

In the past we made bulk updates on tables directly via sql but disabled versioning prior these activities and re-enabling the traditional versioning after this update. This is against the "recommodation" from esri but it was working quite good. 

As the new branch versioning is new to us we also want to follow the proposed workflow from Esri but as a consequence we will most likely see a grow of our tables by factor 2 with every bulk update. This will be "mad" after a few weeks in our use case....
Of course we could also stop the feature service, disable versioning do our "mass updates" via sql or feature servcie and enable versioning again. But this will b e not in line with the esri policy, will it?

Are there any other users facing a similar problem?  How have you overcome this problem? 

I have seen this comment from the 2022 UC 

Can the same feature class be set to use both traditional and branch versioning?A dataset can only have one registration type. It would either be traditional or branch versioned.This pattern should be consistent for all datasets found within a feature dataset.

 

Looking forward your replies.

 

Best, Riccardo

1 Solution

Accepted Solutions
MarceloMarques
Esri Regular Contributor

Hello @RiccardoKlinger 

Please, bear with me. I can see 2 scenarios about "branch versioning vs. mass updates" that you have already described.

A. if the branch versioned featureclass is small with just some million rows and the actual featureclass table size in the Enterprise Geodatabase RDBMS Storage is around let's say 20GB or so.

  1. then you can open Pro
  2. connect to Portal
  3. add the Feature Service to the map
  4. and perform the mass attribute update using the Pro editing tools directly in the branch version sde.default
  5. can even use model builder to automate the process for example.
  6. hence there is no need in this scenario to unregister as branch versioned the featureclass(es)

B. on the other hand if the branch versioned featureclass is quite large with 500 milllions rows for example and the featureclass table size in the Enterprise Geodatabase RDBMS storage is around let's say 1 TB.

  1. then open Pro
  2. reconcile and post all branch versions child of sde.default
  3. close Pro
  4. take a full database backup
  5. stop the ArcGIS Server Feature Service(s)
  6. need to do this because of locks that might prevent from unregister as branch versioned
  7. open pro and unregister as branch version the featureclass(es) then close Pro
  8. perform the mass attribute update using SQL statement
  9. sure thing the mass field update will be faster via SQL
  10. here consider to ask your database administrator to place the database in nologging (oracle), simple recovery mode (sql server) before you do the mass update via SQL, this will speed up the mass update via SQL
  11. after the update is done the database administrator can change the database back to logging (oracle), full recovery mode (sqlserver)
  12. open pro and register as branch version the featureclass(es) again and close Pro
  13. start the ArcGIS Server Feature Service(s)
  14. test the feature service and make sure you can make edits in a new branch version child of sde.default
  15. if everything is fine then take another full database backup

@RiccardoKlinger this is pretty much the same process that you have already described in your original question, but I just wanted to create a step by step to clarify for anyone reading this discussion.

I do not see any major problems with Scenario B and you have also mentioned that this is working fine for you, hence although this might not be officially sanctioned by Esri yet. I do not see any other way of doing the branch version mass update for very large featureclasses, Scenario B is the more appropriate way that avoids any data corruption.

Thanks,

| Marcelo Marques | Esri Principal Product Engineer | Cloud & Database Administrator | OCP - Oracle Database Certified Professional | "In 1992, I embarked on my journey with Esri Technology, and since 1997, I have been working with ArcSDE Geodatabases, right from its initial release. Over the past 32 years, my passion for GIS has only grown stronger." | “ I do not fear computers. I fear the lack of them." Isaac Isimov |

View solution in original post

17 Replies
jcarlson
MVP Esteemed Contributor

A Branch Versioned service is by definition published to a Portal, and so has access via a REST endpoint. Using editing operations through, for instance, the ArcGIS Python API, you can easily make bulk edits without stopping the service. You can also specify the gdbVersion parameter, in the event that you do not want to make bulk edits to the default version.

- Josh Carlson
Kendall County GIS
RiccardoKlinger
Frequent Contributor

thanks for the insight @jcarlson . So there is no need to update any sidecar table / take care of any othe table? The table structure is easy to understand and to my understanding it is just important not to skrew up the additonal fields that are relevant for the versioning. python API edits directly against the db or also via the service? confused a bit.

0 Kudos
jcarlson
MVP Esteemed Contributor

Using the API is strictly editing against the service, so there's no need to worry about edit version uuids or whatnot, as long as you're specifying a valid version or editing the default version. The REST endpoint converts it all into valid SQL in the background anyway.

What kind of update are you making? Do you have some other datasource you're applying to the versioned service, or is it a sort of field calculation?

Now, you could still edit directly against the DB, but we tried that for a while and it ended up introducing errors, no matter how careful we were with the expression. Using "proper methods" has worked just fine.

- Josh Carlson
Kendall County GIS
0 Kudos
RiccardoKlinger
Frequent Contributor

but also edits to default will increase the number of rows by multiples,,,

0 Kudos
jcarlson
MVP Esteemed Contributor

An important consideration! If you're worried it will impact performance or storage size, I would suggest testing the scenario and getting metrics on it. We don't find this matters much to us, but our largest layer is still only about 100k features with very simple geometry, and our bulk edits only hit a small subset of the rows on any given occasion.

What is your bulk update for? Are the edited attributes / features crucial for those who are editing the service, or is it primarily for users who will only be viewing the service?

- Josh Carlson
Kendall County GIS
DavidSolari
Frequent Contributor

The table structure for Branch Versioned data is much simpler than Traditional Versioning, it's basically a tweaked version of the format used by Archive-only tables. If you want to make direct edits via SQL, start by reconciling/posting/deleting all your child versions as appropriate, then add records in accordance with how ArcGIS clients do. Off the top of my head for SQL Server EGDBs:

  • Inserts add a new record with a new GDB_ARCHIVE_OID, OBJECTID and GlobalID. The GDB_FROM_DATE is the UTC date of the transaction (this is different IIRC from classic archive tables, which use the local time).
  • Updates add a new record with the same OID and GID but a new GDB_ARCHIVE_OID. Same rules for GDB_FROM_DATE.
  • Deletes are the tricky one. In my initial testing it looked like you just grab the most recent record, set the delete flag and fill out the appropriate delete fields. In a later testing session it looked like deletes added a whole new record that's just a copy of the final record, but with the delete info set and the GDB_FROM_DATE set to the delete time. Make sure you test thoroughly.
  • Almost goes without saying, but the state you set should always be 0. If you want to make edits against a specific version then good luck.

Basically just observe how all your clients adjust the table under a variety of circumstances and then mimic that in SQL. I understand why nobody from ESRI will confirm how the operations work from the DB side but that would be a good resource, pretty please?

MarceloMarques
Esri Regular Contributor

Esri content to explore about Branch Versioning to learn more about some of the technical details.

To Branch or Not to Branch - An Introduction to Branch Versioning (esri.com)

Enterprise data management strategies—ArcGIS Pro | Documentation

Branch version scenarios—ArcGIS Pro | Documentation

SQL Views and Branch Versioning 

Utility Network Journey: Branch Versioning and SQL

I hope this helps.

| Marcelo Marques | Esri Principal Product Engineer | Cloud & Database Administrator | OCP - Oracle Database Certified Professional | "In 1992, I embarked on my journey with Esri Technology, and since 1997, I have been working with ArcSDE Geodatabases, right from its initial release. Over the past 32 years, my passion for GIS has only grown stronger." | “ I do not fear computers. I fear the lack of them." Isaac Isimov |
0 Kudos
George_Thompson
Esri Notable Contributor

Here is a blog talking about SQL & Branch Versioning: https://www.esri.com/arcgis-blog/products/utility-network/electric-gas/utility-network-sql/

I would recommend against directly editing the table.......

As for the size, there is an idea: https://community.esri.com/t5/data-management-ideas/trim-archive-history-tool-for-branch-versioned/i... that would be to trim the table. Go over and add your use case or Kudos.

Hope this helps!

--- George T.
MarceloMarques
Esri Regular Contributor

Caveat. Direct SQL access to Branch Versioned Featureclasses tables is not supported. You need to use the REST Endpoint Service instead to manipulate the data. I hope this clarifies.

This is a good read to catch some important points.

Utility Network Journey: Branch Versioning and SQL

https://www.esri.com/arcgis-blog/products/utility-network/electric-gas/utility-network-sql/

| Marcelo Marques | Esri Principal Product Engineer | Cloud & Database Administrator | OCP - Oracle Database Certified Professional | "In 1992, I embarked on my journey with Esri Technology, and since 1997, I have been working with ArcSDE Geodatabases, right from its initial release. Over the past 32 years, my passion for GIS has only grown stronger." | “ I do not fear computers. I fear the lack of them." Isaac Isimov |