All,
Does anyone know if there is a more efficient mechanism to bulk load records into a branch versioned geodatabase table? This is a console app, so I am using the ApplyEdits method.
ArcGIS Pro 2.5 API Reference Guide
I'm reading and writing in blocks of 2,000 records; the target is to load approximately 1.5 million records. In regular C# code (data transformation, etc.) with SQL commands this takes about 4 minutes. With the ApplyEdits method in the core API, it's taking about 12 minutes per 2,000 records, reading 2,000-record batches from the source and writing the same. The table has a relationship class tied to a Utility Network feature class, so it needs to be, and remain, branch versioned.
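For reference, the write side of my loop looks roughly like this (a simplified sketch only; the batch reader, table, and field names are placeholders):

```csharp
using System.Collections.Generic;
using ArcGIS.Core.Data;

// A sketch of one 2,000-record batch write in a CoreHost console app.
// "sourceBatch" is a placeholder for records read from the source.
static void WriteBatch(Geodatabase geodatabase, Table table,
                       IReadOnlyList<IDictionary<string, object>> sourceBatch)
{
    // ApplyEdits wraps all the inserts in a single edit operation,
    // which is required because the table is branch versioned.
    geodatabase.ApplyEdits(() =>
    {
        foreach (var record in sourceBatch)
        {
            using (RowBuffer buffer = table.CreateRowBuffer())
            {
                foreach (var field in record)
                    buffer[field.Key] = field.Value;
                using (Row row = table.CreateRow(buffer)) { }
            }
        }
    });
}
```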
Any advice would be greatly appreciated.
Daniel,
What kind of geodatabase are you using? I'm seeing significant performance enhancements with file geodatabases and enterprise geodatabases. With feature services, I'm not seeing an improvement, probably because in that case the database insertions are already being batched.
--Rich
I'm using a File Geodatabase...
You are correct, I was batching before. I'm inserting the same size batches with the InsertCursor as I was with the EditOperation.Create method: approximately 4,400 records/rows for each call to EditOperation.Execute. I was thinking maybe I should increase this, unless you have some other things I should consider. Speaking of which, do you know if there is a "sweet spot" in terms of the maximum number of records/rows per Execute? Additionally, I haven't tried it, but would you expect any improvement if I used EditOperation.ExecuteAsync, since each row/record is independent of the others? Thanks!
That's weird. I repeated my tests today, and am seeing about a 3.5x improvement from InsertCursor in a callback vs. EditOperation.Create with a file geodatabase feature class. We're looking into some more ideas and will get back to you soon.
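For reference, the insert cursor pattern I'm testing looks roughly like this (a sketch only; "table" and "records" are placeholders, and this has to run on the MCT, e.g. inside QueuedTask.Run):

```csharp
using System.Collections.Generic;
using ArcGIS.Core.Data;
using ArcGIS.Desktop.Editing;

// A sketch: bulk inserts via an InsertCursor inside an EditOperation
// callback, instead of one EditOperation.Create call per row.
var editOp = new EditOperation { Name = "Bulk insert" };
editOp.Callback(context =>
{
    using (InsertCursor cursor = table.CreateInsertCursor())
    using (RowBuffer buffer = table.CreateRowBuffer())
    {
        foreach (IDictionary<string, object> record in records)
        {
            foreach (var field in record)
                buffer[field.Key] = field.Value;
            cursor.Insert(buffer);   // buffered insert
        }
        cursor.Flush();              // push the buffered rows to the table
    }
    context.Invalidate(table);       // refresh any open views of the table
}, table);
bool succeeded = editOp.Execute();
```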
--Rich
Hi Daniel,
We ran a test here where we created 275,000 polylines in a file geodatabase. Using the old CreateRow method we saw it take about 10 minutes. Switching to insert cursors cut it down to 7 minutes. Not as dramatic a difference as seen with enterprise geodatabases, but still a significant improvement.
We might be at the point where the best path forward is to log an issue with tech support. They can work with you to help put together a file geodatabase and code sample. Please point tech support to this thread to ensure the issue gets routed to me.
Sorry I don't have any other good ideas to try.
--Rich
Wow, how time flies... I was revisiting this issue after working a lot of other things in the interim. Ultimately, I found a significant improvement in performance by not calling SaveEditsAsync as often as Create. I'm not sure of the right ratio, but I believe back then it was a 1:1 ratio of calls to EditOperation.Create to EditOperation.Execute. Now I probably have hundreds if not thousands of calls to EditOperation.Create before calling EditOperation.Execute (rough sketch below). Is there a recommended ratio or limit?
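Roughly this shape (a sketch; "featureLayer" and "attributeSets" are placeholders):

```csharp
using System.Collections.Generic;
using ArcGIS.Desktop.Editing;

// A sketch of the batching ratio: queue many Create calls on a single
// EditOperation, then Execute once per large chunk.
var editOp = new EditOperation { Name = "Bulk create" };
foreach (Dictionary<string, object> attributes in attributeSets)
    editOp.Create(featureLayer, attributes);  // queued, not yet applied

if (!editOp.IsEmpty)
    editOp.Execute();  // one edit operation applies the whole chunk

// Save far less often than you Execute, e.g. once at the very end:
// await Project.Current.SaveEditsAsync();
```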
But that's not why I'm posting again... With even larger datasets, I've since used the Append GP tool and Table To Table to bulk copy/load, which is far more efficient than EditOperation. Using Table To Table, I've bulk loaded upwards of 100K records from a CSV file in seconds, and I've handled equally large datasets of point, line, and polygon feature classes using Feature Class To Feature Class and Append. But with Table To Table I've only loaded tabular data (i.e., no features). Now I'm looking at the original problem that I posted here, and the approach I'm trying is to write to a CSV first. My add-in calls native C++ code for all the algorithm processing; it currently returns the geometry and attributes in seconds, and then it takes tens of minutes for the add-in to actually create all the feature classes and features. The geometry and attributes are a set of points, plus attributes for those points, that are used to build a large number of polyline features as inputs to the Intersect 3D Line With Surface GP tool. I haven't found a simple way to do this in one step, so my plan (rough sketch after the steps) is to:
1. Write the points and their attributes to a CSV file
2. Once the file is written, bulk import the points into a table or point feature class
3. Use the Points To Line GP tool to create the line features from the point features
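Something like this for steps 2 and 3 (a sketch only; all paths, dataset names, and the line ID field are placeholders):

```csharp
using System.Threading.Tasks;
using ArcGIS.Desktop.Core.Geoprocessing;

// A sketch of the CSV-to-lines pipeline via GP tools.
public static async Task BulkLoadAndBuildLinesAsync()
{
    string csv = @"C:\temp\points.csv";    // hypothetical path
    string gdb = @"C:\temp\scratch.gdb";   // hypothetical path

    // Step 2: bulk import the CSV into a geodatabase table.
    var importArgs = Geoprocessing.MakeValueArray(csv, gdb, "Points");
    await Geoprocessing.ExecuteToolAsync("conversion.TableToTable", importArgs);

    // (If the CSV carries coordinate fields, XY Table To Point can turn
    //  the table into a point feature class.)

    // Step 3: build polylines from the points, grouped by a line ID field.
    var lineArgs = Geoprocessing.MakeValueArray(
        gdb + @"\PointFC", gdb + @"\LineFC", "LineID");
    await Geoprocessing.ExecuteToolAsync("management.PointsToLine", lineArgs);
}
```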
Is there an easier, more efficient way to do this? Thanks!