Fast Insertion of Multiple Features

03-11-2019 11:22 AM
MatthewJackson3
New Contributor II

This may be a duplicate of "Writing Features into FGDB very slow", but since the answer to that thread is over a year old, I thought it would be worth asking whether a solution now exists, given that the final post from Esri stated the issue would be re-evaluated after 2.2 was released.

I am currently working on an ArcGIS Pro add-in (v. 2.3) that inserts somewhere between 50 and 1000 features into a feature class and then adds them to the map. When I tested adding features with the ArcGIS Pro SDK, I used the following code snippet:

await QueuedTask.Run(async () =>
{

var geodb = new ArcGIS.Core.Data.Geodatabase(new ArcGIS.Core.Data.FileGeodatabaseConnectionPath(new Uri(mypath)));

var fc = geodb.OpenDataset<FeatureClass>("MyVector");
for (int i = 0; i < 1000; ++i)
{
     var buffer = fc.CreateRowBuffer();
     buffer["name"] = "a name";
     buffer["widget_count"] = i;
     var f = fc.CreateRow(buffer);
     f.SetShape(ArcGIS.Core.Geometry.MapPointBuilder.CreateMapPoint(i * .01, 0));
}

fc.Dispose();

geodb.Dispose();

});

Unfortunately, it takes ~10 sec to complete this simple set of inserts on a fairly high-end development machine. To make sure the debugger wasn't the cause, I also ran the add-in without it and got roughly the same time. When I execute the same task with GDAL/OGR in C# using the ESRI FileGDB driver, it consistently completes in 0.5 sec. Is there a more performant way to do this using the ArcGIS Pro SDK API?

Along those lines, every time I execute an ArcGIS Pro toolbox method, say "Add Field", is it merely spawning a Python process behind the scenes and running the function? If I am using multiple toolbox methods sequentially, should I expect better performance from wrapping these calls inside a single Python script instead?

Thanks for your time,

Matt

7 Replies
RichRuh
Esri Regular Contributor

Hi Matt,

A couple of things to try:

1. Create and reuse a single RowBuffer rather than creating a new one every time through the loop.

2. Set the shape in the RowBuffer before creation rather than updating the row after creation.

buffer[shapeFieldName] = ArcGIS.Core.Geometry.MapPointBuilder.CreateMapPoint(i * .01, 0);

As a general rule when editing inside an Add-in, you should consider using the EditOperation class, which redraws the screen and modifies the undo/redo stack.  However, if your data hasn't been added to a map yet, and you have no need for undo, then this code is fine.

Hope this works,

--Rich

MatthewJackson3
New Contributor II

Rich,

Thank you for your reply. I gave your solution a try, but it doesn't affect the run time in any significant way (10.5-11 sec, which is more or less what I was getting with my prior code). I had tried the EditOperation class first, but after seeing its performance, I had hoped the RowBuffer approach would fare better.

Additionally, upon further benchmarking, I am seeing a minimum execution time of 2.5 sec for any geoprocessing command, even those as simple as adding a field to a vector. Is this more or less what is expected? Are there any changes on the horizon that are expected to ameliorate this issue?

Regards,

Matt

RichRuh
Esri Regular Contributor

Matt-

There's no geoprocessing code in the sample you included above.

Do you have any software components that are listening to edit events?

--Rich

MatthewJackson3
New Contributor II

Rich,

At some point I will be listening to edit events, but it is not enabled at present. 

Pasted below is a sample class that I wrote as I was learning to use the geodatabase API; it is largely based on code snippets from the SDK examples. Looking around the forums, somewhat slow geoprocessing operations don't seem to be an uncommon issue, but I really just wanted to confirm that what I am seeing is the expected behavior at present, so I can figure out the next steps to work around the issue. As someone new to the Esri SDK, I also wanted to make sure I'm not doing something in a boneheaded way that is causing the issue in the first place.

Thanks,

Matt

    public enum GeoDBStatus
    {
        CreationInProgress = 0,
        Created = 1,
        Error = 2
    }
    public static class GeoDatabaseUtilities
    {
        public static async Task<bool> AddVectorToDB(string geodatabase_path, string LayerName, string GeometryType="POINT", int EPSG=4326)
        {
            var sr = ArcGIS.Core.Geometry.SpatialReferenceBuilder.CreateSpatialReference(EPSG);
            bool r = await ExecuteToolAsync("CreateFeatureclass_management", Geoprocessing.MakeValueArray(geodatabase_path, LayerName, GeometryType, null, null, null, sr));
           // r = r && await ExecuteToolAsync("AddFields_management", Geoprocessing.MakeValueArray(geodatabase_path + "\\MyVector", "sidc TEXT SIDC;PersonnelCount LONG PersonnelCount"));
            return r;
        }
        public static async Task<bool> AddDataTable(string geodatabase_path, string TableName)
        {
            return await ExecuteToolAsync("CreateTable_management", Geoprocessing.MakeValueArray(geodatabase_path, TableName));
        }
        public static async Task<bool> AddFields(string  path, IEnumerable<Tuple<string, string>> fields)
        {
            return await ExecuteToolAsync("AddFields_management", Geoprocessing.MakeValueArray(path, string.Join(";", fields.Select(x => x.Item1 + " " + x.Item2))));
        }
        public static async Task<GeoDBStatus> Create(string directory, string name)
        {
            var result = await ExecuteToolAsync("CreateFileGDB_management", Geoprocessing.MakeValueArray(directory, name, "Current"));
            if (result)
                return GeoDBStatus.Created;
            else
                return GeoDBStatus.Error;
        }
        public static async Task<bool> ExecuteToolAsync(string tool, IReadOnlyList<string> parameters, Dictionary<string, string> Environments = null)
        {
            try
            {
                return await ArcGIS.Desktop.Framework.Threading.Tasks.QueuedTask.Run(async () =>
                {
                    var cts = new CancellationTokenSource();
                    var results = await Geoprocessing.ExecuteToolAsync(tool, parameters, Environments, cts.Token, (eventName, o) =>
                    {
                        System.Diagnostics.Debug.WriteLine($@"GP event: {eventName} {o}");
                    }, GPExecuteToolFlags.GPThread);
                    // Report the tool's actual outcome instead of always returning true.
                    return !results.IsFailed;
                });
            }
            catch (Exception ex)
            {
                MessageBox.Show(ex.ToString());
                return false;
            }
        }
    }
RichRuh
Esri Regular Contributor

Matt--

I created a simple feature class in a file geodatabase and executed the following code in an add-in.

private string InsertRecords()
{
  var geodb = new ArcGIS.Core.Data.Geodatabase(new ArcGIS.Core.Data.FileGeodatabaseConnectionPath(new Uri("path to file geodatabase")));

  var fc = geodb.OpenDataset<FeatureClass>("InsertionTest");

  var fcDefinition = fc.GetDefinition();
  string shapeFieldName = fcDefinition.GetShapeField();

  var buffer = fc.CreateRowBuffer();

  DateTime start = DateTime.Now;

  for (int i = 0; i < 1000; ++i)
  {
    buffer["Name"] = "a name";
    buffer["WidgetCount"] = i;
    buffer[shapeFieldName] = ArcGIS.Core.Geometry.MapPointBuilder.CreateMapPoint(i * .01, 0, 0);
    fc.CreateRow(buffer).Dispose();
  }

  DateTime end = DateTime.Now;
  TimeSpan elapsedTime = end - start;

  fc.Dispose();
  geodb.Dispose();

  return string.Format("{0} milliseconds", elapsedTime.Milliseconds);
}

The only real change I made to your code (besides my original set of suggestions) was to call Dispose() on the row returned from FeatureClass.CreateRow().

I ran the code 5 times, and it averaged 608 milliseconds, much faster than the 10 seconds you are seeing.  I would suggest submitting an incident to technical support; maybe they can determine if there is something wrong with your file geodatabase.

As for the geoprocessing questions, I'm the product engineer for the Geodatabase Pro SDK, and this is outside of my area of expertise.  Hopefully someone on the Geoprocessing team can comment.

--Rich

MatthewJackson3
New Contributor II

Rich,

I believe:

string.Format("{0} milliseconds", elapsedTime.Milliseconds);

should be:

string.Format("{0} milliseconds", elapsedTime.TotalMilliseconds);

Can you re-run your case with that change and let me know what kind of times you are seeing? In the meantime I will start a new geodatabase and see if somehow something has gotten corrupted in the one I'm using. Appreciate your help getting this sorted out...
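To illustrate why the two properties differ, here is a minimal standalone sketch (plain .NET, no ArcGIS references). The class name TimingDemo, the Measure helper, and the 10500 ms interval are made up for illustration only. TimeSpan.Milliseconds returns just the millisecond component of the interval (0-999), while TimeSpan.TotalMilliseconds returns the whole interval expressed in milliseconds. The sketch also uses Stopwatch rather than subtracting two DateTime.Now values, which is generally the safer choice for benchmarking.

```csharp
using System;
using System.Diagnostics;

public static class TimingDemo
{
    // Hypothetical helper: times an action with Stopwatch and returns the elapsed interval.
    public static TimeSpan Measure(Action action)
    {
        var stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    public static void Main()
    {
        // Illustrative fixed interval of 10.5 seconds (not a measured value).
        TimeSpan elapsed = TimeSpan.FromMilliseconds(10500);

        // Millisecond component only: the 10 whole seconds are dropped.
        Console.WriteLine(elapsed.Milliseconds);       // 500

        // Entire interval expressed in milliseconds.
        Console.WriteLine(elapsed.TotalMilliseconds);  // 10500

        // Timing real work: report TotalMilliseconds, not Milliseconds.
        TimeSpan measured = Measure(() =>
        {
            long sum = 0;
            for (int i = 0; i < 1000000; ++i) sum += i;
        });
        Console.WriteLine("{0} milliseconds", measured.TotalMilliseconds);
    }
}
```

Any run longer than one second will be under-reported if Milliseconds is used in place of TotalMilliseconds.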

Thanks,

Matt

RichRuh
Esri Regular Contributor

Oops, sorry about that.  The numbers for my 5 runs vary between 2438 and 2874, with an average of 2572 milliseconds.
