Which is the better way of using edit session and insert cursors?

BogiBjornsson · ‎09-13-2016

Greetings!

I would like to pose a question about two snippets of Python code. Both take a list of events, each of which contains a latitude, longitude and depth, like so: eventslist = [[lat1, lng1, depth1], [lat2, lng2, depth2] ... [latN, lngN, depthN]], and insert them as points into two separate geodatabase feature classes. I have verified that both snippets work, but I would like to ask which one is, in your opinion, more efficient/correct with regard to where to start and end the edit operation (edit.startOperation()) and where to create and delete the insert cursors.

Also, if you think another solution is preferable, please don't hesitate to suggest it. Thanks in advance!

SNIPPET 1

import arcpy
from arcpy import env

# Start an edit session to insert new events. An edit session is needed because two insert cursors are open at the same time.
edit = arcpy.da.Editor(env.workspace)
edit.startEditing(False, False)

# Iterate over the list of new events; each event contains lat, lng and depth
for event in newevents:
    edit.startOperation()

    cursor = arcpy.da.InsertCursor(inFc, ["SHAPE@", "lat", "lng", "depth"])
    cursor2 = arcpy.da.InsertCursor(archiveFc, ["SHAPE@", "lat", "lng", "depth"])

    lat, lng, depth = (event[0], event[1], event[2])
    location = arcpy.Point(lng, lat, depth)  # Create the 3-D point object
    pointGeom = arcpy.PointGeometry(location)  # Create the shape geometry as a PointGeometry

    row = [pointGeom, lat, lng, depth]
    row2 = [pointGeom, lat, lng, depth]

    cursor.insertRow(row)  # Insert the new event into the long-term class using cursor
    cursor2.insertRow(row2)  # Insert the new event into the archive class using cursor2

    del row, row2
    del cursor, cursor2

    edit.stopOperation()

edit.stopEditing(True)

SNIPPET 2

import arcpy
from arcpy import env

# Start an edit session to insert new events. An edit session is needed because two insert cursors are open at the same time.
edit = arcpy.da.Editor(env.workspace)
edit.startEditing(False, False)
edit.startOperation()

cursor = arcpy.da.InsertCursor(inFc, ["SHAPE@", "lat", "lng", "depth"])
cursor2 = arcpy.da.InsertCursor(archiveFc, ["SHAPE@", "lat", "lng", "depth"])

# Iterate over the list of new events; each event is a list that contains lat, lng and depth
for event in newevents:

    lat, lng, depth = (event[0], event[1], event[2])
    location = arcpy.Point(lng, lat, depth)  # Create the 3-D point object
    pointGeom = arcpy.PointGeometry(location)  # Create the shape geometry as a PointGeometry

    row = [pointGeom, lat, lng, depth]
    row2 = [pointGeom, lat, lng, depth]

    cursor.insertRow(row)  # Insert the new event into the long-term class using cursor
    cursor2.insertRow(row2)  # Insert the new event into the archive class using cursor2

    del row, row2

edit.stopOperation()

del cursor, cursor2

edit.stopEditing(True)

JoshuaBixby · ‎09-13-2016

I would say the answer lies somewhere between your two code snippets. The first code snippet is very risk-averse, i.e., it treats each event as an atomic edit operation. Although this obviously works, and it minimizes how much has to be rolled back if an error occurs, it also incurs the cost of a commit for every single event, i.e., it will be slow.

The second code snippet is possibly risky depending upon the number of events to process because all of the updates will be held before committing. At some point, the rollbacks needed for a large number of records will start to slow everything down.

If the number of events is modest, say in the single-digit thousands, I would stick with snippet #2. If you are talking about updating tens of thousands of records or more, I would consider batching/chunking them into more manageably sized updates.
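As a rough illustration of the batching idea, here is a minimal sketch in plain Python. The helper name chunked and the batch size of 2 are assumptions made for the example, not anything from arcpy or from the snippets above; the sample coordinates are made up.

```python
from itertools import islice


def chunked(events, size):
    """Yield successive lists holding at most `size` events each.

    Hypothetical helper for illustration; not part of arcpy.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(events)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Made-up sample events in the [lat, lng, depth] layout from the question.
sample_events = [
    [63.9, -19.1, 5.0],
    [64.0, -19.2, 7.5],
    [64.1, -19.3, 2.1],
]

for batch in chunked(sample_events, 2):
    # In the real script, each batch would get its own
    # edit.startOperation() ... edit.stopOperation() pair,
    # with the insert cursors created and deleted inside it.
    print(len(batch), "events in this batch")
```

The batch size is the knob to tune: larger batches mean fewer operations, smaller batches mean less to roll back if one fails.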

BogiBjornsson · ‎09-14-2016

EDIT: Just noticed an error in snippet 2: the edit.stopOperation() call should come after the del cursor, cursor2 statement, otherwise you get an error that the cursors have been invalidated because the edit operation has stopped.

Thanks for the input, Joshua. It was very helpful indeed and I'll mark it as the correct answer. However, as you point out, the most efficient way depends on the number of events vs. the possible risk of losing edits if something goes wrong. In my case the number of events is quite modest, so I'll use the code from snippet 2.
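One way to make that ordering hard to get wrong is to open the cursors in a with block that sits inside the operation. The sketch below is only an illustration: the function name insert_in_one_operation and the open_cursors parameter are hypothetical, and it assumes that leaving a with block releases an arcpy insert cursor the same way del does.

```python
from contextlib import ExitStack


def insert_in_one_operation(edit, open_cursors, rows):
    """Insert every row through every cursor inside one edit operation.

    Hypothetical helper. `edit` is assumed to behave like an
    arcpy.da.Editor on which startEditing() has already been called.
    `open_cursors` is a list of zero-argument callables, each returning
    a cursor that can be used in a `with` statement.
    """
    edit.startOperation()
    try:
        # ExitStack closes every cursor when the block ends, which is
        # before stopOperation() runs below.
        with ExitStack() as stack:
            cursors = [stack.enter_context(open_cursor())
                       for open_cursor in open_cursors]
            for row in rows:
                for cursor in cursors:
                    cursor.insertRow(row)
    except Exception:
        # Something failed while inserting: abandon this operation.
        edit.abortOperation()
        raise
    edit.stopOperation()
```

With arcpy, open_cursors could be something like [lambda: arcpy.da.InsertCursor(inFc, fields), lambda: arcpy.da.InsertCursor(archiveFc, fields)], with rows built the same way as in the snippets.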

JoshuaBixby · ‎09-14-2016

I wouldn't say there is a risk of losing edits, just a risk of not getting them committed, along with possible performance impacts if the number of edits in one operation is too large. If you are not working with a large number of records, I think snippet 2 represents a much more common approach and one that will likely perform quicker.