Hello,
I am running a script via the ArcPro python terminal, with the layers I am using in my script open in the project. sTabName is a string which contains the name of the table I have opened in ArcPro.
I have a table which has 200k records, I wish to update and insert <20 items within this table.
I need to firstly use an UpdateCursor to update a few rows, then a InsertCursor to insert some rows to replace those I updated.
My code currently:
# Create a table view to use within the edit session
arcpy.SelectLayerByAttribute_management(sTabName, "NEW_SELECTION", where_clause=sOldWhere)
TableViewCount = int(arcpy.GetCount_management(sTabName).getOutput(0))
print(f"sTabName with sql querey: {sOldWhere}, applied. TableViewCount contains: {TableViewCount} rows")
# Start editing session
edit = None # Initialize edit variable
edit = arcpy.da.Editor(SDE_Workspace_Path)
edit.startEditing(False, True)
edit.startOperation()
# Retire the old assessments
print("Retiring old assessment(s)...")
count = 0
with arcpy.da.UpdateCursor(sTabName, ("RecordStatus", "Retired")) as oRows:
for oRow in oRows:
oRow[0] = "Retired"
sretired = "Title retired & replaced by {0}".format(",".join(aNewSIDs))
if len(sretired) > 250:
oRow[1] = 'Title retired & replaced by Multiple parcels'
else:
oRow[1] = sretired
oRows.updateRow(oRow)
count += 1
print(f"retiring item {count} of {TableViewCount}")
The Problem:
This seems to take a very long time to open up the UpdateCursor, >1-2 minutes, and the search cursor which comes after it (not included in code sample).
How can I improve efficiency given that I am only updating the specific items from where_clause=sOldWhere?
I have tried using:
arcpy.MakeTableView_management(sTable_path, "TableView", where_clause=sOldWhere)
instead of what is at Line 2, which does seem to run faster, but this creates a tableview item which I then have to remove (to clean up) which adds more processing time once again.
I wish to reduce the processing time of this script as much as possible as it is manually run multiple times over a day to complete a task.
@RhettZufelt wrote:Have not tested this with 3.x, but with 2.7, if I commented out any print statements, my scripts will run in a fraction of the time.
So, in most all my scripts, I set a debugPrint variable at the top, then I have replaced the print statements with:
debugPrint = True
.
.
.
if debugPrint:print('What ever you want')
So, normally, debugPrint = False so that the script runs much faster, but If I'm having issues, I toggle it to True so I can get the printouts.
R_
Thanks! This is a very elegant way to add debug messages rather than commenting them out like I do at the moment!
Try this, change line 33 to the common field and line 9 & 12.
I haven't tested it yet.
import arcpy
import os
from openpyxl import load_workbook
# Set the workspace (folder or geodatabase) where the Excel file is stored
arcpy.env.workspace = "C:/Temp"
# Path to the Excel file
excel_file = "C:/Temp/file.xlsx"
# Path to the feature class
feature_class = r"feature class"
# Load the Excel workbook
workbook = load_workbook(excel_file)
# Get the list of worksheet names
worksheet_names = workbook.sheetnames
# Loop through each worksheet
for worksheet_name in worksheet_names:
# Create a unique name for the table view by replacing problematic characters
table_view_name = f"View_{worksheet_name.replace('$', '_')}"
# Create a full path to the Excel worksheet
worksheet_path = os.path.join(excel_file, f"${worksheet_name}")
# Create a table view from the Excel sheet
arcpy.MakeTableView_management(in_table=worksheet_path, out_view=table_view_name)
print(f"Table view created for {worksheet_name}")
# Define the join field name, common between the feature class and the Excel sheet
join_field = "Field" # change field
# Add a join between the feature class and the table view
arcpy.AddJoin_management(in_layer_or_view=feature_class, in_field=join_field,
join_table=table_view_name, join_field=join_field,
join_type="KEEP_COMMON")
# Optionally, you can export the joined data to a new feature class
output_feature_class = f"Joined_{worksheet_name.replace('$', '_')}"
arcpy.CopyFeatures_management(in_features=feature_class, out_feature_class=output_feature_class)
print(f"Joined feature class created: {output_feature_class}")
# Remove the join to clean up before the next iteration
#arcpy.RemoveJoin_management(feature_class)
# Delete the table view to avoid locks and clean up
#arcpy.Delete_management(table_view_name)
Hi @EdX1,
Have you tried saving the script to a toolbox? I write my scripts with ArcGIS notebooks and I've noticed that when I run them in Notebooks they take 4x longer to complete, I haven't tried the python window, but from my understanding it shouldn't bog down the scripts unless there is something weird going on.
Anyway I have a feature class that has 3,000 plus records which searches through all of them and updates all of them and when I run it from a tool box it takes ~30 minutes.
I would first try saving your code to a tool box and run it from there then try adapting my code below.
MP.clearSelection() #Clears any selected features
arcpy.SetProgressorLabel("Updating attribute information") #Changes tool progress label
arcpy.SetProgressorPosition(1) #Resets progressor to beginning
arcpy.management.SelectLayerByAttribute(sTabName, where_clause = sOldWhere) #Select features for count
count = arcpy.management.GetCount(sTabName) #Get count of selected features
print("What you want" + str(count))
if int(str(count)) > 0:
i = 0
while i < int(str(count)):
if int((i/int(str(count))*100) > 1:
arcpy.SetProgressorPosition(int((i/int(str(count))*100)) #updates tool progress label
now = datetime.datetime.now() #gets time process started
sTabName.definitionQuery = "Your SQL selection here" #Make it so the search cursor will only have to search the 20 records
Add your edit line here
upcursor = arcpy.da.UpdateCursor(sTabName, ["RecordStatus", "Retired"])
for row in upcursor:
row[0] = "Retired"
sretired = "Title retired & replaced by {0}". format(",".join(aNewSIDs))
if len(sretired)>250:
row[1] = "Title retired & replaced by multiple parcels"
else:
row[1] = sretired
upcursor.updaterow(row)
edit.stopEditing(True) #saves edits and closes edit session
end = datetime.datetime.now() #gets time code ended
elapsed = end - now
print("Updating: " + str(count) + " records from " + str(sTabName.name) + ". Elapse time: " + str(elapsed)) #or something similar
print(What ever you want)
I slightly adapted that for you, but when I run the code above I get about .5 seconds per record. Hope it works for you!
have you tried putting the where clause in the cursors themselves? Might help limit what is being loaded.
R_
Good suggestion! Also allows me to remove the first 4 lines, and does seem to speed it up a small amount.
Thanks.