Hello,
I have the following script. It works as expected, but it takes around 1.5 seconds to run each time. It doesn't seem to matter whether it's running against 20 parcels or 5000. Are there any suggestions as to how I can improve the performance of this script?
Thanks
import arcpy
import numpy as np

try:
    arcpy.env.overwriteOutput = True
    #Create in memory tables:
    arcpy.CreateTable_management("in_memory", "tableSelRecs")
    arcpy.AddField_management(r'in_memory\tableSelRecs', "USE_CATEGORY", "TEXT", field_length=35)
    arcpy.AddField_management(r'in_memory\tableSelRecs', "SHAPE_Area", "Double")
    arcpy.CreateTable_management("in_memory", "tableSumRecs")
    arcpy.AddField_management(r'in_memory\tableSumRecs', "USE_CATEGORY", "TEXT", field_length=35)
    arcpy.AddField_management(r'in_memory\tableSumRecs', "SHAPE_Area", "Double")
    mxd = arcpy.mapping.MapDocument('CURRENT')
    df = mxd.activeDataFrame
    totalArea = 0
    totalResArea = 0
    totalParcels = 0
    totalResParcels = 0
    lu_layer = arcpy.mapping.ListLayers(mxd, "LandUse_UseCategory", df)[0]
    #use da.SearchCursor to access necessary fields from selected records
    with arcpy.da.SearchCursor(lu_layer,['USE_CATEGORY','SHAPE@AREA']) as cursor:
        #Setup insert cursor to insert selected polygons into tableSelRecs
        insCursor = arcpy.da.InsertCursor(r'in_memory\tableSelRecs', ['USE_CATEGORY','SHAPE_Area'])
        #Insert selected rows into insCursor;
        #Get total area of selected records;
        #Get a count of total parcels
        for row in cursor:
            if row[0] != 'Invalid':
                insCursor.insertRow((row[0],row[1]))
                totalArea += row[1]
                totalParcels += 1
                #If the block is of a 'Residential' USE_CATEGORY type then add that value into the Residential DI calculation variable
                if row[0] in('Residential - Low Density','Residential - Medium Density','Residential - High Density'):
                    totalResArea += row[1]
                    totalResParcels += 1
    #use summarystatistics against tableSelRecs to get totals per USE_CATEGORY and write results to tableSumRecs
    arcpy.Statistics_analysis(r'in_memory\tableSelRecs', r'in_memory\tableSumRecs', [["SHAPE_Area", "SUM"]], "USE_CATEGORY")
    with arcpy.da.SearchCursor(r'in_memory\tableSumRecs',['USE_CATEGORY','SUM_SHAPE_Area']) as selCur:
        #list to store calculation for each USE_CATEGORY value used in DI calculation
        interimValue = [0]
        #list to store calculation for each USE_CATEGORY value used in RDI calculation
        interimResValue = [0]
        for sRow in selCur:
            arcpy.AddMessage('USE_CATEGORY: {}; AREA: {}'.format(sRow[0],sRow[1]))
            #divide the USE_CATEGORY area by the total community area then square that number & append it to the list
            interimValue.append(np.square(sRow[1]/totalArea))
            if sRow[0] in('Residential - Low Density','Residential - Medium Density','Residential - High Density'):
                interimResValue.append(np.square(sRow[1]/totalResArea))
        #sum the values in the list then subtract that value from 1 to get the DI for the selected blocks
        DI = 1 - sum(interimValue)
        #sum the values in the list then subtract that value from 1 to get the RDI for the selected blocks
        RDI = 1 - sum(interimResValue)
        arcpy.AddMessage('Diversity Index: ' + str(DI))
        arcpy.AddMessage('Residential Diversity Index: ' + str(RDI))
        arcpy.AddMessage('Total Blocks Selected: ' + str(totalParcels))
        arcpy.AddMessage('Total Blocks Area: ' + str(totalArea))
        arcpy.AddMessage('Total Res Blocks Selected: ' + str(totalResParcels))
        arcpy.AddMessage('Total Res Blocks Area: ' + str(totalResArea))
        arcpy.AddMessage('NOTE: Any blocks with a use category of Invalid are not included in any calculations.')
    #Delete in memory tables
    arcpy.Delete_management(r'in_memory\tableSelRecs')
    arcpy.Delete_management(r'in_memory\tableSumRecs')
except arcpy.ExecuteError:
    print(arcpy.GetMessages(2))
Lots of 'speed' issues this week, so
Suggestions
more questions, but this will be a start.
Thanks Dan for the comments and suggestions.
Did any of those suggestions in particular make a significant difference in the performance of your script?
Hi Blake, yes. I found that removing the reference to the mxd file and instead using parameters to pass in the feature layer brought the overall run time down to around 0.7 seconds. I think the calculations themselves run very fast; it's just the overhead of getting there.
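Since most of the time seems to be fixed overhead rather than calculation, one further thing that might be worth trying is to skip the in_memory tables and Statistics_analysis altogether and accumulate the per-category areas in a plain dictionary. The following is only a sketch of that idea, not something anyone in this thread has timed. The function name `diversity_indices` and the `RESIDENTIAL` constant are made up for illustration; the formula (1 minus the sum of squared area shares) is the one used in the original script.

```python
from collections import defaultdict

# Hypothetical helper names; the category strings come from the original script.
RESIDENTIAL = {
    'Residential - Low Density',
    'Residential - Medium Density',
    'Residential - High Density',
}


def diversity_indices(rows):
    """Return (DI, RDI) for an iterable of (use_category, area) pairs.

    Rows with a use category of 'Invalid' are ignored, as in the
    original script.
    """
    # Sum the area per USE_CATEGORY in one pass (replaces Statistics_analysis).
    area_by_cat = defaultdict(float)
    for category, area in rows:
        if category != 'Invalid':
            area_by_cat[category] += area

    total_area = sum(area_by_cat.values())
    res_areas = [a for c, a in area_by_cat.items() if c in RESIDENTIAL]
    total_res_area = sum(res_areas)

    # 1 minus the sum of squared area shares.
    di = 1 - sum((a / total_area) ** 2 for a in area_by_cat.values())
    rdi = 1 - sum((a / total_res_area) ** 2 for a in res_areas)
    return di, rdi


# Made-up sample rows standing in for cursor output.
sample_rows = [
    ('Residential - Low Density', 2.0),
    ('Residential - High Density', 2.0),
    ('Commercial', 4.0),
    ('Invalid', 10.0),
]
di, rdi = diversity_indices(sample_rows)
print('Diversity Index: {}'.format(di))
print('Residential Diversity Index: {}'.format(rdi))
```

In the script tool, the rows could presumably come straight from the existing `arcpy.da.SearchCursor(lu_layer, ['USE_CATEGORY', 'SHAPE@AREA'])`, since each cursor row is already a (category, area) pair. Whether this is noticeably faster than the in_memory approach would need to be measured.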
"...but it takes around 1.5 seconds to run each time"
Is that correct? Are you looking to have it run even faster than 1.5 seconds?
Good point...
What IDE are you using?
Some have the option to do 'module reloading' every time a script is run. Check your IDE to see.
That will affect the timing a whole lot, and you will need a decorator to time individual functions (I can provide one if you do a lot of testing).
I have no idea what a decorator is.
A good point. It just seems odd to me that it's no faster when run with a small number of parcels.
Chris... for decorators, you need to have functions. Here is an example script setup that basically returns a list of numbers. I put a sleep in there to get a meaningful time back. The if __name__ == '__main__' section calls and runs the script. If your IDE uses IPython (or similar), you can use things like %timeit to time lines of code or functions. Decorators are selective in that you can choose to 'decorate' just the functions that you want (the @time_deco line is the 'decorator', a special line placed directly above a function definition).
def time_deco(func):  # timing originally
    """timing decorator function
    :print("\n print results inside wrapper or use <return> ... ")
    """
    import time
    from functools import wraps

    @wraps(func)
    def wrapper(*args, **kwargs):
        t0 = time.perf_counter()  # start time
        result = func(*args, **kwargs)  # ... run the function ...
        t1 = time.perf_counter()  # end time
        dt = t1 - t0
        print("\nTiming function for... {}".format(func.__name__))
        print("  Time: {: <8.2e}s for {:,} objects".format(dt, len(result)))
        return result  # return the result of the function
        # return dt  # ... or return the delta time instead (only one return can run)
    return wrapper


@time_deco
def _demo():
    """
    : -
    """
    import time
    time.sleep(1)
    result = [1, 2, 3]
    return result


# ----------------------------------------------------------------------
# __main__ .... code section
if __name__ == "__main__":
    """Optionally...
    : - print the script source name.
    : - run the _demo
    """
    # print("Script... {}".format(script))
    result = _demo()
Now if you don't want a full-fledged timer, examine the lines in wrapper from t0 = time.perf_counter() to dt = t1 - t0: you can start, stop, get the time difference and print the result out, just like the decorator. time.perf_counter is the preferred counter on Windows.
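As a rough illustration of that start/stop approach, the sketch below wraps the same `time.perf_counter` calls in a small context manager so that separate phases of a script can be timed independently. The names `timed` and `timings` are invented for this example, and the two phases are stand-ins (a short sleep for fixed setup overhead, a sum for the calculation), not the actual arcpy calls.

```python
import time
from contextlib import contextmanager

timings = {}  # label -> elapsed seconds (hypothetical name)


@contextmanager
def timed(label):
    """Record how long the enclosed block takes under the given label."""
    t0 = time.perf_counter()  # start time
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - t0  # delta time


# Stand-in for fixed overhead such as opening the map document.
with timed("setup"):
    time.sleep(0.05)

# Stand-in for the per-parcel calculation.
with timed("calculation"):
    total = sum(x * x for x in range(1000))

for label, dt in timings.items():
    print("{}: {:.4f}s".format(label, dt))
```

Timing the phases separately like this is one way to check whether the run time is dominated by a fixed cost, which would explain why 20 parcels take about as long as 5000.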