All,

In a recent thread discussed here it was suggested that creating a FeatureLayer from a FeatureClass can improve the performance of arcpy. It was clear from the subsequent discussion that this was not necessarily accepted by others, and I was skeptical as I have never seen any reference to this "top tip".

So with time to kill I set out to test this, and I am reporting my findings for others to mull over.

I have ArcGIS 10.2 on a Vista OS and I ran everything in PyScripter. On each run I reset the Python interpreter window and reset the source dataset so the conditions were the same. I had stopped all other applications and did not touch the keyboard whilst the code ran.

I had created a point shapefile with 1000 random points. My test code simply added a field and then calculated a constant value into this new field. The code repeated these steps 49 times. I recorded the start and end times so I could work out how long it took. I repeated the test 10 times for each scenario. My null hypothesis is that there is no performance difference.

The two test scenarios were:
- Access the source dataset by its full path name, which is what you typically see in many examples in the ESRI help
- Create a FeatureLayer first and use that instead of the full path name
My code for accessing the full path was:

import arcpy
import time
print "START TIME = " + time.asctime()
# Source dataset to update
fc = r"C:\Scratch\TestLayer.shp"
# Create a list of numbers from 1 to 49 (range excludes the end value, 50)
l = range(1,50,1)
for i in l:
    # Create a field name and add it to dataset
    name = "F_" + str(i)
    print "Processing " + name
    arcpy.AddField_management(fc,name,"LONG")
    # Populate field with a constant 999
    arcpy.CalculateField_management(fc,name,"999","VB")
print "END TIME = " + time.asctime()
My code for accessing a FeatureLayer was:

import arcpy
import time
print "START TIME = " + time.asctime()
# Source dataset to update
fc = r"C:\Scratch\TestLayer.shp"
fl = "TestLayer"
arcpy.MakeFeatureLayer_management(fc,fl)
# Create a list of numbers from 1 to 49 (range excludes the end value, 50)
l = range(1,50,1)
for i in l:
    # Create a field name and add it to dataset
    name = "F_" + str(i)
    print "Processing " + name
    arcpy.AddField_management(fl,name,"LONG")
    # Populate field with a constant 999
    arcpy.CalculateField_management(fl,name,"999","VB")
print "END TIME = " + time.asctime()
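As an aside, time.asctime() only resolves to whole seconds, so the run times have to be worked out by hand from the two printed stamps. A minimal sketch of an alternative is below, assuming you wrap the loop in a function and time it directly. The names time_call and placeholder_workload are hypothetical and not part of my test scripts; the placeholder just stands in for the AddField/CalculateField loop so the sketch runs without arcpy.

```python
from timeit import default_timer


def time_call(func, *args):
    """Run func(*args) once and return (elapsed_seconds, result)."""
    start = default_timer()
    result = func(*args)
    elapsed = default_timer() - start
    return elapsed, result


def placeholder_workload(n):
    # Hypothetical stand-in for the AddField/CalculateField loop
    return sum(range(n))


elapsed, total = time_call(placeholder_workload, 100000)
print("Elapsed: %.3f s" % elapsed)
```

Calling time_call ten times per scenario and keeping the elapsed values in a list would give the raw numbers for the t-test without reading them off the console.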
Over the 10 test runs, the mean time for running the code was:
- full path name: 22 seconds
- FeatureLayer: 19.5 seconds
I ran a Two-Sample T-Test in Minitab. The difference is significant (P-Value reported as 0.000), which means I reject my null hypothesis.

Two-sample T for fl vs fc
     N    Mean  StDev  SE Mean
fl  10  19.500  0.527     0.17
fc  10  22.000  0.816     0.26
Difference = mu (fl) - mu (fc)
Estimate for difference: -2.500
95% CI for difference: (-3.155, -1.845)
T-Test of difference = 0 (vs not =): T-Value = -8.13 P-Value = 0.000 DF = 15
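For anyone without Minitab, the figures above can be roughly reproduced from the summary statistics alone. This is a sketch, assuming the test is the unpooled (Welch) form, which is consistent with DF = 15 rather than the 18 a pooled test would give for two samples of 10. The critical value 2.131 is taken from a standard t-table for 15 degrees of freedom and is hard-coded as an assumption. The variable names are my own.

```python
import math

# Summary statistics copied from the Minitab output
n_fl, mean_fl, sd_fl = 10, 19.500, 0.527
n_fc, mean_fc, sd_fc = 10, 22.000, 0.816

# Squared standard error of each sample mean
v_fl = sd_fl ** 2 / n_fl
v_fc = sd_fc ** 2 / n_fc

# Difference in means and its standard error (variances not pooled)
diff = mean_fl - mean_fc
se_diff = math.sqrt(v_fl + v_fc)
t_value = diff / se_diff

# Welch-Satterthwaite approximation to the degrees of freedom
df = (v_fl + v_fc) ** 2 / (v_fl ** 2 / (n_fl - 1) + v_fc ** 2 / (n_fc - 1))

# Assumption: two-sided 95% critical value for 15 DF from a t-table
T_CRIT_15 = 2.131
ci_low = diff - T_CRIT_15 * se_diff
ci_high = diff + T_CRIT_15 * se_diff

print("t = %.2f, df = %.1f" % (t_value, df))
print("95%% CI = (%.3f, %.3f)" % (ci_low, ci_high))
```

Because the standard deviations in the output are already rounded, the t-value comes out at roughly -8.14 rather than exactly -8.13, and the degrees of freedom at roughly 15.4, which Minitab reports as 15.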
So for this simple scenario, creating a FeatureLayer first made a statistically significant difference in performance: about 2.5 seconds, or roughly 11%, faster on average. Interesting, hey?