Does creating a FeatureLayer speed up arcpy processing? Looks like it...

Discussion created by Hornbydd on Oct 4, 2013
Latest reply on Oct 4, 2013 by csny490
All,

In a recent thread discussed here it was suggested that creating a FeatureLayer from a FeatureClass can improve the performance of arcpy. It was clear from the subsequent discussions that this was not necessarily accepted by others and I was skeptical as I have never seen any reference to this "top tip".

So with time to kill I set out to test this and I am reporting my findings for others to mull over.

I am running ArcGIS 10.2 on Windows Vista and ran everything in PyScripter. Before each run I reset the Python interpreter window and restored the source dataset so the conditions were the same. I had closed all other applications and did not touch the keyboard whilst the code ran.

I created a point shapefile with 1000 random points. My test code simply added a field and then calculated a constant value into this new field, repeating these two steps 49 times. I recorded the start and end times so I could work out how long each run took, and I repeated the test 10 times for each scenario. My null hypothesis is that there is no performance difference.
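The timing approach described above can be sketched as a small harness. This is only an illustration, not the code used in the test: `time_run`, `time_repeated` and `dummy_workload` are hypothetical names, and the dummy workload stands in for the arcpy AddField/CalculateField loop so the sketch runs without ArcGIS. It uses `time.time()` differences, which give sub-second elapsed times rather than the printed clock times.

```python
import time


def time_run(workload):
    """Return the elapsed wall-clock seconds for one call of workload()."""
    start = time.time()
    workload()
    return time.time() - start


def time_repeated(workload, repeats=10):
    """Run workload() 'repeats' times and return a list of elapsed times."""
    return [time_run(workload) for _ in range(repeats)]


def dummy_workload():
    # Hypothetical stand-in for the 49 AddField/CalculateField iterations.
    total = 0
    for i in range(1, 50):
        total += i
    return total


timings = time_repeated(dummy_workload, repeats=10)
mean_time = sum(timings) / len(timings)
print("Mean elapsed seconds over %d runs: %.6f" % (len(timings), mean_time))
```

In a real test the dataset would still need to be restored between runs, because the fields added by one run would otherwise already exist in the next.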

The two test scenarios were:

  • Access the source dataset by its full path name, which is what you typically see in many examples in the ESRI help

  • Create a FeatureLayer first and use that instead of the full path name

My code for accessing the full path was:

import arcpy
import time


print "START TIME = " + time.asctime()


# Source dataset to update
fc = r"C:\Scratch\TestLayer.shp"


# Create a list of numbers from 1 to 49 (range excludes the stop value of 50)
l = range(1,50,1)


for i in l:
    # Create a field name and add it to dataset
    name = "F_" + str(i)
    print "Processing "  + name
    arcpy.AddField_management(fc,name,"LONG")


    # Populate field with a constant 999
    arcpy.CalculateField_management(fc,name,"999","VB")


print "END TIME = " + time.asctime()


My code for accessing a FeatureLayer was:

import arcpy
import time


print "START TIME = " + time.asctime()


# Source dataset to update
fc = r"C:\Scratch\TestLayer.shp"
fl = "TestLayer"
arcpy.MakeFeatureLayer_management(fc,fl)


# Create a list of numbers from 1 to 49 (range excludes the stop value of 50)
l = range(1,50,1)


for i in l:
    # Create a field name and add it to dataset
    name = "F_" + str(i)
    print "Processing "  + name
    arcpy.AddField_management(fl,name,"LONG")


    # Populate field with a constant 999
    arcpy.CalculateField_management(fl,name,"999","VB")


print "END TIME = " + time.asctime()


Across the 10 test runs of each scenario, the mean run times were:

  • full path name: 22 seconds

  • FeatureLayer: 19.5 seconds

I ran a Two-Sample T-Test in Minitab. The difference is statistically significant, so I reject my null hypothesis.

Two-sample T for fl vs fc

     N    Mean  StDev  SE Mean
fl  10  19.500  0.527     0.17
fc  10  22.000  0.816     0.26

Difference = mu (fl) - mu (fc)
Estimate for difference:  -2.500
95% CI for difference:  (-3.155, -1.845)
T-Test of difference = 0 (vs not =): T-Value = -8.13  P-Value = 0.000  DF = 15
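For readers without Minitab, the output above can be roughly cross-checked from the summary statistics alone. The following is a minimal sketch, assuming Minitab ran the unpooled (Welch) form of the test, which DF = 15 rather than 18 suggests. The variable names are illustrative, and the critical value 2.131 is taken from a standard t-table (two-sided 95%, 15 degrees of freedom) rather than computed.

```python
import math

# Summary statistics copied from the Minitab output
n_fl, mean_fl, sd_fl = 10, 19.5, 0.527
n_fc, mean_fc, sd_fc = 10, 22.0, 0.816

# Squared standard error of each sample mean
var_fl = sd_fl ** 2 / n_fl
var_fc = sd_fc ** 2 / n_fc

# Welch t statistic for the difference mu(fl) - mu(fc)
diff = mean_fl - mean_fc
se_diff = math.sqrt(var_fl + var_fc)
t_value = diff / se_diff

# Welch-Satterthwaite degrees of freedom, truncated to an integer
df = (var_fl + var_fc) ** 2 / (
    var_fl ** 2 / (n_fl - 1) + var_fc ** 2 / (n_fc - 1)
)
df_truncated = int(df)

# Assumed table value: t(0.975) with 15 degrees of freedom
T_CRIT_95_DF15 = 2.131
ci_low = diff - T_CRIT_95_DF15 * se_diff
ci_high = diff + T_CRIT_95_DF15 * se_diff

print("T-Value = %.2f, DF = %d" % (t_value, df_truncated))
print("95%% CI for difference = (%.3f, %.3f)" % (ci_low, ci_high))
```

Small differences from the Minitab figures are to be expected, because the inputs here are the rounded means and standard deviations printed in the output rather than the raw run times.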

So for this simple scenario, creating a FeatureLayer first gave a statistically significant improvement in performance: about 2.5 seconds, or roughly 11%, over the 49 iterations...

Interesting hey?
