performance of SearchCursor and da.SearchCursor

09-23-2013 04:56 PM
ZachLiu1
Occasional Contributor II
I set up a comparison between SearchCursor and da.SearchCursor. The code is like this:

import arcpy, time

shp = r"C:\Users\THINK\Desktop\python\Data\TOWNS.shp"

#da SearchCursor
sTime = time.clock()
rows = arcpy.da.SearchCursor(shp, ["TOWN", "TOTAL_SQMI"])
for row in rows:
    town = row[0]
    area = row[1]
    print ("The area of %s is %s sq miles."%(town, area))

del row, rows

eTime = time.clock()
tDiff = eTime - sTime
print "da SearchCursor uses %s seconds."%(tDiff)

#SearchCursor
sTime = time.clock()
rows = arcpy.SearchCursor(shp)
for row in rows:
    town = row.getValue("TOWN")
    area = row.getValue("TOTAL_SQMI")
    print ("The area of %s is %s sq miles."%(town, area))
del row, rows

eTime = time.clock()
tDiff = eTime - sTime
print "SearchCursor uses %s seconds."%(tDiff)


I assumed da.SearchCursor would be faster, but it took more than 2 seconds while SearchCursor finished in just 0.5 seconds.
Is there something wrong with the test?

Accepted Solutions
Luke_Pinner
MVP Regular Contributor
A few suggestions:

  • Take the print statement out of the loop; it adds a big overhead.

  • There's overhead in cursor setup, so take that out of the timing.

  • Pass a field list to arcpy.SearchCursor, as you are doing with da.SearchCursor, i.e. arcpy.SearchCursor(shp, "", "", "TOWN;TOTAL_SQMI").

  • In case there's some caching going on behind the scenes, put the tests in separate scripts (one that times the arcpy cursor and one that times the da cursor).

  • Use the timeit module instead, example below:

if __name__ == '__main__':
    import timeit
    shp = r"D:\TEMP\test.shp"
    print(timeit.timeit('for row in rows:id = row[0]',
                        number=10,  # run the statement 10x; timeit returns the total time
                        setup='import arcpy; rows = arcpy.da.SearchCursor(r"%s", ["PNTID"])'%shp))
    print(timeit.timeit('for row in rows:id = row.getValue("PNTID")',
                        number=10,  # run the statement 10x; timeit returns the total time
                        setup='import arcpy; rows = arcpy.SearchCursor(r"%s", "", "", "PNTID")'%shp))
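The setup-versus-loop separation suggested above can also be done by hand. The sketch below is only an illustration and not part of the original answer: `time_cursor`, `make_cursor` and `fake_cursor` are hypothetical names, and a plain Python iterator stands in for the cursor so the snippet runs without arcpy. It builds a fresh cursor for every run, on the assumption that a cursor behaves like an iterator and may already be consumed after its first pass.

```python
from timeit import default_timer


def time_cursor(make_cursor, read_row, runs=10):
    """Time cursor creation and row iteration separately.

    make_cursor: zero-argument callable returning a fresh iterable of rows.
    read_row:    callable applied to every row (keep it cheap, no printing).
    Returns two lists of seconds, one entry per run: (setup_times, loop_times).
    """
    setup_times, loop_times = [], []
    for _ in range(runs):
        t0 = default_timer()
        rows = make_cursor()      # setup cost only
        t1 = default_timer()
        for row in rows:          # iteration cost only
            read_row(row)
        t2 = default_timer()
        setup_times.append(t1 - t0)
        loop_times.append(t2 - t1)
    return setup_times, loop_times


def fake_cursor():
    """Stand-in for a cursor: yields 1000 one-field rows (hypothetical data)."""
    return iter([(i,) for i in range(1000)])


values = []
setup_times, loop_times = time_cursor(
    fake_cursor, lambda row: values.append(row[0]), runs=3)
print("mean setup: %.6f s" % (sum(setup_times) / len(setup_times)))
print("mean loop:  %.6f s" % (sum(loop_times) / len(loop_times)))
```

With arcpy available, `make_cursor` could be something like `lambda: arcpy.da.SearchCursor(shp, ["PNTID"])`, which keeps the two costs in separate lists instead of folding them into one number.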



EDIT: Some performance tests on 1000 points, getting the value of a single field. Times are in seconds, as returned by timeit for number=10.

* Printing field value in for loop
* Not passing field list to arcpy.SearchCursor
* searchcursor setup is included in timing
1. da.SearchCursor:    4.1257697098717925
2. arcpy.SearchCursor: 6.2084034116828475

* Not printing field value in for loop
* Not passing field list to arcpy.SearchCursor
* searchcursor setup is included in timing
3. da.SearchCursor:    0.07970193086012323
4. arcpy.SearchCursor: 2.196710119083276

* Not printing field value in for loop
* Not passing field list to arcpy.SearchCursor
* searchcursor setup is not included in timing
5. da.SearchCursor: 0.00738222613381
6. arcpy.SearchCursor: 0.160204482864

* Not printing field value in for loop
* Passing field list to arcpy.SearchCursor
* searchcursor setup is not included in timing
7. da.SearchCursor: 0.0075200788871
8. arcpy.SearchCursor: 0.118301085773

The above was generated using the following code:
if __name__ == '__main__':
    import timeit
    shp = r"D:\TEMP\test.shp"

    # 1-2: print inside the loop, cursor setup included in the timing
    s='''
rows = arcpy.da.SearchCursor(r"%s", ["PNTID"])
for row in rows:
    print row[0]'''%shp
    a=timeit.timeit(s,number=10,setup='import arcpy')

    s='''
rows = arcpy.SearchCursor(r"%s")
for row in rows:
    print row.getValue("PNTID")'''%shp
    b=timeit.timeit(s,number=10,setup='import arcpy')

    # 3-4: no print, cursor setup included in the timing
    s='''
rows = arcpy.da.SearchCursor(r"%s", ["PNTID"])
for row in rows:
    id=row[0]'''%shp
    c=timeit.timeit(s,number=10,setup='import arcpy')

    s='''
rows = arcpy.SearchCursor(r"%s")
for row in rows:
    id=row.getValue("PNTID")'''%shp
    d=timeit.timeit(s,number=10, setup='import arcpy')

    # 5-6: no print, cursor setup excluded, no field list for arcpy.SearchCursor
    e=timeit.timeit('for row in rows:id = row[0]',
                        number=10,  # run the statement 10x; timeit returns the total time
                        setup='import arcpy; rows = arcpy.da.SearchCursor(r"%s", ["PNTID"])'%shp)
    f=timeit.timeit('for row in rows:id = row.getValue("PNTID")',
                        number=10,  # run the statement 10x; timeit returns the total time
                        setup='import arcpy; rows = arcpy.SearchCursor(r"%s")'%shp)

    # 7-8: no print, cursor setup excluded, field list passed to arcpy.SearchCursor
    g=timeit.timeit('for row in rows:id = row[0]',
                        number=10,  # run the statement 10x; timeit returns the total time
                        setup='import arcpy; rows = arcpy.da.SearchCursor(r"%s", ["PNTID"])'%shp)
    h=timeit.timeit('for row in rows:id = row.getValue("PNTID")',
                        number=10,  # run the statement 10x; timeit returns the total time
                        setup='import arcpy; rows = arcpy.SearchCursor(r"%s", "", "", "PNTID")'%shp)

    print '1. da.SearchCursor:',a
    print '2. arcpy.SearchCursor:',b
    print '3. da.SearchCursor:',c
    print '4. arcpy.SearchCursor:',d
    print '5. da.SearchCursor:',e
    print '6. arcpy.SearchCursor:',f
    print '7. da.SearchCursor:',g
    print '8. arcpy.SearchCursor:',h
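To put the eight timings in perspective, it may help to look at the ratio between the two cursors in each scenario rather than at the raw seconds. The snippet below is a small worked calculation on the numbers posted above; the scenario labels and the `results` and `ratios` names are my own shorthand, not part of the original output.

```python
# Timings in seconds, copied from results 1-8 above as (da, arcpy) pairs.
results = [
    ("print, no field list, setup timed",       4.1257697098717925, 6.2084034116828475),
    ("no print, no field list, setup timed",    0.07970193086012323, 2.196710119083276),
    ("no print, no field list, setup excluded", 0.00738222613381, 0.160204482864),
    ("no print, field list, setup excluded",    0.0075200788871, 0.118301085773),
]

ratios = []
for scenario, da_time, arcpy_time in results:
    ratio = arcpy_time / da_time          # how many times slower arcpy.SearchCursor was
    ratios.append(ratio)
    print("%-42s arcpy/da = %.1fx" % (scenario, ratio))
```

The first ratio is much smaller than the others, which is consistent with the point that printing inside the loop dominates the measurement and hides the difference between the cursors.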

4 Replies
ZachLiu1
Occasional Contributor II
This is a great response, thanks!
StacyRendall1
Occasional Contributor III
I would also recommend repeating the test a number of times and averaging; with (non-Arc) database applications I have seen a lot of variability.

Cursor setup time should be noted as well! Most of the time users are probably doing small calculations, so if da.SearchCursor takes a lot more time to set up, its faster iteration may well be pointless...
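One way to repeat and average, sketched here with the standard library only: `timeit.repeat` returns one total per repetition, so the spread between repetitions becomes visible. The `summarize` helper is a hypothetical name, and a list of tuples stands in for the cursor rows so the snippet runs without arcpy.

```python
import timeit


def summarize(stmt, setup="pass", repeat=5, number=10):
    """Run stmt `number` times per repetition and report per-run statistics."""
    totals = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    per_run = [total / number for total in totals]   # seconds per single run
    return {
        "min": min(per_run),
        "mean": sum(per_run) / len(per_run),
        "max": max(per_run),
    }


# Stand-in data: 1000 one-field rows in a list, which can be iterated repeatedly.
stats = summarize("for row in rows: value = row[0]",
                  setup="rows = [(i,) for i in range(1000)]")
print("min %.6f s, mean %.6f s, max %.6f s" % (stats["min"], stats["mean"], stats["max"]))
```

A large gap between the minimum and the maximum would suggest the kind of variability mentioned above, in which case a single measurement is not very trustworthy.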
Luke_Pinner
MVP Regular Contributor
The timeit code I posted runs each loop 10 times, i.e. "number=10", and prints the total time for those runs. The timeit default number of repetitions is 1,000,000. I just left it at 10 as the first set took so long.

SearchCursor setup overhead is relevant, but it should be timed separately rather than included when timing loops. Anyway, da.SearchCursor setup is much faster than arcpy.SearchCursor setup:

if __name__ == '__main__':
    import timeit
    shp = r"D:\TEMP\test.shp"

    s='rows = arcpy.da.SearchCursor(r"%s", ["PNTID"])'%shp
    c=timeit.timeit(s,number=100,setup='import arcpy')

    s='rows = arcpy.SearchCursor(r"%s", "", "", "PNTID")'%shp
    d=timeit.timeit(s,number=100,setup='import arcpy')

    print '1. da.SearchCursor:',c
    print '2. arcpy.SearchCursor:',d

1. da.SearchCursor: 0.032812795193
2. arcpy.SearchCursor: 0.497164226788
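Because timeit reports the total for all runs, the two figures above cover 100 cursor creations each. A quick calculation, using only the posted numbers (the `totals` and `per_setup_ms` names are mine), converts them to an approximate cost per cursor:

```python
number = 100   # value passed to timeit in the code above

# Total seconds for `number` cursor creations, copied from the output above.
totals = {
    "arcpy.da.SearchCursor": 0.032812795193,
    "arcpy.SearchCursor": 0.497164226788,
}

per_setup_ms = {}
for name, total in totals.items():
    per_setup_ms[name] = 1000.0 * total / number   # milliseconds per cursor
    print("%s: about %.2f ms per cursor setup" % (name, per_setup_ms[name]))
```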