Enumeration of a cursor

BlakeTerhune · 04-09-2015

Xander, I recently unlocked the mysteries of enumeration and noticed your cnt variable does something similar. Because I'm still learning, I thought something like this would work instead:

with arcpy.da.SearchCursor(fc_house, flds_house) as curs_in:
    for cnt, row_in in enumerate([i for i in curs_in], start=1):
        curs_in.reset()  ## Return cursor back to the first row after enumeration
        if cnt % 25 == 0:
            print("Processing connection: {0}".format(cnt))

        pnt1 = row_in[0]
        parcel_id = row_in[1]
        date_sale = row_in[2]
        fire_oid = row_in[3]
        # Continue processing rows in cursor...

After some tinkering, I realized that when you enumerate a cursor like this, it goes through all the rows. The problem is that the cursor object ends with no more rows and you have to call reset() on the cursor to start it back at the first row again. Since there is still more processing to be done, you'll essentially be iterating over all rows in the cursor twice with enumeration instead of once with your original counter variable. Do you think the extra time it takes to create the enumeration of the cursor is worth it in a case like this? Maybe only for cursors with a small number of rows?

XanderBakker · 04-09-2015

Hi Blake, interesting question...

I have branched it to a new thread instead of keeping it as a remark on the thread "add time constraint to near tool".

I am not sure what the benefit of using enumeration would be. If I loop through the cursor using the standard method (for row in cursor) and keep track of a counter, that adds little overhead, I think...
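For comparison, here is a minimal sketch of the two counting styles side by side. The rows list is made-up stand-in data, not the contents of fc_house, so it runs without arcpy. Under that assumption both styles produce the same (count, row) pairs.

```python
# Hypothetical stand-in rows; a real SearchCursor would yield tuples like these.
rows = [("pnt_a", 101), ("pnt_b", 102), ("pnt_c", 103)]

# Style 1: a manual counter kept alongside a plain for loop.
manual = []
cnt = 0
for row in rows:
    cnt += 1
    manual.append((cnt, row))

# Style 2: enumerate supplies the counter, starting at 1.
enumerated = list(enumerate(rows, start=1))

print(manual == enumerated)
```

Which one to use is then mostly a question of readability rather than of results.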

Maybe Dan Patterson, Joshua Bixby or Jake Skinner want to jump in and give their opinion on this?

DanPatterson_Retired · 04-09-2015

I am not one of those Pythonistas who worries about execution time... unless we are talking minutes versus seconds. I address those issues in the following way:

What IS the difference in processing times?
If two methods are offered, which is the easiest to understand conceptually and to explain?
Following from that, which "looks" better? (A classic example is some of the fancy indexing using numpy strides: works great, looks totally bizarre.)
Does one method make it easier to add complementary information at the same time if it is needed later?
Do both yield the same results? This is the key: if they do, then you have added to your arsenal of tools.

So in short, I worry less about one versus the other than about learning and presenting more ways to do the same thing.
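One way to put numbers on the first and last questions is a rough timing sketch like the one below. It uses an in-memory list as a made-up stand-in for the cursor, so it says nothing about actual arcpy read times; the function names are hypothetical and only the relative cost of the three loop styles is being compared.

```python
import timeit

# Made-up data standing in for cursor rows; no arcpy involved.
rows = [(i, i * 2) for i in range(10000)]

def manual_counter():
    # Plain loop with a hand-maintained counter.
    cnt = 0
    total = 0
    for row in rows:
        cnt += 1
        total += cnt
    return total

def with_enumerate():
    # enumerate applied directly to the iterable.
    total = 0
    for cnt, row in enumerate(rows, start=1):
        total += cnt
    return total

def with_list_first():
    # enumerate applied to a list built up front by a comprehension.
    total = 0
    for cnt, row in enumerate([r for r in rows], start=1):
        total += cnt
    return total

for func in (manual_counter, with_enumerate, with_list_first):
    seconds = timeit.timeit(func, number=50)
    print("{0}: {1:.4f} s".format(func.__name__, seconds))
```

All three functions return the same total, so any difference is purely in how long each takes; the timings themselves will vary by machine.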

JoshuaBixby · 04-10-2015

The enumerate function isn't exhausting the cursor, at least not the first time; the list comprehension is doing that, which is why the cursor has to be reset. Try restructuring your code as follows:

with arcpy.da.SearchCursor(fc_house, flds_house) as curs_in:
    # Pass the cursor straight to enumerate; no list, no reset() needed
    for cnt, row_in in enumerate(curs_in, start=1):
        if cnt % 25 == 0:
            print("Processing connection: {0}".format(cnt))
        pnt1 = row_in[0]
        parcel_id = row_in[1]
        date_sale = row_in[2]
        fire_oid = row_in[3]
        # Continue processing rows in cursor...

The enumerate function operates against "a sequence, an iterator, or some other object which supports iteration." The data access cursors support iteration, which is also what allows us to use them with a for statement.
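As a small illustration of that point, the sketch below feeds enumerate a generator instead of a cursor. The fake_cursor function and the fetched list are hypothetical stand-ins, assumed here only to show that enumerate pulls one row at a time rather than reading everything up front.

```python
fetched = []  # records each row as the stand-in cursor hands it out

def fake_cursor():
    """Hypothetical stand-in for a cursor: yields rows one at a time."""
    for row in [("a", 1), ("b", 2), ("c", 3)]:
        fetched.append(row)
        yield row

seen = []
for cnt, row in enumerate(fake_cursor(), start=1):
    # Record the counter next to the number of rows fetched so far.
    seen.append((cnt, len(fetched)))

print(seen)
```

Each pair in seen holds the counter and the number of rows fetched at that moment; the two stay in step, which is what iterating lazily looks like.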

Whereas generator expressions are evaluated lazily, comprehensions are not, which means your list comprehension has to iterate over the entire cursor before it can return any values. That is why you have to reset your cursor before doing any processing. Since cursors already support iteration, just pass them directly to enumerate and don't bother building a list of the entire cursor.
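The eager versus lazy difference can be sketched without arcpy. FakeCursor below is a hypothetical class, not part of any library; it only assumes the behavior described in this thread: one pass over the rows, then nothing more until reset() is called.

```python
class FakeCursor(object):
    """Hypothetical one-pass cursor with a reset() method, for illustration only."""

    def __init__(self, rows):
        self._rows = rows
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._rows):
            raise StopIteration
        row = self._rows[self._pos]
        self._pos += 1
        return row

    next = __next__  # Python 2 spelling of the same method

    def reset(self):
        self._pos = 0

rows = [("a", 1), ("b", 2), ("c", 3)]

# Eager: the list comprehension drains the cursor before it returns.
cur = FakeCursor(rows)
as_list = [r for r in cur]
left_after_list = list(cur)      # nothing left to read

# Lazy: the generator expression reads nothing until it is iterated.
cur.reset()
lazy = (r for r in cur)
first = next(lazy)               # pulls exactly one row from the cursor
left_after_first = list(cur)     # the remaining rows are still available

print(left_after_list)
print(first, left_after_first)
```

After the list comprehension the stand-in cursor has nothing left, while after one step of the generator expression the remaining two rows can still be read.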

BlakeTerhune · 04-13-2015

Nailed it. Thanks Joshua!