Compare values from rows n and n+1 in a SearchCursor

ZdeněkSoldán · ‎07-27-2016

Hello,

I have a SearchCursor that goes through a specific column in a table. For every row, I need to compare its value with the value in the next row: if they are equal, do one thing; if they aren't, do something else.

import arcpy

mxd = arcpy.mapping.MapDocument("CURRENT")
table = arcpy.mapping.ListTableViews(mxd)[0]
column = "name"
cursor = arcpy.SearchCursor(table)
for row in cursor:
    print(row.getValue(column))

This sample code goes through all rows. But how can I compare the value in row n with the value in row n+1?

Thanks for any help.

JenniferMcCall4 · ‎07-27-2016

Hi Zdeněk,

At the end of the for loop, could you create a new variable (called oldN or something similar) to store the current value? Then, on the next pass through the loop, compare n to oldN before oldN is overwritten at the end of the loop.

ZdeněkSoldán · ‎07-27-2016

Hello Jennifer,

Thanks for the quick response. But I am not sure how to handle the first iteration, because there will be nothing to compare with yet.

JenniferMcCall4 · ‎07-27-2016

You could create a counter variable at the beginning of the script:

counter = 0

Then enter the for loop. If the counter is 0 (the first row), don't compare. If the counter is greater than 0, compare the values. At the bottom of the loop, store the current value and increase the counter:

column = "name"

cursor = arcpy.SearchCursor(table)
counter = 0
for row in cursor:
    n = row.getValue(column)
    if counter > 0:
        # Compare n and oldN in here
        if n == oldN:
            pass  # values are equal: do something
        else:
            pass  # values differ: do something else
    oldN = n  # remember this value for the next row
    counter = counter + 1
    print(n)

JoshuaBixby · ‎07-28-2016

This code is backwards looking, not forwards looking. Although I think structuring the data to support backwards looking is the best approach, if possible, the OP was asking about looking ahead in the data set.
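
For read-only comparisons, a look-back loop can be turned into a look-ahead one by holding each row back until its successor has been read. The following is only a minimal sketch of that idea: with_next is a hypothetical helper, and a plain list stands in for the values a search cursor would return. It would not suit an update cursor, because the row being held back is no longer the cursor's current row by the time the comparison is made.

```python
def with_next(rows, last=None):
    """Yield (current, following) pairs from any iterable in a single pass.

    Each row is held back until its successor has been read; the final row
    is paired with `last` because nothing follows it.
    """
    iterator = iter(rows)
    try:
        current = next(iterator)
    except StopIteration:
        return  # empty input: nothing to yield
    for following in iterator:
        yield current, following
        current = following
    yield current, last


# Stand-in for the values a search cursor would return from the "name" field
names = ["A", "A", "B", "C", "C"]

for name, next_name in with_next(names):
    if name == next_name:
        print("{} matches the next row".format(name))
    else:
        print("{} differs from the next row".format(name))
```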

XanderBakker · ‎07-28-2016

Just a few comments on the provided solution:

If you have access to ArcGIS 10.1 or higher, use the cursors provided in the data access module (arcpy.da) for faster access.
Cursors normally do not provide access to the values in the next 'row'. You can solve this by building a list of values with a list comprehension before running the update cursor.
I added an additional field to write the resulting values to. There may be additional things to keep in mind when overwriting the existing values.

What I would do is this:

def main():
    import arcpy

    # access data and define fields to use
    mxd = arcpy.mapping.MapDocument("CURRENT")
    table = arcpy.mapping.ListTableViews(mxd)[0]
    fld_in = "name"
    fld_out = "another field"  # I added a second field for output values

    # create a list of values from the field using list comprehensions
    lst_values = [r[0] for r in arcpy.da.SearchCursor(table, [fld_in])]

    # use a with statement when working with cursors
    # use the da module for faster access
    # use UpdateCursor to update values
    index = 0
    with arcpy.da.UpdateCursor(table, (fld_in, fld_out)) as curs:
        for row in curs:
            name = row[0]
            if index == len(lst_values) - 1:
                # last row, no 'next' value
                next_name = None
            else:
                next_name = lst_values[index + 1]

            # compare the values
            if name == next_name:
                # they are the same, do something with this
                out_value = 'some value you define'
            else:
                # they are different
                out_value = 'some other value you define'

            # update the values
            curs.updateRow((name, out_value, ))

            # increment index
            index += 1


if __name__ == '__main__':
    main()
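
The comparison step on its own can be tried without arcpy. This is a rough sketch, assuming the field values have already been read into a list as above; flag_matches_next is a hypothetical helper, not part of arcpy. It pads the shifted list with a unique sentinel rather than None, so that a null in the last row is not reported as matching a non-existent next row.

```python
def flag_matches_next(values):
    """Return one boolean per value: True when it equals the value after it."""
    # Shift the list by one and pad with a unique sentinel so the last value
    # never matches (None would wrongly match a null in the last row)
    no_next = object()
    following = values[1:] + [no_next]
    return [current == nxt for current, nxt in zip(values, following)]


# Hypothetical field values, as a list comprehension over a cursor would give
lst_values = ["A", "A", "B", None]
print(flag_matches_next(lst_values))
```

The resulting list has one entry per row, so it could be consumed by index inside the update loop in the same way as lst_values.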

JoshuaBixby · ‎07-28-2016

As pointed out, cursors can't predict the future, i.e., they have no idea what is in the next row until they visit the row, which then makes it the current row and not the next row. Looking ahead with cursors involves iterating over the data set twice, either serially or in parallel.

Xander Bakker's code is serial, i.e., it loops through the data once, completely, to build a local lookup structure for the second loop. The approach is straightforward and will work in a majority, possibly vast majority, of cases. When working with very large data sets, especially with 32-bit Python, the size of the local copy of data may become an issue.

An alternate approach that doesn't involve making local copies of data is to use two cursors in parallel where one increments ahead of the other as both go through the data set. This eliminates the local copy of data, but it might create an I/O bottleneck with the back-end data provider depending on the data set and conditions. Adapting from Xander's example:

# access data and define fields to use
mxd = arcpy.mapping.MapDocument("CURRENT")
table = arcpy.mapping.ListTableViews(mxd)[0]
fld_in = "name"
fld_out = "another field"  # I added a second field for output values

# use a with statement when working with cursors
# use the da module for faster access

# use UpdateCursor to update values
with arcpy.da.UpdateCursor(table, (fld_in, fld_out)) as u_cur:

    # use Search Cursor in parallel to get "next" value
    with arcpy.da.SearchCursor(table, fld_in) as s_cur:
        next(s_cur)
        for s_fld_in, in s_cur:
            u_fld_in, u_fld_out = next(u_cur)

            # compare values and do one thing or another
            if s_fld_in == u_fld_in:
                u_fld_out = "some value"  # placeholder
            else:
                u_fld_out = "different value"  # placeholder

            # update the values
            u_cur.updateRow((u_fld_in, u_fld_out))

        # note: u_cur never reaches the last row, since it has no "next" row
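
The same two-iterator pattern can be tried outside ArcGIS. This is only a sketch under stated assumptions: compare_with_lead and make_iterator are hypothetical names, and a plain list stands in for the table, with each call to make_iterator playing the role of opening another cursor on it. Unlike the loop above, it also handles the final row explicitly.

```python
def compare_with_lead(make_iterator):
    """Walk a data source with two iterators, one a row ahead of the other.

    `make_iterator` is a hypothetical zero-argument callable that opens a
    fresh iterator over the same rows in the same order (the role played by
    opening a second cursor on the table).
    """
    trailing = make_iterator()
    leading = make_iterator()
    next(leading, None)  # put the leading iterator one row ahead

    results = []
    for next_value in leading:
        value = next(trailing)
        results.append((value, value == next_value))
    # The trailing iterator still has the last row left; nothing follows it
    for value in trailing:
        results.append((value, False))
    return results


rows = ["A", "A", "B", "C", "C"]
print(compare_with_lead(lambda: iter(rows)))
```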

With multiple parallel cursors, it is a good idea to include an ORDER BY clause in each cursor definition (via the sql_clause argument) to ensure the two cursors iterate over the data set in the same order.
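
As a rough illustration, the arcpy.da cursors take a sql_clause argument in the form of a (prefix, postfix) tuple, and an ORDER BY expression goes in the postfix slot. The helper below is hypothetical and only builds that tuple. The cursor calls are left as comments because they need arcpy and, as far as I know, a database-backed table, since ORDER BY is not supported for sources such as dBASE tables.

```python
def order_by_clause(*field_names):
    """Build the (prefix, postfix) tuple that arcpy.da cursors accept as sql_clause."""
    return (None, "ORDER BY {}".format(", ".join(field_names)))


clause = order_by_clause("name")
print(clause)

# Assumed usage, with the same clause passed to both cursors
# (needs arcpy and a database-backed table, so it is left commented out):
# with arcpy.da.UpdateCursor(table, (fld_in, fld_out), sql_clause=clause) as u_cur:
#     with arcpy.da.SearchCursor(table, fld_in, sql_clause=clause) as s_cur:
#         ...
```

Because the values being compared contain duplicates, adding a unique field as a second sort key (for example the table's object ID field, whatever it is called in your data) may help keep the order of tied rows identical in both cursors.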

Overall, I recommend structuring the data so you can look backwards instead of looking forwards. Looking backwards involves a single cursor and single iteration, which is simpler to code and performs better.
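
A minimal sketch of the look-back pattern, assuming a plain list in place of a cursor. flag_matches_previous is a hypothetical helper, and a sentinel object replaces the counter used earlier in the thread to detect the first row.

```python
def flag_matches_previous(values):
    """Return one boolean per value: True when it equals the value before it."""
    unset = object()  # sentinel meaning "no previous row seen yet"
    previous = unset
    flags = []
    for value in values:
        flags.append(previous is not unset and value == previous)
        previous = value  # remember this value for the next row
    return flags


# Hypothetical values from the "name" field
print(flag_matches_previous(["A", "A", "B", "B"]))
```

With an update cursor, the flag for each row could be written in the same pass, because the row being updated is always the current one.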