I have an original table with values like this:
Line ID Year Code
0001 2002 300
0001 2003 300
0001 2004 500
I want to find the year the code changes for each line and add that into a table that only shows the year the lines that had code changes. Example:
Line ID Year_Change
0001 2004
I was thinking of creating 2 dictionary’s {LID: Year} and {LID: Code} and inserting the values into a new table like this:
Line ID Code2002 Code2003 Code2004
0001 300 300 500
Then using if statements, such as
If row[1] = row[2]
Next
Elif: row[2] = row[3]
next
Else:
Create dictionary entry {LID: YearChange} which will be inserted into a table.
I think I am in a coding rut and need fresh ideas! I’d like to work with the original table. Anyone know of another way to create a loop comparing a previous record based on a 2 variable (LID and Year) expression?
Solved! Go to Solution.
How about something like the code below?
# input table (must be geodatabase to allow sorting)
tbl = r'C:\TEMP\tdb.gdb\sorting'
# add column to record the change
arcpy.management.AddField(tbl, "changed", "SHORT")
orderby = 'ORDER BY line_id, theyear, thecode'
last_id, last_code = None, None
with arcpy.da.UpdateCursor(tbl, ['line_id', 'theyear', 'thecode', 'changed'], sql_clause=(None, orderby)) as uc:
for row in uc:
this_id, this_year, this_code, changed = row
if this_id == last_id:
if last_code != this_code:
newrow = [this_id, this_year, this_code, 1] # change!
uc.updateRow(newrow) #save
last_id, last_code = this_id, this_code
Hi Amy,
I wonder whether the Summary Statistics tool can help in your case. Probably not but it sounds like something there should be a tool for.
Anyhow, in Python, I would approach this with a search cursor (or other cursor depending on how you want to store the result). The key hint is that you can sort rows before you start searching them using the sql_clause parameter of arcpy.da.SearchCursor.
In principle, I would first initialize variables to remember the last_line_id, last_code. Then, call a search cursor on
the initial table, order records by Line ID Ascending, Year Ascending, and Code Ascending.
Loop through each row and check whether your current_line_id is the same as last_line_id and if current_code is different from last_code.
If the above condition it true, you know the code has changed so you have to remember that year for current_line_id.
Don't forget to set last_line_id = current_line_id and last_code = current_code by the end of each iteration. If current_line_id is different from last_line_id, you will want to re-initialize the last_code variable and make sure you don't count it as a fake change of code for the previous line.
We used this approach to solve a similar problem some time ago. Maybe you can adapt the code from there: Python to select younger of two polygons with similar attributes
If you want some code sample specifically for your situation, please explain exactly how you want the result to be structured.
Thanks Filip, great idea to delete the object ID's. I modified the code link you provided and got everything to work, one minor problem.....the first record does not make it into the delete object ID list (it has nothing to compare itself to?). How can I fix this?
import arcpy tbl = r"U:\Projects\Change.gdb\qryTable" flds = ['OBJECTID', 'PLIDS', 'CYEAR', 'CNUMBER'] orderby = 'ORDER BY PLIDS, CYEAR' objdel = [] last_id = None with arcpy.da.SearchCursor(tbl, flds, "", "","", sql_clause=(None, orderby)) as sc: for row in sc: this_id = row[3] if this_id == last_id: this_object_id = row[0] objdel.append(this_object_id) last_id = this_id print objdel del row, sc for l in objdel: expression = "OBJECTID = %s"%(l) uc = arcpy.da.UpdateCursor(tbl, "*", where_clause=expression) for u in uc: uc.deleteRow() del l
Just have to use a row counter, so that on the first low, set Last_id = This_Id, something like...
import arcpy
Phew, still can't get this python highlighting right....
So the indentations are all messed up. And when I try to edit it, I see a bunch of overlapping lines.
Sorry.
I should have told you I have multiple line id's (the table I posted was just a snippet) so just counting the first row will work only for the first line. Right now I am trying to figure out a way to select for only 1 line ID at a time and use the code above. No such luck yet, not sure that is the best way to go
How about something like the code below?
# input table (must be geodatabase to allow sorting)
tbl = r'C:\TEMP\tdb.gdb\sorting'
# add column to record the change
arcpy.management.AddField(tbl, "changed", "SHORT")
orderby = 'ORDER BY line_id, theyear, thecode'
last_id, last_code = None, None
with arcpy.da.UpdateCursor(tbl, ['line_id', 'theyear', 'thecode', 'changed'], sql_clause=(None, orderby)) as uc:
for row in uc:
this_id, this_year, this_code, changed = row
if this_id == last_id:
if last_code != this_code:
newrow = [this_id, this_year, this_code, 1] # change!
uc.updateRow(newrow) #save
last_id, last_code = this_id, this_code
Brilliant! Thanks