Using python to calculate attribute of point feature based on max of related record attribute

KathleenHoenke · ‎03-15-2023

I have a point feature class with a related table (that has a relationship class, a one to many relationship). I want to calculate a field in the point feature based on the mimumum record (that shares the joinkey) in an attribute of the related table.

For example, I want it to be: if CrossingTypeID in main fc == "culvert", calculate OutletDrop in main fc with the minimum value of the OutletDrop_feet field in the related table associated with the matching ParentglobalID_txt.

How do I go about this in python?

Thank you!

KimOllivier · ‎03-15-2023

I am afraid that relationship classes are not supported in arcpy. You will have to create your own equivalent, which as it turns out is much better and faster anyway.

1. Read in the related tables using a SearchCursor inside a list comprehension to get an in_memory list of records with the key as the one of the fields.

1a. Convert the list to a Pandas dataframe.

2. Use Pandas to find the minimum or maximum values with a groupby of the foreign key

3. Make a dictionary of the results of the statistics

4. Run an UpdateCursor on your featureclass and use the dictionary of min/max values to update the featureclass.

This will be lightning fast, reliable and easy to understand. You might add some error trapping such as use .get(value,none) instead of a lookup to avoid missing values. See example.

KathleenHoenke · ‎03-16-2023

Honestly I don't know where to start, so an example would be wonderful. Thank you!

KimOllivier · ‎03-17-2023

Here is an example of finding the latest visit record (using Pandas) and then transferring that to the parent location table.

# current_status.py
# from Visits_Table put back status, edit date, difficulty on WeedLocations
# use Pandas for ease, speed, simplicity
# 19 Sept 2022 latest schema, different gdb
# 12 Oct 2022 change to DateCheck from EditDate but keep EditDate as last record
import sys
import arcpy
import pandas as pd
import collections
from datetime import datetime
try:
    gdb = sys.argv[1]
except IndexError:
    disk = sys.argv[0][0]
    gdb = '{}:/project/econet/source/cams_weed.gdb'.format(disk)
start = datetime.now()
if not arcpy.Exists(gdb):
    raise IOError
arcpy.env.workspace = gdb
arcpy.env.overwriteOutput = True
arcpy.AddMessage(gdb)
debug = True
# two tables in gdb
weeds = 'WeedLocations'
visits = 'Visits_Table'

# basic attributes to be transferred from Visits to WeedLocations, not validated yet
visit_to_weed = {
    'Guid_visits': 'GlobalID',                          ## 0 foreign key -> primary key
    'DateCheck':'DateVisitMadeFromLastVisit',            ## 1 for latest date Note not the same as EditDate
    "WeedVisitStatus":'StatusFromLastVisit',            ## 2 as inspected
    'DifficultyChild':'DifficultyFromLastVisit',        ## 3 as inspected
    'VisitStage':'LatestVisitStage',                    ## 4 as inspected
    'Area':'LatestArea',                                ## 5 as inspected
    'DateForReturnVisit':'DateForNextVisitFromLastVisit', # 6 calculated
    'EditDate':'EditDate'                               ## dummy for pandas to find latest record
    }
in_flds = list(visit_to_weed.keys())
out_flds = list(visit_to_weed.values())
filter = '' #"""EditDate > date '{}'""".format('2022-07-01')
vdate =[row for row in arcpy.da.SearchCursor(visits,in_flds,filter)]
print('vdate:',len(vdate))
# put in a pandas dataframe and process woohoo
df = pd.DataFrame(vdate,columns=in_flds)
# find the record with max edit date by visit and keep the other details all in one line!
idx = df.groupby(['Guid_visits'])['EditDate'].transform(max) == df['EditDate']
dVisit = df.set_index('Guid_visits').T.to_dict('list')
# Count visits for each location
vguid = [row[0] for row in arcpy.da.SearchCursor(visits,['Guid_visits'], "Guid_visits is not NULL")]
# dict of counts by GlobalID for updating
vguid_counts = collections.Counter(vguid)
# update weeds with visit count and latest details
with arcpy.da.UpdateCursor(weeds, ['VisitCount'] + out_flds) as cur:
    n = 0
    for row in cur:
        try:
            row[1] = vguid_counts[row[0]]
            if dVisit.get(row[0],None):
                row[2] = dVisit.get(row[0],None)[0]
                row[3] = dVisit.get(row[0],None)[1]
                row[4] = dVisit.get(row[0],None)[2]
                row[5] = dVisit.get(row[0],None)[3]
                row[6] = dVisit.get(row[0],None)[4]
            cur.updateRow(row)
            n+=1
        except Exception as e:
            arcpy.AddMessage(row)
            arcpy.AddMessage(e)
print("Well Done, {} records updated in {}".format(n, datetime.now() - start))