Python Notebook to update attributes to a feature layer

SamanthaAPCC — Thu, 02 May 2024 16:13:49 GMT

I am using AGOL Python Notebook to write to a hosted feature layer using attributes from two Survey123 surveys. Eventually I will need to schedule this code to run daily. The code seems to be working and is printing my final message, but it is not writing to the attribute table of my cyanoponds layer. Am I not using the correct python language? I am very novice at python but this is our only option for the type of updates we need. I am taking data from a survey and based on a unique ID (CCC_GIS_ID) I am looking for the most recent sample as well as the one with the highest cyano_status ranking. It also appears to be skipping CCC_GIS_IDs when there is more than 1 sample at the pond location, which means I'm not incorporating lines 46-57 correctly. I will also need to add a component that changes the status to "No Updated Information" if the most recent sample is older than 30 days, which I haven't quite figured out how to add in yet.

Also, when this code is running daily it will need to overwrite the table with the new information.

I took elements from this article in order to come up with the basis of this code.

from arcgis.gis import GIS from arcgis.features import FeatureLayer import pandas as pd from arcgis.features import FeatureCollection from datetime import datetime, timedelta gis = GIS("home") def update_CyanoData(): one_day = datetime.today() - timedelta(days=1) string_day = one_day.strftime('%Y-%m-%d') where_query = f"fluorometry_date >= DATE '{string_day}'" where_querymf = f"microscopy_date >= DATE '{string_day}'" days_valid = 30 #How do I incorporate this so that the Cyano_Status field changes to 'No Updated Information' if all samples for that # pond are older than 30 days? cyanoponds = gis.content.get('a13f1b2e80ff4580a27620fc9d47f757') cyanoponds_lyr = cyanoponds.layers[0] cyanoponds_fset = cyanoponds_lyr.query() cyanoponds_features = cyanoponds_fset.features cyanoponds_sdf = cyanoponds_fset.sdf fluorsamples = gis.content.get('db342bf34b1742c0abb96d6ebff2f396') #Public facing View of the Fluorometry survey, final_review = yes fluorsamples_lyr = fluorsamples.layers[0] fluorsamples_fset = fluorsamples_lyr.query(where=where_query) fluorsamples_features = fluorsamples_fset.features fluorsamples_sdf = fluorsamples_fset.sdf #I will maybe need to use this survey to populate dominant genus, scum notes & pond temperatures if I can't do that with arcade and UniqueIDs in pop-ups microfieldsamples = gis.content.get('dc204479b5bf4f34a2efed96d0b2a340') #View of Joined Microscopy & Field samples microfieldsamples_lyr = microfieldsamples.layers[0] microfieldsamples_fset = microfieldsamples_lyr.query(where=where_querymf) microfieldsamples_features = microfieldsamples_fset.features microfieldsamples_sdf = microfieldsamples_fset.sdf df = fluorsamples_sdf.sort_values(by=['fluorometry_date'], ascending=False) # Add cyano status rating to enable easy selection of the worst status for each pond later. This is relevant for cases where there # is more than one sampling location at a single pond. There are only a few ponds like this, but it won't cause any issues if # we add the rating for all ponds. df['cyano_rate'] = 0 df.loc[df.cyano_risk_tier=="Acceptable", "cyano_rate"] = 1 df.loc[df.cyano_risk_tier=="Potential for Concern", "cyano_rate"] = 2 df.loc[df.cyano_risk_tier=="Use Restriction Warranted", "cyano_rate"] = 3 # SELECT ONE MOST RECENT RECORD FOR EACH POND. First, create a function (get_maxdate) to get the most recent record from a group. def get_maxdate(group): return group[group.fluorometry_date==group.fluorometry_date.max()] # Group by pond ID and apply get_maxdate to get most recent record(s) for each pond. recent_pre = df.groupby("CCC_GIS_ID").apply(get_maxdate).reset_index(drop=True) # For ponds with more than one sample location, we'll assign them the worst worst status found at the pond that day. # First, create a function (get_worststatus) that gets the worst cyano status (based on cyano_rate) from a group. def get_worststatus(group): return group[group.cyano_rate==group.cyano_rate.max()] # Then, group by pond ID and apply get_worststatus function. recent_pre = df.groupby("CCC_GIS_ID").apply(get_worststatus).reset_index(drop=True) # Among the ponds with multiple sample locations, if a pond has the same cyano status at more than one location, it will have # subsetted more than one record for the pond. This code simply takes the first listed one. recent_pre.drop_duplicates(subset = ['CCC_GIS_ID'], keep = 'first', inplace = True) df["date_str"] = pd.to_datetime(df['fluorometry_date']).dt.strftime('%m/%d/%Y') #### FIND THE SECOND MOST RECENT RECORD FOR EACH POND. Create a new data frame that excludes the most recent records for # each pond. Apply the above code to this subset, thereby finding the SECOND most recent record for each pond and collecting # them in a data frame (secondmostrecent). excludingrecent= df[~df.sampleID.isin(recent_pre.sampleID)] secmostrecent = excludingrecent.groupby("CCC_GIS_ID").apply(get_maxdate).reset_index(drop=True) if secmostrecent.empty: secmostrecent=excludingrecent secmostrecent = secmostrecent.groupby("CCC_GIS_ID").apply(get_worststatus).reset_index(drop=True) if secmostrecent.empty: secmostrecent=excludingrecent secmostrecent.drop_duplicates(subset = ['CCC_GIS_ID'], keep = 'first', inplace = True) # Create a field that lists the date and cyano_status from the second most recent record secmostrecent["secprior_sample"] = secmostrecent["date_str"] + ": " + secmostrecent["cyano_risk_tier"] secmostrecent = secmostrecent.loc[:,["CCC_GIS_ID","secprior_sample","sampleID"]] secmostrecent.rename(columns={"sampleID":"secsampleID"},inplace=True) #### FIND THE THIRD MOST RECENT RECORD FOR EACH POND. Create a new data frame that excludes the most recent and 2nd most recent # records for each pond. Apply the above code to this subset, thereby finding the THIRD most recent record for each pond and # collecting them in a data frame (thirdmost recent). excluding2recent = excludingrecent[~excludingrecent.sampleID.isin(secmostrecent.secsampleID)] thirdmostrecent = excluding2recent.groupby("CCC_GIS_ID").apply(get_maxdate).reset_index(drop=True) if thirdmostrecent.empty: thirdmostrecent=excluding2recent thirdmostrecent = thirdmostrecent.groupby("CCC_GIS_ID").apply(get_worststatus).reset_index(drop=True) if thirdmostrecent.empty: thirdmostrecent=excluding2recent thirdmostrecent.drop_duplicates(subset = ['CCC_GIS_ID'], keep = 'first', inplace = True) # Create a field that lists the date and cyano_status from the third most recent record thirdmostrecent["thirdprior_sample"] = thirdmostrecent["date_str"] + ": " + thirdmostrecent["cyano_risk_tier"] thirdmostrecent = thirdmostrecent.loc[:,["CCC_GIS_ID","thirdprior_sample","sampleID"]] thirdmostrecent.rename(columns={"sampleID":"thirdsampleID"},inplace=True) #### CREATE PRIOR ACTIVITY FIELD # Merge data frames recent_pre, secmostrecent, and thirdmostrecent together into "recent." Create field called prior_activity that # lists sample dates and statuses from the second and third most recent sampling dates for a pond. recent = pd.merge(recent_pre,secmostrecent,on="CCC_GIS_ID",how="outer"); recent = pd.merge(recent,thirdmostrecent,on="CCC_GIS_ID",how="outer") recent["prior_activity"] = recent["secprior_sample"] + ", " + recent["thirdprior_sample"].fillna("") recent["prior_activity"] = recent["prior_activity"].str.strip().str.rstrip(',') # <-- remove hanging comma and space at end, in cases where there is only 2nd most recent recent = recent.sort_values(by="fluorometry_date", ascending=False) overlap_rows = pd.merge(left = cyanoponds_sdf, right = recent, how = 'inner', on='CCC_GIS_ID' ) cyanopond_features = cyanoponds_fset.features fluorsamples_updates = fluorsamples_fset.features fluorsamples_updates.reverse() def update(cyanopond, fluorometry): for CCC_GIS_ID in overlap_rows['CCC_GIS_ID']: try: cyanopond_feature = [f for f in cyanopond_features if f.attributes['CCC_GIS_ID']==CCC_GIS_ID][0] fluorsample_feature = [f for f in fluorsamples_features if f.attributes['CCC_GIS_ID']==CCC_GIS_ID][0] cyanopond_feature.attributes['Sample_Date'] = fluorsample_feature.attributes['fluorometry_date'] cyanopond_feature.attributes['Cyano_Status']= fluorsample_feature.attributes['cyano_risk_tier'] #cyanopond_feature.attributes['Recent_Activity']= I need this to populate with the 'prior_activity' column that we wrote to the 'recent' data frame cyanoponds_lyr.edit_features(updates=[cyanopond_feature]) print(f"Updated {cyanopond_feature.attributes['CCC_GIS_ID']} cyano status to {cyanopond_feature.attributes['Cyano_Status']}",flush=True) except Exception as e: # Print exception details for debugging. print(f"Error updating feature with CCC_GIS_ID {CCC_GIS_ID}: {e}") update(cyanoponds_features, fluorsamples_updates) update_CyanoData()

Re: Python Notebook to update attributes to a feature layer

AngelaSchirck — Fri, 03 May 2024 11:58:23 GMT

Just FYI you might get more response on this in the Developer's (Python) community.

Have you tried using arcpy api (Update cursor)?

Re: Python Notebook to update attributes to a feature layer

SamanthaAPCC — Fri, 03 May 2024 13:19:45 GMT

Thanks Angela, I'll try posting there. I'm not a developer and my experience with python is limited so I didn't even think to post there. I'm trying to combine two codes (one of which is from the article I posted) so I'll look into the Update cursor. I was trying to avoid python altogether for this project but the abilities of using Arcade for symbology is too limited for what I need to accomplish. Unfortunately this is my only option!

topic Re: Python Notebook to update attributes to a feature layer in ArcGIS Online Questions

Python Notebook to update attributes to a feature layer

Re: Python Notebook to update attributes to a feature layer

Re: Python Notebook to update attributes to a feature layer