Let me preface by saying that my experience with python is very limited and I am using elements from 2 codes to try to make this work. I am using AGOL Python Notebook to write to a hosted feature layer using attributes from two Survey123 surveys. Eventually I will need to schedule this code to run daily. The code seems to be working and is printing my final message, but it is not writing to the attribute table of my cyanoponds layer. Am I not using the correct python language? I am very novice at python but this is our only option for the type of updates we need. I am taking data from a survey and based on a unique ID (CCC_GIS_ID) I am looking for the most recent sample as well as the one with the highest cyano_status ranking. It also appears to be skipping CCC_GIS_IDs when there is more than 1 sample at the pond location, which means I'm not incorporating lines 46-57 correctly. I will also need to add a component that changes the status to "No Updated Information" if the most recent sample is older than 30 days, which I haven't quite figured out how to add in yet.
Also, when this code is running daily it will need to overwrite the table with the new information.
I took elements from this article in order to come up with the basis of this code. The rest I wrote or used from a code that was handed down to me for this project back when it used to be run on a local machine. Any help is greatly appreciated!!
from arcgis.gis import GIS
from arcgis.features import FeatureLayer
import pandas as pd
from arcgis.features import FeatureCollection
from datetime import datetime, timedelta
gis = GIS("home")
def update_CyanoData():
one_day = datetime.today() - timedelta(days=1)
string_day = one_day.strftime('%Y-%m-%d')
where_query = f"fluorometry_date >= DATE '{string_day}'"
where_querymf = f"microscopy_date >= DATE '{string_day}'"
days_valid = 30 #How do I incorporate this so that the Cyano_Status field changes to 'No Updated Information' if all samples for that
# pond are older than 30 days?
cyanoponds = gis.content.get('a13f1b2e80ff4580a27620fc9d47f757')
cyanoponds_lyr = cyanoponds.layers[0]
cyanoponds_fset = cyanoponds_lyr.query()
cyanoponds_features = cyanoponds_fset.features
cyanoponds_sdf = cyanoponds_fset.sdf
fluorsamples = gis.content.get('db342bf34b1742c0abb96d6ebff2f396') #Public facing View of the Fluorometry survey, final_review = yes
fluorsamples_lyr = fluorsamples.layers[0]
fluorsamples_fset = fluorsamples_lyr.query(where=where_query)
fluorsamples_features = fluorsamples_fset.features
fluorsamples_sdf = fluorsamples_fset.sdf
#I will maybe need to use this survey to populate dominant genus, scum notes & pond temperatures if I can't do that with arcade and UniqueIDs in pop-ups
microfieldsamples = gis.content.get('dc204479b5bf4f34a2efed96d0b2a340') #View of Joined Microscopy & Field samples
microfieldsamples_lyr = microfieldsamples.layers[0]
microfieldsamples_fset = microfieldsamples_lyr.query(where=where_querymf)
microfieldsamples_features = microfieldsamples_fset.features
microfieldsamples_sdf = microfieldsamples_fset.sdf
df = fluorsamples_sdf.sort_values(by=['fluorometry_date'], ascending=False)
# Add cyano status rating to enable easy selection of the worst status for each pond later. This is relevant for cases where there
# is more than one sampling location at a single pond. There are only a few ponds like this, but it won't cause any issues if
# we add the rating for all ponds.
df['cyano_rate'] = 0
df.loc[df.cyano_risk_tier=="Acceptable", "cyano_rate"] = 1
df.loc[df.cyano_risk_tier=="Potential for Concern", "cyano_rate"] = 2
df.loc[df.cyano_risk_tier=="Use Restriction Warranted", "cyano_rate"] = 3
# SELECT ONE MOST RECENT RECORD FOR EACH POND. First, create a function (get_maxdate) to get the most recent record from a group.
def get_maxdate(group): return group[group.fluorometry_date==group.fluorometry_date.max()]
# Group by pond ID and apply get_maxdate to get most recent record(s) for each pond.
recent_pre = df.groupby("CCC_GIS_ID").apply(get_maxdate).reset_index(drop=True)
# For ponds with more than one sample location, we'll assign them the worst worst status found at the pond that day.
# First, create a function (get_worststatus) that gets the worst cyano status (based on cyano_rate) from a group.
def get_worststatus(group): return group[group.cyano_rate==group.cyano_rate.max()]
# Then, group by pond ID and apply get_worststatus function.
recent_pre = df.groupby("CCC_GIS_ID").apply(get_worststatus).reset_index(drop=True)
# Among the ponds with multiple sample locations, if a pond has the same cyano status at more than one location, it will have
# subsetted more than one record for the pond. This code simply takes the first listed one.
recent_pre.drop_duplicates(subset = ['CCC_GIS_ID'], keep = 'first', inplace = True)
df["date_str"] = pd.to_datetime(df['fluorometry_date']).dt.strftime('%m/%d/%Y')
#### FIND THE SECOND MOST RECENT RECORD FOR EACH POND. Create a new data frame that excludes the most recent records for
# each pond. Apply the above code to this subset, thereby finding the SECOND most recent record for each pond and collecting
# them in a data frame (secondmostrecent).
excludingrecent= df[~df.sampleID.isin(recent_pre.sampleID)]
secmostrecent = excludingrecent.groupby("CCC_GIS_ID").apply(get_maxdate).reset_index(drop=True)
if secmostrecent.empty:
secmostrecent=excludingrecent
secmostrecent = secmostrecent.groupby("CCC_GIS_ID").apply(get_worststatus).reset_index(drop=True)
if secmostrecent.empty:
secmostrecent=excludingrecent
secmostrecent.drop_duplicates(subset = ['CCC_GIS_ID'], keep = 'first', inplace = True)
# Create a field that lists the date and cyano_status from the second most recent record
secmostrecent["secprior_sample"] = secmostrecent["date_str"] + ": " + secmostrecent["cyano_risk_tier"]
secmostrecent = secmostrecent.loc[:,["CCC_GIS_ID","secprior_sample","sampleID"]]
secmostrecent.rename(columns={"sampleID":"secsampleID"},inplace=True)
#### FIND THE THIRD MOST RECENT RECORD FOR EACH POND. Create a new data frame that excludes the most recent and 2nd most recent
# records for each pond. Apply the above code to this subset, thereby finding the THIRD most recent record for each pond and
# collecting them in a data frame (thirdmost recent).
excluding2recent = excludingrecent[~excludingrecent.sampleID.isin(secmostrecent.secsampleID)]
thirdmostrecent = excluding2recent.groupby("CCC_GIS_ID").apply(get_maxdate).reset_index(drop=True)
if thirdmostrecent.empty:
thirdmostrecent=excluding2recent
thirdmostrecent = thirdmostrecent.groupby("CCC_GIS_ID").apply(get_worststatus).reset_index(drop=True)
if thirdmostrecent.empty:
thirdmostrecent=excluding2recent
thirdmostrecent.drop_duplicates(subset = ['CCC_GIS_ID'], keep = 'first', inplace = True)
# Create a field that lists the date and cyano_status from the third most recent record
thirdmostrecent["thirdprior_sample"] = thirdmostrecent["date_str"] + ": " + thirdmostrecent["cyano_risk_tier"]
thirdmostrecent = thirdmostrecent.loc[:,["CCC_GIS_ID","thirdprior_sample","sampleID"]]
thirdmostrecent.rename(columns={"sampleID":"thirdsampleID"},inplace=True)
#### CREATE PRIOR ACTIVITY FIELD
# Merge data frames recent_pre, secmostrecent, and thirdmostrecent together into "recent." Create field called prior_activity that
# lists sample dates and statuses from the second and third most recent sampling dates for a pond.
recent = pd.merge(recent_pre,secmostrecent,on="CCC_GIS_ID",how="outer"); recent = pd.merge(recent,thirdmostrecent,on="CCC_GIS_ID",how="outer")
recent["prior_activity"] = recent["secprior_sample"] + ", " + recent["thirdprior_sample"].fillna("")
recent["prior_activity"] = recent["prior_activity"].str.strip().str.rstrip(',') # <-- remove hanging comma and space at end, in cases where there is only 2nd most recent
recent = recent.sort_values(by="fluorometry_date", ascending=False)
overlap_rows = pd.merge(left = cyanoponds_sdf, right = recent, how = 'inner', on='CCC_GIS_ID' )
cyanopond_features = cyanoponds_fset.features
fluorsamples_updates = fluorsamples_fset.features
fluorsamples_updates.reverse()
def update(cyanopond, fluorometry):
for CCC_GIS_ID in overlap_rows['CCC_GIS_ID']:
try:
cyanopond_feature = [f for f in cyanopond_features if f.attributes['CCC_GIS_ID']==CCC_GIS_ID][0]
fluorsample_feature = [f for f in fluorsamples_features if f.attributes['CCC_GIS_ID']==CCC_GIS_ID][0]
cyanopond_feature.attributes['Sample_Date'] = fluorsample_feature.attributes['fluorometry_date']
cyanopond_feature.attributes['Cyano_Status']= fluorsample_feature.attributes['cyano_risk_tier']
#cyanopond_feature.attributes['Recent_Activity']= I need this to populate with the 'prior_activity' column that we wrote to the 'recent' data frame
cyanopond_feature.attributes['Town_action']= fluorsample_feature.attributes['town_actions']
cyanopond_feature.attrbutes['Notes']=fluorsample_feature.attributes['town_action_notes']
#cyanopond_feature.attributes['Dominance']= eventually will need these three fields populated by micrcofieldsamples OR use UniqueID to populate in the pop-up
#cyanopond_feature.attributes['Scum']=
#cyanopond_feature.attirubtes['Water_Temp']=
cyanoponds_lyr.edit_features(updates=[cyanopond_feature])
print(f"Updated {cyanopond_feature.attributes['CCC_GIS_ID']} cyano status to {cyanopond_feature.attributes['Cyano_Status']}",flush=True)
except Exception as e:
# Print exception details for debugging.
print(f"Error updating feature with CCC_GIS_ID {CCC_GIS_ID}: {e}")
update(cyanoponds_features, fluorsamples_updates)
update_CyanoData()
Thanks Joseph! I tried this but now I'm getting this error:
Error updating feature with CCC_GIS_ID OR-176: Circular reference detected Error updating feature with CCC_GIS_ID OR-153: Circular reference detected Error updating feature with CCC_GIS_ID BA-1171: Circular reference detected
I'm not sure if it has to do with cyanopond_features vs cyanopond_feature vs cyanoponds_features. In your article I noticed that the def update used a different name ('basins') that didn't appear to be used anywhere else. And you use catch_basin_features and catch_basins_features, so I tried to mimic this in my code. Just an idea, I'm not sure exaclty what that error means.
Hi Samantha, can you remove the try-except so it just errors out with a traceback, so we can see what line it's tripping up on?
Regarding the other post, I noticed that error. I'm reaching to out Esri to ask for permission to edit it (seems like you can't edit old posts).
Oh okay, good to know about that post!
Here is the full error:
ValueError Traceback (most recent call last) /tmp/ipykernel_19/2573390661.py in <cell line: 125>() 123 update(cyanoponds_features, overlap_rows, cyanoponds_lyr) 124 --> 125 update_CyanoData() /tmp/ipykernel_19/2573390661.py in update_CyanoData() 121 # Print exception details for debugging. 122 print(f"Error updating feature with CCC_GIS_ID {CCC_GIS_ID}: {e}") --> 123 update(cyanoponds_features, overlap_rows, cyanoponds_lyr) 124 125 update_CyanoData() /tmp/ipykernel_19/2573390661.py in update(cyanoponds_features, overlap_rows, cyanoponds_lyr) 114 #cyanopond_feature.attributes['Scum']= 115 #cyanopond_feature.attributes['Water_Temp']= --> 116 resp = cyanoponds_lyr.edit_features(updates=[cyanopond_feature]) 117 print (f"Attempted to update Cyano_Status for {CCC_GIS_ID} to {row['cyano_risk_tier']}. Response: {resp}") 118 #print(f"Updated {cyanopond_feature.attributes['CCC_GIS_ID']} cyano status to {cyanopond_feature.attributes['Cyano_Status']}",flush=True) /opt/conda/lib/python3.9/site-packages/arcgis/features/layer.py in edit_features(self, adds, updates, deletes, gdb_version, use_global_ids, rollback_on_failure, return_edit_moment, attachments, true_curve_client, session_id, use_previous_moment, datum_transformation, future) 3397 ) 3398 elif isinstance(updates[0], Feature): -> 3399 params["updates"] = json.dumps( 3400 [f.as_dict for f in updates], default=_date_handler 3401 ) /opt/conda/lib/python3.9/json/__init__.py in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw) 232 if cls is None: 233 cls = JSONEncoder --> 234 return cls( 235 skipkeys=skipkeys, ensure_ascii=ensure_ascii, 236 check_circular=check_circular, allow_nan=allow_nan, indent=indent, /opt/conda/lib/python3.9/json/encoder.py in encode(self, o) 197 # exceptions aren't as detailed. The list call should be roughly 198 # equivalent to the PySequence_Fast that ''.join() would do. --> 199 chunks = self.iterencode(o, _one_shot=True) 200 if not isinstance(chunks, (list, tuple)): 201 chunks = list(chunks) /opt/conda/lib/python3.9/json/encoder.py in iterencode(self, o, _one_shot) 255 self.key_separator, self.item_separator, self.sort_keys, 256 self.skipkeys, _one_shot) --> 257 return _iterencode(o, 0) 258 259 def _make_iterencode(markers, _default, _encoder, _indent, _floatstr, ValueError: Circular reference detected
Try this:
import copy
def update_cyano_status(cyanoponds_features, overlap_rows, cyanoponds_lyr):
for CCC_GIS_ID in overlap_rows['CCC_GIS_ID']:
cyanopond_feature = [f for f in cyanoponds_features if f.attributes['CCC_GIS_ID'] == CCC_GIS_ID][0]
row = overlap_rows[overlap_rows['CCC_GIS_ID'] == CCC_GIS_ID].iloc[0]
cyanopond_feature_copy = copy.deepcopy(cyanopond_feature)
cyanopond_feature_copy.attributes['Cyano_Status'] = row['cyano_risk_tier']
resp = cyanoponds_lyr.edit_features(updates=[cyanopond_feature_copy])
print(f"Attempted to update Cyano_Status for {CCC_GIS_ID} to {row['cyano_risk_tier']}. Response: {resp}")
I tried that and I'm still getting the circular reference detected error. I also tried a few suggestions from ChatGPT but I continue to get this same message.
I fixed the circular error by replacing blank values with NA in the below script:
def update(cyanoponds_features, overlap_rows, cyanoponds_lyr):
overlap_rows['cyano_risk_tier'] = overlap_rows['cyano_risk_tier'].fillna('No Updated Information')
overlap_rows['town_actions'] = overlap_rows['town_actions'].fillna('<NA>')
overlap_rows['town_action_notes'] = overlap_rows['town_action_notes'].fillna('<NA>')
overlap_rows['prior_activity'] = overlap_rows['prior_activity'].fillna('<NA>')
for CCC_GIS_ID in overlap_rows['CCC_GIS_ID']:
try:
cyanopond_feature = [f for f in cyanopond_features if f.attributes['CCC_GIS_ID'] == CCC_GIS_ID][0]
row = overlap_rows[overlap_rows['CCC_GIS_ID'] == CCC_GIS_ID].iloc[0]
# Sanitize the content before updating the feature attributes
notes_content = html.escape(row['town_action_notes']) if row['town_action_notes'] else '<NA>'
prior_content = html.escape(row['prior_activity']) if row['prior_activity'] else '<NA>'
# Create a copy of attributes only to avoid circular reference issues
updated_attributes = {
'Sample_Date': row['fluorometry_date'],
'Cyano_Status': row['cyano_risk_tier'],
'Recent_Activity': prior_content,
'Town_action': row['town_actions'],
'Notes': notes_content,
# Add other fields as necessary
}
cyanopond_feature_copy = copy.deepcopy(cyanopond_feature)
cyanopond_feature_copy.attributes.update(updated_attributes)
resp = cyanoponds_lyr.edit_features(updates=[cyanopond_feature_copy])
print (f"Attempted to update Cyano_Status for {CCC_GIS_ID} to {row['cyano_risk_tier']}. Response: {resp}")
print (resp)
except Exception as e:
print(f"Error updating feature with CCC_GIS_ID {CCC_GIS_ID}: {e}")
update(cyanoponds_features, overlap_rows, cyanoponds_lyr)
Now I just need to add in the 30 days code and the microscopy/field data fields and it appears like this may finally be working completely!
For Issue #2: try this to get a list of CCC_GIS_ID values where there has not been a sample in 30 days:
def query_for_no_inspection_in_thirty_days(layer):
threshold_date = datetime.now() - timedelta(days=30)
query_result = layer.query(where=f"fluorometry_date < '{threshold_date}'", out_fields='CCC_GIS_ID')
ccc_gis_ids = [record.attributes['CCC_GIS_ID'] for record in query_result.features]
return ccc_gis_ids
no_fluorometry_in_30 = query_for_no_inspection_in_thirty_days(fluorsamples_lyr)
print(no_fluorometry_in_30)
Then you can loop over those CCC_GIS_ID values and update the ponds:
for CCC_GIS_ID in no_fluorometry_in_30:
cyanopond_feature = [f for f in cyanoponds_features if f.attributes['CCC_GIS_ID'] == CCC_GIS_ID]
cyanopond_feature.attributes['Cyano_Status'] = 'No updated information'
resp = cyanoponds_lyr.edit_features(updates=[cyanopond_feature])
Then do the same for the microscopy. Let me know how that works.