
Updating an attribute table of a Feature Layer using Python Notebook and elements from 2 Surveys

05-03-2024 06:27 AM
SamanthaAPCC
Emerging Contributor

Let me preface by saying that my experience with python is very limited and I am using elements from 2 codes to try to make this work.  I am using AGOL Python Notebook to write to a hosted feature layer using attributes from two Survey123 surveys.  Eventually I will need to schedule this code to run daily.  The code seems to be working and is printing my final message, but it is not writing to the attribute table of my cyanoponds layer.  Am I not using the correct python language?  I am very novice at python but this is our only option for the type of updates we need.  I am taking data from a survey and based on a unique ID (CCC_GIS_ID) I am looking for the most recent sample as well as the one with the highest cyano_status ranking.  It also appears to be skipping CCC_GIS_IDs when there is more than 1 sample at the pond location, which means I'm not incorporating lines 46-57 correctly.  I will also need to add a component that changes the status to "No Updated Information" if the most recent sample is older than 30 days, which I haven't quite figured out how to add in yet.

Also, when this code is running daily it will need to overwrite the table with the new information.

I took elements from this article in order to come up with the basis of this code.  The rest I wrote or used from a code that was handed down to me for this project back when it used to be run on a local machine.  Any help is greatly appreciated!!

from arcgis.gis import GIS
from arcgis.features import FeatureLayer
import pandas as pd
from arcgis.features import FeatureCollection
from datetime import datetime, timedelta

gis = GIS("home")

def update_CyanoData():
    
    one_day = datetime.today() - timedelta(days=1)
    string_day = one_day.strftime('%Y-%m-%d')
    where_query = f"fluorometry_date >= DATE '{string_day}'"
    where_querymf = f"microscopy_date >= DATE '{string_day}'"
    days_valid = 30 #How do I incorporate this so that the Cyano_Status field changes to 'No Updated Information' if all samples for that
    # pond are older than 30 days?
    
    cyanoponds = gis.content.get('a13f1b2e80ff4580a27620fc9d47f757')
    cyanoponds_lyr = cyanoponds.layers[0]
    cyanoponds_fset = cyanoponds_lyr.query()                              
    cyanoponds_features = cyanoponds_fset.features
    cyanoponds_sdf = cyanoponds_fset.sdf
       
    fluorsamples = gis.content.get('db342bf34b1742c0abb96d6ebff2f396') #Public facing View of the Fluorometry survey, final_review = yes
    fluorsamples_lyr = fluorsamples.layers[0]
    fluorsamples_fset = fluorsamples_lyr.query(where=where_query) 
    fluorsamples_features = fluorsamples_fset.features
    fluorsamples_sdf = fluorsamples_fset.sdf 
    
    #I will maybe need to use this survey to populate dominant genus, scum notes & pond temperatures if I can't do that with arcade and UniqueIDs in pop-ups
    microfieldsamples = gis.content.get('dc204479b5bf4f34a2efed96d0b2a340') #View of Joined Microscopy & Field samples
    microfieldsamples_lyr = microfieldsamples.layers[0]
    microfieldsamples_fset = microfieldsamples_lyr.query(where=where_querymf)
    microfieldsamples_features = microfieldsamples_fset.features
    microfieldsamples_sdf = microfieldsamples_fset.sdf    
    
    df = fluorsamples_sdf.sort_values(by=['fluorometry_date'], ascending=False)
    # Add cyano status rating to enable easy selection of the worst status for each pond later. This is relevant for cases where there
    # is more than one sampling location at a single pond. There are only a few ponds like this, but it won't cause any issues if
    # we add the rating for all ponds.
    df['cyano_rate'] = 0
    df.loc[df.cyano_risk_tier=="Acceptable", "cyano_rate"] = 1
    df.loc[df.cyano_risk_tier=="Potential for Concern", "cyano_rate"] = 2
    df.loc[df.cyano_risk_tier=="Use Restriction Warranted", "cyano_rate"] = 3
    
    # SELECT ONE MOST RECENT RECORD FOR EACH POND. First, create a function (get_maxdate) to get the most recent record from a group.
    def get_maxdate(group): return group[group.fluorometry_date==group.fluorometry_date.max()]
    # Group by pond ID and apply get_maxdate to get most recent record(s) for each pond.
    recent_pre = df.groupby("CCC_GIS_ID").apply(get_maxdate).reset_index(drop=True)
    # For ponds with more than one sample location, we'll assign them the worst status found at the pond that day.
    # First, create a function (get_worststatus) that gets the worst cyano status (based on cyano_rate) from a group.
    def get_worststatus(group): return group[group.cyano_rate==group.cyano_rate.max()]
    # Then, group by pond ID and apply get_worststatus function.
    recent_pre = df.groupby("CCC_GIS_ID").apply(get_worststatus).reset_index(drop=True)
    # Among the ponds with multiple sample locations, if a pond has the same cyano status at more than one location, it will have
    # subsetted more than one record for the pond. This code simply takes the first listed one.
    recent_pre.drop_duplicates(subset = ['CCC_GIS_ID'], keep = 'first', inplace = True)
    
    df["date_str"] = pd.to_datetime(df['fluorometry_date']).dt.strftime('%m/%d/%Y')
    
    #### FIND THE SECOND MOST RECENT RECORD FOR EACH POND. Create a new data frame that excludes the most recent records for 
    # each pond. Apply the above code to this subset, thereby finding the SECOND most recent record for each pond and collecting
    # them in a data frame (secondmostrecent).
    excludingrecent= df[~df.sampleID.isin(recent_pre.sampleID)]
    secmostrecent  = excludingrecent.groupby("CCC_GIS_ID").apply(get_maxdate).reset_index(drop=True)
    if secmostrecent.empty:
        secmostrecent=excludingrecent
    secmostrecent  = secmostrecent.groupby("CCC_GIS_ID").apply(get_worststatus).reset_index(drop=True)
    if secmostrecent.empty:
        secmostrecent=excludingrecent
    secmostrecent.drop_duplicates(subset = ['CCC_GIS_ID'], keep = 'first', inplace = True)
    # Create a field that lists the date and cyano_status from the second most recent record
    secmostrecent["secprior_sample"] = secmostrecent["date_str"] + ": " + secmostrecent["cyano_risk_tier"]
    secmostrecent = secmostrecent.loc[:,["CCC_GIS_ID","secprior_sample","sampleID"]]
    secmostrecent.rename(columns={"sampleID":"secsampleID"},inplace=True)
    
    #### FIND THE THIRD MOST RECENT RECORD FOR EACH POND. Create a new data frame that excludes the most recent and 2nd most recent
    # records for each pond. Apply the above code to this subset, thereby finding the THIRD most recent record for each pond and
    # collecting them in a data frame (thirdmost recent).
    excluding2recent = excludingrecent[~excludingrecent.sampleID.isin(secmostrecent.secsampleID)]
    thirdmostrecent  = excluding2recent.groupby("CCC_GIS_ID").apply(get_maxdate).reset_index(drop=True)
    if thirdmostrecent.empty:
        thirdmostrecent=excluding2recent
    thirdmostrecent  = thirdmostrecent.groupby("CCC_GIS_ID").apply(get_worststatus).reset_index(drop=True)
    if thirdmostrecent.empty:
        thirdmostrecent=excluding2recent
    thirdmostrecent.drop_duplicates(subset = ['CCC_GIS_ID'], keep = 'first', inplace = True)
    # Create a field that lists the date and cyano_status from the third most recent record
    thirdmostrecent["thirdprior_sample"] = thirdmostrecent["date_str"] + ": " + thirdmostrecent["cyano_risk_tier"]
    thirdmostrecent = thirdmostrecent.loc[:,["CCC_GIS_ID","thirdprior_sample","sampleID"]]
    thirdmostrecent.rename(columns={"sampleID":"thirdsampleID"},inplace=True)
    
    #### CREATE PRIOR ACTIVITY FIELD
    # Merge data frames recent_pre, secmostrecent, and thirdmostrecent together into "recent." Create field called prior_activity that
    # lists sample dates and statuses from the second and third most recent sampling dates for a pond.
    recent = pd.merge(recent_pre,secmostrecent,on="CCC_GIS_ID",how="outer"); recent = pd.merge(recent,thirdmostrecent,on="CCC_GIS_ID",how="outer")
    recent["prior_activity"] = recent["secprior_sample"] + ", " + recent["thirdprior_sample"].fillna("")
    recent["prior_activity"] = recent["prior_activity"].str.strip().str.rstrip(',') # <-- remove hanging comma and space at end, in cases where there is only 2nd most recent
    recent = recent.sort_values(by="fluorometry_date", ascending=False)
        
    overlap_rows = pd.merge(left = cyanoponds_sdf, right = recent, how = 'inner', on='CCC_GIS_ID' )
    cyanopond_features = cyanoponds_fset.features
    fluorsamples_updates = fluorsamples_fset.features
    fluorsamples_updates.reverse()
     
    def update(cyanopond, fluorometry):
        for CCC_GIS_ID in overlap_rows['CCC_GIS_ID']:
            try:
                cyanopond_feature = [f for f in cyanopond_features if f.attributes['CCC_GIS_ID']==CCC_GIS_ID][0]
                fluorsample_feature = [f for f in fluorsamples_features if f.attributes['CCC_GIS_ID']==CCC_GIS_ID][0]
                cyanopond_feature.attributes['Sample_Date'] = fluorsample_feature.attributes['fluorometry_date']
                cyanopond_feature.attributes['Cyano_Status']= fluorsample_feature.attributes['cyano_risk_tier']
                #cyanopond_feature.attributes['Recent_Activity']= I need this to populate with the 'prior_activity' column that we wrote to the 'recent' data frame 
                cyanopond_feature.attributes['Town_action']= fluorsample_feature.attributes['town_actions']
                cyanopond_feature.attrbutes['Notes']=fluorsample_feature.attributes['town_action_notes']
                #cyanopond_feature.attributes['Dominance']= eventually will need these three fields populated by microfieldsamples OR use UniqueID to populate in the pop-up
                #cyanopond_feature.attributes['Scum']=
                #cyanopond_feature.attirubtes['Water_Temp']= 
                cyanoponds_lyr.edit_features(updates=[cyanopond_feature])
                print(f"Updated {cyanopond_feature.attributes['CCC_GIS_ID']} cyano status to {cyanopond_feature.attributes['Cyano_Status']}",flush=True)
            except Exception as e:
                # Print exception details for debugging.
                print(f"Error updating feature with CCC_GIS_ID {CCC_GIS_ID}: {e}")
    update(cyanoponds_features, fluorsamples_updates)

update_CyanoData()

 

16 Replies
CodyPatterson
Frequent Contributor

Hey @SamanthaAPCC 

Here are some observations I made:

There are a few typos: cyanopond_feature.attrbutes['Notes'] should be cyanopond_feature.attributes['Notes'], and the commented-out cyanopond_feature.attirubtes['Water_Temp'] should be cyanopond_feature.attributes['Water_Temp'] if you are going to use that one.

I think you wanted to use Recent_Activity, but you aren't writing anything to it. You would need to assign the values of prior_activity to the Recent_Activity attribute of the cyanopond_feature.

secmostrecent and thirdmostrecent could be empty, I'd try printing them to check to see what they contain

I'd ensure that fluorsamples_fset has all the features. I saw that you used fluorsamples_updates as well, so I'd print both and make sure you're using the correct one.

I'd ensure that CCC_GIS_ID exists in both cyanoponds and fluorsamples

You're passing cyanoponds_features and fluorsamples_updates into your update function, but those variables aren't being used.

This is only some fixes for this one, I'm still looking into the other questions you had.

I'll continue looking through and see what I can find, but you're on the way to success surely!

Cody

SamanthaAPCC
Emerging Contributor

Thanks @CodyPatterson, this is all super helpful!

I fixed those spelling errors, thank you!  I haven't quite figured out where I'm going to pull in the related data from the other Survey (the microfieldsample view layer), but when I'm ready to tackle that this part was good to fix.

To pull in the data that I wrote to the recent["prior_activity"] column, how do I call that in the  def update?  Since it isn't a part of fluorsample_features I wasn't sure how to access that column. 

CCC_GIS_ID does exist in both cyanoponds and fluorsamples.

I was also confused by the cyanopond_features & fluorsamples_updates... I followed the example from here: https://community.esri.com/t5/arcgis-online-blog/easy-how-to-symbology-using-related-records/ba-p/89...  but I wasn't sure why they call those in their update function.

The edit_features should work to edit a hosted feature layer and write to an attribute table correct?  It's odd because I will get the correct printed message, but those corresponding fields in my attribute table remain empty even for the correctly printed ponds.

I printed fluorsamples_fset and fluorsamples_updates and realized that I am missing that pond that has the 2 sampling locations I am trying to test, which means my get_maxdate & get_worststatus are hopefully working correctly, there is just nothing there to test.  I think I need to review the top language about time/dates.  I again pulled that section from that article, but it appears to be skipping that pond because it was collected in the past, and the other test ponds have future dates.  In reality, all ponds will have been sampled for (today) or within the past 30 days, so I need to make sure that language is working correctly, and also change the dates on my test data to days within the last 30 days.
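The date window discussed above can be sketched as a small helper. This is only an illustration: the function name build_date_where is hypothetical, and it assumes the layer accepts the same DATE 'YYYY-MM-DD' literal used in the original where_query. Widening days_back from 1 to 30 is what lets samples collected earlier in the month come back from the query.

```python
from datetime import date, timedelta


def build_date_where(field_name, days_back, today=None):
    """Return a where clause selecting rows from the last `days_back` days.

    `today` is only a parameter so the result can be checked with a fixed date.
    """
    today = today or date.today()
    cutoff = today - timedelta(days=days_back)
    # Same DATE literal style as the where_query in the post.
    return f"{field_name} >= DATE '{cutoff:%Y-%m-%d}'"


# A 30-day window instead of the 1-day window in the first version of the code.
where_query = build_date_where("fluorometry_date", 30)
print(where_query)
```

The resulting string can be passed as the where argument of the layer query, in the same way where_query is used in the post.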

JohnEvans6
Regular Contributor

What response are you getting when you run the update? can you add in:

 

resp = cyanoponds_lyr.edit_features(updates=[cyanopond_feature])
print(resp)

 

It's sending out an update and getting a response back successfully, but the response is probably saying something like "hey, I can't find this record, so nothing was updated"; it's not throwing an actual exception. I've wrestled with this before. The way I fixed it was to create a template feature based on the fields I'm updating (in my example I'm only making things Active or Inactive), set the attributes I want to send along with the OBJECTID of the feature to update, and then fire them off. Row is just the row of data in my dataframe.

 

import copy

features_to_update = []

def create_update_feature(row, objid) :    
    
    # Establish Feature Template
    template_feature = {"attributes": {"OBJECTID": '', "active": ''}}

    # Copy Template to a new feature var
    update_feature = copy.deepcopy(template_feature)
    
    # assign the updated values
    update_feature['attributes']["OBJECTID"] = int(objid)
    update_feature['attributes']["active"]=row["active"]

    # Return the update feature
    return update_feature

# row and update_objid come from your own loop over the dataframe
update_feature = create_update_feature(row, update_objid)
features_to_update.append(update_feature)

 

Then you would do something like

 

cyanoponds_lyr.edit_features(updates=features_to_update)

 

 

This helped me out a lot:

https://developers.arcgis.com/python/guide/editing-features/

SamanthaAPCC
Emerging Contributor

Thanks @JohnEvans6 !  I haven't tried your template feature text yet, but this is the message I get with the response text:

 

Updated OR-176 cyano status to Acceptable
{'addResults': [], 'updateResults': [{'objectId': 29, 'uniqueId': 29, 'globalId': None, 'success': False, 'error': {'code': 1000, 'description': 'Operand type clash: bigint is incompatible with date'}}], 'deleteResults': []}
Updated OR-153 cyano status to Use Restriction Warranted
{'addResults': [], 'updateResults': [{'objectId': 57, 'uniqueId': 57, 'globalId': None, 'success': False, 'error': {'code': 1000, 'description': 'Operand type clash: bigint is incompatible with date'}}], 'deleteResults': []}
Updated BA-1171 cyano status to Acceptable
{'addResults': [], 'updateResults': [{'objectId': 126, 'uniqueId': 126, 'globalId': None, 'success': False, 'error': {'code': 1000, 'description': 'Operand type clash: bigint is incompatible with date'}}], 'deleteResults': []}

I'm not sure what the error means... something is going on with my date fields.  In the Feature Layer that I'm trying to write to (cyanoponds) the Sample_Date is a DateOnly field... could that be the issue? 
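Because the "Updated ..." message prints whether or not the service accepted the edit, it can help to inspect the response before reporting success. Below is a minimal sketch that walks a response dictionary shaped like the one printed above; the helper name failed_edits is hypothetical, and the sample resp is copied from the first result in that output.

```python
def failed_edits(resp):
    """Collect (objectId, error description) for every edit the service rejected."""
    failures = []
    for key in ("addResults", "updateResults", "deleteResults"):
        for result in resp.get(key, []):
            if not result.get("success", False):
                error = result.get("error") or {}
                failures.append((result.get("objectId"), error.get("description")))
    return failures


# Response copied from the printed output for OR-176.
resp = {
    "addResults": [],
    "updateResults": [
        {
            "objectId": 29,
            "uniqueId": 29,
            "globalId": None,
            "success": False,
            "error": {
                "code": 1000,
                "description": "Operand type clash: bigint is incompatible with date",
            },
        }
    ],
    "deleteResults": [],
}

for object_id, description in failed_edits(resp):
    print(f"Edit failed for OBJECTID {object_id}: {description}")
```

Printing the success message only when failed_edits returns an empty list would keep the log consistent with what actually reached the attribute table.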

JohnEvans6
Regular Contributor

Sure seems that way; you're awfully close. Quickest way to check would be to comment out the date and see if success: true comes back and your updates show.

# cyanopond_feature.attributes['Sample_Date'] = fluorsample_feature.attributes['fluorometry_date']

From there you can figure out what's going on; the date value you are sending is likely not in a format AGOL accepts for that field.

https://doc.arcgis.com/en/arcgis-online/manage-data/work-with-date-fields.htm

This link has a table of what your timestamp should look like. You'll probably have to mess with your ['Sample_Date'] column a little bit to get it right.
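As a rough illustration of that conversion: date attributes returned by a feature layer query normally arrive as epoch milliseconds, which would match the "bigint" in the error. The sketch below turns such a value into a 'YYYY-MM-DD' string. It assumes the date-only field accepts an ISO date string, and the helper name epoch_ms_to_date_string is hypothetical. It also interprets the timestamp in UTC, so a sample recorded late in the day local time could land on a neighbouring calendar day.

```python
from datetime import datetime, timezone


def epoch_ms_to_date_string(epoch_ms):
    """Convert epoch milliseconds to a 'YYYY-MM-DD' string (UTC), or None if empty."""
    if epoch_ms is None:
        return None
    moment = datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d")


# 1714694400000 ms is midnight UTC on 3 May 2024.
print(epoch_ms_to_date_string(1714694400000))
```

In the update loop this would wrap the value read from fluorsample_feature.attributes['fluorometry_date'] before it is assigned to Sample_Date.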

SamanthaAPCC
Emerging Contributor

@JohnEvans6  that was it!  Commenting out that line fixed the issue of my layer not being updated!  Now to figure out how to fix the date... I will mess with my column and check out that link.  And next to add in the other features I need this code to do for me... but this was a huge win!

JohnEvans6
Regular Contributor

Awesome. After a quick glance, it's probably something like changing the string format of the date, for example from

df["date_str"] = pd.to_datetime(df['fluorometry_date']).dt.strftime('%m/%d/%Y')

to

df["date_str"] = pd.to_datetime(df['fluorometry_date']).dt.strftime('%Y-%m-%d')
SamanthaAPCC
Emerging Contributor

@JohnEvans6 I fixed the date issue by changing my Sample_Date field to a traditional 'date' instead of a 'date only'.  So now my Date, Cyano Status, and Town Action fields are being written to in my cyanoponds layer!

The issues I'm still having are:

  1. It is not connecting the 'cyano_rate' to the correct 'Cyano_Status'.  If I print my 'recent' data frame (or my overlap_rows), I can see the highest cyano rate being selected, but it does not appear to be using that when I get down to my update function.  So for example if a pond has 2 sample sites and one site is 'Acceptable' and the other is 'Use Restriction Warranted' it is writing 'Acceptable' instead of the higher ranking 'Use Restriction Warranted'.
  2. I need to write somewhere that the Cyano_Status will be 'No Updated Information' if there is not a sample within 30 days.  Hopefully this isn't too difficult to plug in somewhere, but I'm having trouble writing the code to make this work.
  3. I need to try to write the prior_activity field.  Again, I can see the information pulling in correctly to my overlap_rows, I'm just not sure how to connect that to writing to the cyanoponds layer in my Update.
  4. I would like to pull in some other information from another survey (microfieldsamples) to write in my remaining fields (I have them commented out at the bottom of the Update).  I could connect these via what we call a UniqueID that is the same across all of my surveys, so if the selected row is being pulled from fluorsamples - we can find the corresponding information for Dominance, Scum & Water Temp via that UniqueID.  Again, I'm not sure how to do this in the Update or if I need to write some additional code before the Update.
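One possible way to approach item 2 is sketched below with pandas. The inner merge that builds overlap_rows drops any pond with no sample in the query window, so those ponds never reach the update step. A left merge keeps every pond, and a pond whose latest sample is missing or older than the cutoff can then be given the fallback status. The function name flag_stale_ponds and the small test frames are hypothetical; the column names are the ones used in the post.

```python
import pandas as pd


def flag_stale_ponds(ponds, recent, today, days_valid=30):
    """Return one row per pond with Cyano_Status, falling back when data is stale."""
    cutoff = pd.Timestamp(today) - pd.Timedelta(days=days_valid)
    # Left merge keeps ponds that have no recent sample at all.
    merged = ponds[["CCC_GIS_ID"]].merge(
        recent[["CCC_GIS_ID", "fluorometry_date", "cyano_risk_tier"]],
        on="CCC_GIS_ID",
        how="left",
    )
    stale = merged["fluorometry_date"].isna() | (merged["fluorometry_date"] < cutoff)
    # Keep the sampled status where it is fresh, otherwise use the fallback text.
    merged["Cyano_Status"] = merged["cyano_risk_tier"].where(~stale, "No Updated Information")
    return merged[["CCC_GIS_ID", "fluorometry_date", "Cyano_Status"]]


# Made-up data for illustration only.
ponds = pd.DataFrame({"CCC_GIS_ID": ["OR-176", "OR-153", "BA-1171"]})
recent = pd.DataFrame({
    "CCC_GIS_ID": ["OR-176", "OR-153"],
    "fluorometry_date": pd.to_datetime(["2024-05-01", "2024-03-01"]),
    "cyano_risk_tier": ["Acceptable", "Use Restriction Warranted"],
})
result = flag_stale_ponds(ponds, recent, today="2024-05-03")
print(result)
```

For this to catch ponds whose last sample is older than 30 days, the query window would need to reach further back than 30 days, or the missing-sample case alone can serve as the trigger as it does for BA-1171 here.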

I'm also tagging @JosephRhodes2 since he has been following along, and this thread contains my most recent updates to the code.  Below is my most updated version.

def update_CyanoData():
    
    thirty_days = datetime.today() - timedelta(days=30)
    string_day = thirty_days.strftime('%Y-%m-%d')
    where_query = f"fluorometry_date >= DATE '{string_day}'"
    where_querymf = f"microscopy_date >= DATE '{string_day}'"
    days_valid = 30 #How do I incorporate this so that the Cyano_Status field changes to 'No Updated Information' if all samples for that
    # pond are older than 30 days?
    
    cyanoponds = gis.content.get('a13f1b2e80ff4580a27620fc9d47f757')
    cyanoponds_lyr = cyanoponds.layers[0]
    cyanoponds_fset = cyanoponds_lyr.query()                              
    cyanoponds_features = cyanoponds_fset.features
    cyanoponds_sdf = cyanoponds_fset.sdf
       
    fluorsamples = gis.content.get('db342bf34b1742c0abb96d6ebff2f396') #Public facing View of the Fluorometry survey, final_review = yes
    fluorsamples_lyr = fluorsamples.layers[0]
    fluorsamples_fset = fluorsamples_lyr.query(where=where_query) 
    fluorsamples_features = fluorsamples_fset.features
    fluorsamples_sdf = fluorsamples_fset.sdf 
        
    #I will maybe need to use this survey to populate dominant genus, scum notes & pond temperatures if I can't do that with arcade and UniqueIDs in pop-ups
    microfieldsamples = gis.content.get('dc204479b5bf4f34a2efed96d0b2a340') #View of Joined Microscopy & Field samples
    microfieldsamples_lyr = microfieldsamples.layers[0]
    microfieldsamples_fset = microfieldsamples_lyr.query(where=where_querymf)
    microfieldsamples_features = microfieldsamples_fset.features
    microfieldsamples_sdf = microfieldsamples_fset.sdf    
    
    df = fluorsamples_sdf.sort_values(by=['fluorometry_date'], ascending=False)
    # Add cyano status rating to enable easy selection of the worst status for each pond later. This is relevant for cases where there
    # is more than one sampling location at a single pond. There are only a few ponds like this, but it won't cause any issues if
    # we add the rating for all ponds.
    df['cyano_rate'] = 0
    df.loc[df.cyano_risk_tier=="Acceptable", "cyano_rate"] = 1
    df.loc[df.cyano_risk_tier=="Potential for Concern", "cyano_rate"] = 2
    df.loc[df.cyano_risk_tier=="Use Restriction Warranted", "cyano_rate"] = 3
        
    # SELECT ONE MOST RECENT RECORD FOR EACH POND. First, create a function (get_maxdate) to get the most recent record from a group.
    def get_maxdate(group): return group[group.fluorometry_date==group.fluorometry_date.max()]
    # Group by pond ID and apply get_maxdate to get most recent record(s) for each pond.
    recent_pre = df.groupby("CCC_GIS_ID").apply(get_maxdate).reset_index(drop=True)
    # For ponds with more than one sample location, we'll assign them the worst status found at the pond that day.
    # First, create a function (get_worststatus) that gets the worst cyano status (based on cyano_rate) from a group.
    def get_worststatus(group): return group[group.cyano_rate==group.cyano_rate.max()]
    # Then, group by pond ID and apply get_worststatus function.
    recent_pre = df.groupby("CCC_GIS_ID").apply(get_worststatus).reset_index(drop=True)
    # Among the ponds with multiple sample locations, if a pond has the same cyano status at more than one location, it will have
    # subsetted more than one record for the pond. This code simply takes the first listed one.
    recent_pre.drop_duplicates(subset = ['CCC_GIS_ID'], keep = 'first', inplace = True)
    #print (recent_pre)
    
    df["date_str"] = pd.to_datetime(df['fluorometry_date']).dt.strftime('%m/%d/%Y')
    
    #### FIND THE SECOND MOST RECENT RECORD FOR EACH POND. Create a new data frame that excludes the most recent records for 
    # each pond. Apply the above code to this subset, thereby finding the SECOND most recent record for each pond and collecting
    # them in a data frame (secondmostrecent).
    excludingrecent= df[~df.sampleID.isin(recent_pre.sampleID)]
    secmostrecent  = excludingrecent.groupby("CCC_GIS_ID").apply(get_maxdate).reset_index(drop=True)
    if secmostrecent.empty:
        secmostrecent=excludingrecent
    secmostrecent  = secmostrecent.groupby("CCC_GIS_ID").apply(get_worststatus).reset_index(drop=True)
    if secmostrecent.empty:
        secmostrecent=excludingrecent
    secmostrecent.drop_duplicates(subset = ['CCC_GIS_ID'], keep = 'first', inplace = True)
    # Create a field that lists the date and cyano_status from the second most recent record
    secmostrecent["secprior_sample"] = secmostrecent["date_str"] + ": " + secmostrecent["cyano_risk_tier"]
    secmostrecent = secmostrecent.loc[:,["CCC_GIS_ID","secprior_sample","sampleID"]]
    secmostrecent.rename(columns={"sampleID":"secsampleID"},inplace=True)
    
    #### FIND THE THIRD MOST RECENT RECORD FOR EACH POND. Create a new data frame that excludes the most recent and 2nd most recent
    # records for each pond. Apply the above code to this subset, thereby finding the THIRD most recent record for each pond and
    # collecting them in a data frame (thirdmost recent).
    excluding2recent = excludingrecent[~excludingrecent.sampleID.isin(secmostrecent.secsampleID)]
    thirdmostrecent  = excluding2recent.groupby("CCC_GIS_ID").apply(get_maxdate).reset_index(drop=True)
    if thirdmostrecent.empty:
        thirdmostrecent=excluding2recent
    thirdmostrecent  = thirdmostrecent.groupby("CCC_GIS_ID").apply(get_worststatus).reset_index(drop=True)
    if thirdmostrecent.empty:
        thirdmostrecent=excluding2recent
    thirdmostrecent.drop_duplicates(subset = ['CCC_GIS_ID'], keep = 'first', inplace = True)
    # Create a field that lists the date and cyano_status from the third most recent record
    thirdmostrecent["thirdprior_sample"] = thirdmostrecent["date_str"] + ": " + thirdmostrecent["cyano_risk_tier"]
    thirdmostrecent = thirdmostrecent.loc[:,["CCC_GIS_ID","thirdprior_sample","sampleID"]]
    thirdmostrecent.rename(columns={"sampleID":"thirdsampleID"},inplace=True)
    
    #### CREATE PRIOR ACTIVITY FIELD
    # Merge data frames recent_pre, secmostrecent, and thirdmostrecent together into "recent." Create field called prior_activity that
    # lists sample dates and statuses from the second and third most recent sampling dates for a pond.
    recent = pd.merge(recent_pre,secmostrecent,on="CCC_GIS_ID",how="outer"); recent = pd.merge(recent,thirdmostrecent,on="CCC_GIS_ID",how="outer")
    recent["prior_activity"] = recent["secprior_sample"] + ", " + recent["thirdprior_sample"].fillna("")
    recent["prior_activity"] = recent["prior_activity"].str.strip().str.rstrip(',') # <-- remove hanging comma and space at end, in cases where there is only 2nd most recent
    recent = recent.sort_values(by="fluorometry_date", ascending=False)
    print (recent)
    
    overlap_rows = pd.merge(left = cyanoponds_sdf, right = recent, how = 'inner', on='CCC_GIS_ID' )
    print (overlap_rows)
    cyanopond_features = cyanoponds_fset.features
    fluorsamples_updates = fluorsamples_fset.features
    fluorsamples_updates.reverse()
         
    def update(cyanopond, fluorometry):
        for CCC_GIS_ID in overlap_rows['CCC_GIS_ID']:
            try:
                cyanopond_feature = [f for f in cyanopond_features if f.attributes['CCC_GIS_ID']==CCC_GIS_ID][0]
                fluorsample_feature = [f for f in fluorsamples_features if f.attributes['CCC_GIS_ID']==CCC_GIS_ID][0]
                #recent_feature = [f for f in recent if f.attributes['CCC_GIS_ID']==CCC_GIS_ID][0] how do I use this to call for the prior_activity below
                cyanopond_feature.attributes['Sample_Date'] = fluorsample_feature.attributes['fluorometry_date']
                cyanopond_feature.attributes['Cyano_Status']= fluorsample_feature.attributes['cyano_risk_tier']
                #cyanopond_feature.attributes['Recent_Activity']= recent_feature.attributes['prior_activity'] still need to fix this
                cyanopond_feature.attributes['Town_action']= fluorsample_feature.attributes['town_actions']
                cyanopond_feature.attributes['Notes']=fluorsample_feature.attributes['town_action_notes']
                #cyanopond_feature.attributes['Dominance']= eventually will need these three fields populated by microfieldsamples OR use UniqueID to populate in the pop-up
                #cyanopond_feature.attributes['Scum']=
                #cyanopond_feature.attributes['Water_Temp']= 
                resp = cyanoponds_lyr.edit_features(updates=[cyanopond_feature])
                print(f"Updated {cyanopond_feature.attributes['CCC_GIS_ID']} cyano status to {cyanopond_feature.attributes['Cyano_Status']}",flush=True)
                print (resp)
            except Exception as e:
                # Print exception details for debugging.
                print(f"Error updating feature with CCC_GIS_ID {CCC_GIS_ID}: {e}")
    update(cyanoponds_features, fluorsamples_updates)
    
update_CyanoData()

  

JosephRhodes2
Frequent Contributor

Hi Samantha, regarding Issues #1 and #3: It's hard to tell for sure without access to your data, but it looks to me like you're writing a value from the fluorsample layer to the cyanoponds layer to populate Cyano_Status, instead of pulling the value from the dataframe you've built. Same for prior_activity.

This might put you on the right track (let me know if I misunderstood something):

 

def update_cyano_status(cyanoponds_features, overlap_rows, cyanoponds_lyr):
    for CCC_GIS_ID in overlap_rows['CCC_GIS_ID']:
        cyanopond_feature = [f for f in cyanoponds_features if f.attributes['CCC_GIS_ID'] == CCC_GIS_ID][0]
        row = overlap_rows[overlap_rows['CCC_GIS_ID'] == CCC_GIS_ID].iloc[0] # get the matching row from the dataframe
        cyanopond_feature.attributes['Cyano_Status'] = row['cyano_risk_tier'] # add the matching status value from the dataframe to the update dictionary
        resp = cyanoponds_lyr.edit_features(updates=[cyanopond_feature]) # call the update
        print(f"Attempted to update Cyano_Status for {CCC_GIS_ID} to {row['cyano_risk_tier']}. Response: {resp}")
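A related detail in the posted function: the second groupby starts again from df rather than from recent_pre, so the most-recent filter appears to be discarded and the worst status in the whole query window is kept instead. A minimal sketch of one alternative is below. It sorts by pond, newest date, then highest rating, and keeps the first row per pond. The names RISK_RANK and latest_worst_per_pond and the sample rows are hypothetical; the columns and tier labels come from the post.

```python
import pandas as pd

# Rating used only to order the tiers from best to worst.
RISK_RANK = {
    "Acceptable": 1,
    "Potential for Concern": 2,
    "Use Restriction Warranted": 3,
}


def latest_worst_per_pond(df):
    """Keep, for each pond, the worst-status row from its most recent sample date."""
    ranked = df.assign(cyano_rate=df["cyano_risk_tier"].map(RISK_RANK).fillna(0))
    ordered = ranked.sort_values(
        ["CCC_GIS_ID", "fluorometry_date", "cyano_rate"],
        ascending=[True, False, False],
    )
    return ordered.drop_duplicates(subset="CCC_GIS_ID", keep="first").reset_index(drop=True)


# Made-up samples: OR-176 has two sites on its latest date, BA-1171 has an older bad result.
samples = pd.DataFrame({
    "CCC_GIS_ID": ["OR-176", "OR-176", "OR-176", "BA-1171", "BA-1171"],
    "fluorometry_date": pd.to_datetime(
        ["2024-05-01", "2024-05-01", "2024-04-01", "2024-04-20", "2024-05-02"]
    ),
    "cyano_risk_tier": [
        "Acceptable",
        "Use Restriction Warranted",
        "Potential for Concern",
        "Use Restriction Warranted",
        "Acceptable",
    ],
})
latest = latest_worst_per_pond(samples)
print(latest[["CCC_GIS_ID", "fluorometry_date", "cyano_risk_tier"]])
```

The returned frame has one row per pond, so its cyano_risk_tier and any other column merged onto it (such as prior_activity) can be read by row in the update loop, as in the function above.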

 
