Select to view content in your preferred language

Joining a GDB table to a hosted feature service and updating 30,000 rows

77
1
Monday
marksm_macomb
Frequent Contributor

Hi all,

I'm a novice python user, so I appreciate any advice. I have a hosted feature layer with approx 30,000 features that I need to update 13 fields in by joining a table that exists in a file geodatabase. I am using the workflow provided in this community post : Solved: Calculating fields in hosted feature layers VERY S... - Esri Community

Here is the code block I have (TaxID_PIN is my join key, the rest are the fields I'm trying to update), I am running this in a python Notebook within ArcGIS Pro:

from arcgis import GIS
import arcpy

fc = r"Path\to\GDB\Table"

agol = GIS("home")

item = agol.content.get("HostedFeatureService_ItemID")

lyr = [lyr for lyr in item.layers if lyr.properties.name == "Water Meter"][0]

join_dict = {}

with arcpy.da.SearchCursor(fc, ["TaxID_PIN", "HistoricalGooseneckMaterial", "HistoricalMaintoCurbStop", "PublicMaterialSource", "HistoricalCurbStoptoHome", "PrivateMaterialSource", "HistoricalInteriorMaterial", "InterialMaterialSource", "FieldGooseneckMaterial", "FieldMaintoCurbStop", "FieldCurbStoptoHouse", "FieldInterior", "DateVerified", "MethodofVerification"]) as cursor:
    for row in cursor:
        join_dict[row[0]] = [row[1], row[2], row[3], row[4], row[5], row[6], row[7], row[8], row[9], row[10], row[11], row[12], row[13]]

query = lyr.query(where='OBJECTID<5000')

edit = [f for f in query.features]
for f in edit:
    join_key = f.attributes["TaxID_PIN"]

    if join_key in join_dict:
        f.attributes["HistoricalGooseneckMaterial"] = join_dict[join_key][0]
        f.attributes["HistoricalMaintoCurbStop"] = join_dict[join_key][1]
        f.attributes["PublicMaterialSource"] = join_dict[join_key][2]
        f.attributes["HistoricalCurbStoptoHome"] = join_dict[join_key][3]
        f.attributes["PrivateMaterialSource"] = join_dict[join_key][4]
        f.attributes["HistoricalInteriorMaterial"] = join_dict[join_key][5]
        f.attributes["InterialMaterialSource"] = join_dict[join_key][6]
        f.attributes["FieldGooseneckMaterial"] = join_dict[join_key][7]
        f.attributes["FieldMaintoCurbStop"] = join_dict[join_key][8]
        f.attributes["FieldCurbStoptoHouse"] = join_dict[join_key][9]
        f.attributes["FieldInterior"] = join_dict[join_key][10]
        f.attributes["DateVerified"] = join_dict[join_key][11]
        f.attributes["MethodofVerification"] = join_dict[join_key][12]

lyr.edit_features(updates=edit)

  

So I first ran it without the (where='OBJECTID<5000') query parameter and it threw a 413 error, so I figured it was too many rows to process and decided to try and do it in batches with the limited query. 

Once I put the limited query in, it didn't return a true "Error", but it did give me the message "Parameters are not valid for edit_features"

This is where I am stuck. Thank you!

0 Kudos
1 Reply
HaydenWelch
MVP Regular Contributor

You seem to have the general workflow down. The structure of the arcgis module can be difficult to comprehend sometimes. I broke your script out into a couple different functions to try and make the general flow easier to debug:

from typing import Any
import arcgis
import arcpy

def gather_updates(in_table: str, fields: list[str]) -> dict[str, dict[str, Any]]:
    """Map table values to a key field for use in updating a web FeatureLayer
    
    Args:
        in_table (str): The table to build an update mapping for
        fields (list[str]): A list of fields to pull values from (first value is used as the key)
        
    Returns:
        dict[str, dict[str, Any]]: A mapping of key values to row records ({'field': value, ...})
    """
    local_updates: dict[str, dict[str, Any]] = {}
    with arcpy.da.SearchCursor(in_table, fields) as cursor:
        for pin, *row in cursor:
            pin: str
            local_updates[pin] = dict(zip(cursor.fields[1:], row))
    return local_updates

def apply_updates(layer, key_field: str, updates: dict[str, dict[str, Any]], 
                  *,
                  where: str='1=1', 
                  batch_size: int=1000) -> list[Any]:
    """Apply updates to a layer filtered using the provided where clause
    
    Args:
        layer: An ArcGIS FeatureLayer object
        key_field: The field name to use for indexing into the update mapping
        where: A Query string for the layer
        updates: A mapping of record identifiers to desired row state (mapping of field to value)
    Returns:
        return value of the layer.edit_features call in a list (one per batch)
    """
    edits: list[Any] = []
    for feature in layer.query(where=where).features:
        attributes: dict[str, Any] = feature.attributes
        # Get the key for this feature
        join_key = attributes[key_field]
        
        # Skip rows with no update (look for the key in the updates)
        if (upd := updates.get(join_key)) is None:
            continue
        
        # Apply update to row
        updated = False
        for field_name, val in upd.items():
            if attributes[field_name] != val:
                attributes[field_name] = val
                updated = True
            
        # Add the update feature to the edit array
        if updated:
            edits.append(attributes)
    
    # Apply the edits in batches of batch_size to prevent errors
    return [
        layer.edit_features(updates=edits[batch:batch+batch_size]) 
        for batch in range(0, len(edits), batch_size)
    ]

def main():
    fc = r"Path\to\GDB\Table"
    agol = arcgis.GIS("home")
    item = agol.content.get("HostedFeatureService_ItemID")
    lyr = [lyr for lyr in item.layers if lyr.properties.name == "Water Meter"][0]
    
    fields = [
        "TaxID_PIN", # Key Field at index 0
        "HistoricalGooseneckMaterial", 
        "HistoricalMaintoCurbStop", 
        "PublicMaterialSource", 
        "HistoricalCurbStoptoHome", 
        "PrivateMaterialSource", 
        "HistoricalInteriorMaterial", 
        "InterialMaterialSource", 
        "FieldGooseneckMaterial", 
        "FieldMaintoCurbStop", 
        "FieldCurbStoptoHouse", 
        "FieldInterior", 
        "DateVerified", 
        "MethodofVerification",
    ]
    updates = gather_updates(fc, fields)
    results = apply_updates(lyr, fields[0], updates, batch_size=1000)
    for i, result in enumerate(results, start=1):
        print(f'Batch {i}: {result}')

if __name__ == '__main__':
    main()
0 Kudos