Unexplained behaviour when upserting a hosted feature layer in AGO with UI and API

StijnSchoutenTheGISLord · ‎11-07-2023

Hello everyone,

I'm a bit stumped on this one, especially by the fact that AGO returns zero errors when upserting new/updated features into a hosted feature layer. I'll try to be as thorough as possible:

I have a total of 31058 features I want to upload to a hosted feature layer. These features are split between two files (20027 and 11031 features respectively). I'm trying two methods: the API (preferred) and the UI (for testing). With both methods the total number of features after upserting is less than 31058, and I have no idea why...

Step 1:

In the attached .zip (it is all public data) there is a .json file called 'initial.json'. I'm using this as a sort of schema definition for the layer. I upload it to AGO and publish it as a hosted feature layer. So far so good.

(single feature)

Step 2a (UI Route):

I'm using the "Update Data" button of the hosted feature layer:

Uploading "update1.geojson" and setting the identifier as such:

And letting AGO do the mapping for me

I click <next> and select the option to update Attributes and Geometries

AGO will then process the file and mention that the layer has been successfully updated with 11031 new features:

When I check the Data tab it reads 11032 features (the 11031 new ones plus the single feature from 'initial.json'), so this is also going well:

I do the same for "update2.geojson", with the same success message:

So logically the total count should be 31058 right? Wrong:

I've checked that the OBJECTIDs are unique and that the geometries are valid, using JSON To Features in ArcGIS Pro.
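The uniqueness check can also be done outside ArcGIS Pro. Below is a minimal sketch, assuming the identifier is stored under a property named "OBJECTID" in each GeoJSON feature (the property name is an assumption, as are the helper names `load_geojson` and `id_report`). It reports identifiers that are missing or that occur more than once across all files combined, since an upsert treats a repeated identifier as an update rather than a new feature.

```python
import json
from collections import Counter


def load_geojson(path):
    """Read a GeoJSON FeatureCollection from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def id_report(collections, id_field="OBJECTID"):
    """Summarise identifier values across several FeatureCollections.

    id_field is an assumed property name; adjust it to the real data.
    """
    all_ids = [
        feature["properties"].get(id_field)
        for fc in collections
        for feature in fc["features"]
    ]
    counts = Counter(all_ids)
    return {
        "total_features": len(all_ids),
        "distinct_ids": len([i for i in counts if i is not None]),
        "missing_ids": counts.get(None, 0),
        # Identifiers seen more than once, within or across files.
        "repeated_ids": sorted(
            (i for i, n in counts.items() if n > 1 and i is not None), key=str
        ),
    }


if __name__ == "__main__":
    # Tiny in-memory stand-ins; real use would pass
    # [load_geojson("update1.geojson"), load_geojson("update2.geojson")].
    a = {"type": "FeatureCollection", "features": [
        {"type": "Feature", "properties": {"OBJECTID": 1}, "geometry": None},
        {"type": "Feature", "properties": {"OBJECTID": 2}, "geometry": None},
    ]}
    b = {"type": "FeatureCollection", "features": [
        {"type": "Feature", "properties": {"OBJECTID": 2}, "geometry": None},
        {"type": "Feature", "properties": {"OBJECTID": 3}, "geometry": None},
    ]}
    print(id_report([a, b]))
```

If `repeated_ids` is empty and `distinct_ids` equals `total_features`, the source files themselves are not the reason features go missing.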

Swapping the order (first update2 then update1) results in a slightly different total amount:

It fully uploads the first GeoJSON file and then only partially uploads the second one. So the problem does not seem to be the quality of the GeoJSON files but the order in which they are uploaded, which seems even stranger to me.

Step 2b (API Route):

The results are the same when using the ArcGIS API for Python:

import time

from arcgis import gis


def upsert_layer(gis_engine: gis.GIS, uploaded_item: str, layer: str):
    # Look up the target feature service by title; exactly one match is expected.
    search_result = gis_engine.content.search(query=f'title: "{layer}" AND type: "Feature Service"')
    if len(search_result) == 0:
        gis_engine.content.delete_items([uploaded_item])
        raise RuntimeError(f"Layer: {layer} cannot be found")
    if len(search_result) > 1:
        gis_engine.content.delete_items([uploaded_item])
        raise RuntimeError(f"More than 1 layer found for layer: {layer}")
    item = gis_engine.content.get(search_result[0].id)
    # Upsert the uploaded GeoJSON item, matching on OBJECTID, and print the job messages.
    print(item.layers[0].append(uploaded_item, upload_format='geojson', upsert=True, upsert_matching_field='OBJECTID', rollback=True, return_messages=True))
    time.sleep(2)  # trying not to hammer the server
    gis_engine.content.delete_items([uploaded_item])  # remove the temporary upload item

And the result of the print statement is:

(True, {'layerName': 'AE_GEODATA_PRODUCTS_OSM_ZONNEPARKEN_AGO_DEV_points', 'submissionTime': 1699350508877, 'lastUpdatedTime': 1699350520370, 'recordCount': 20027, 'status': 'Completed'})
(True, {'layerName': 'AE_GEODATA_PRODUCTS_OSM_ZONNEPARKEN_AGO_DEV_points', 'submissionTime': 1699350538790, 'lastUpdatedTime': 1699350546083, 'recordCount': 11031, 'status': 'Completed'})
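Because the append job reports 'Completed' either way, one way to make the discrepancy visible is to compare the reported 'recordCount' with the change in the layer's feature count. The following is a rough sketch under that assumption; the helper name `unexplained_records` is hypothetical, and the example numbers are the ones from the UI route above (1 feature before, 11032 after, 11031 submitted).

```python
def unexplained_records(count_before, count_after, append_messages):
    """Return how many submitted records did not show up as new features.

    append_messages is the dict from the append result, e.g.
    {'recordCount': 11031, 'status': 'Completed', ...}.
    """
    submitted = append_messages["recordCount"]
    added = count_after - count_before
    return submitted - added


if __name__ == "__main__":
    # First upload in the UI route: every submitted record became a new feature.
    print(unexplained_records(1, 11032, {"recordCount": 11031, "status": "Completed"}))
```

In the ArcGIS API for Python the two counts could be read with `layer.query(where="1=1", return_count_only=True)` before and after the append. A non-zero result does not necessarily mean records were dropped: with an upsert it may simply mean that some records matched existing features and were applied as updates.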

Any suggestions, or members willing to test this as well, are greatly appreciated.

timcneil · ‎11-07-2023

Hi StijnSchoutenTheGISLord,

I'm sorry to hear you're experiencing an issue.

Is this behaviour also observed when you match on a different field with a unique index that is not the OBJECTID?

timcneil · ‎11-07-2023

I ran through your repro case above, and if the ID field is used as the unique identifier (instead of system-generated OBJECTID), the number of features in the layer after upserting will be correct (31058).

As a best practice, we don't recommend using the OBJECTID field as a unique identifier for updating data. While it can work in some scenarios, there are some cases where the values will not be as expected. For example, if the layer is truncated the OBJECTIDs for the newly added features will not start at 1. They will pick up on the next value after the truncated feature OBJECTIDs.
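The truncation example can be illustrated with a toy model. This is only a sketch of the behaviour described above, not Esri's implementation: `ToyLayer` is a hypothetical class whose ID counter keeps counting after the layer is emptied.

```python
class ToyLayer:
    """Toy stand-in for a layer with a system-managed OBJECTID sequence."""

    def __init__(self):
        self.features = {}
        self._next_oid = 1

    def add(self, records):
        """Store records and return the OBJECTIDs the 'system' assigned."""
        assigned = []
        for record in records:
            oid = self._next_oid
            self._next_oid += 1
            self.features[oid] = record
            assigned.append(oid)
        return assigned

    def truncate(self):
        """Remove all features; the ID counter is deliberately not reset."""
        self.features.clear()


if __name__ == "__main__":
    layer = ToyLayer()
    first = layer.add(["a", "b", "c"])
    layer.truncate()
    second = layer.add(["a", "b", "c"])
    print(first, second)  # [1, 2, 3] [4, 5, 6]
```

In this model the same three records end up with different OBJECTIDs after a truncate, so OBJECTID values stored in a source file no longer line up with the ones in the layer. A field whose values are controlled by the data owner does not have this problem.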

MiguelParedes · ‎11-08-2023

Greetings @timcneil
I was not aware of the recommendation. Is this documented anywhere? If not, this could be included as an enhancement to the documentation, don't you think?

Miguel

timcneil · ‎11-08-2023

Hi @MiguelParedes,

We do plan to include this recommendation in the documentation. I have submitted an enhancement request.

Thanks,

Taylor

StijnSchoutenTheGISLord · ‎11-08-2023

Hi @timcneil & @MiguelParedes

Thank you very much for responding! I've changed the OBJECTID to some other identification and it worked:

Happy with result and response, thanks!!

I was not aware of the standard/recommendation, so it's nice to see the plan to include it in the documentation.
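For reference, the change that resolved the thread amounts to passing a different `upsert_matching_field` to the same `append` call. Here is a minimal sketch, assuming a field named "ID" that holds unique values and has a unique index on the layer; the field name and the wrapper `upsert_on_field` are assumptions, and `layer` is expected to be a feature layer object such as `item.layers[0]`.

```python
def upsert_on_field(layer, item_id, matching_field):
    """Upsert an uploaded GeoJSON item, matching on a user-managed field.

    Refuses OBJECTID, following the recommendation in this thread.
    """
    if matching_field.upper() == "OBJECTID":
        raise ValueError("Match on a user-managed unique field, not OBJECTID")
    # Same keyword arguments as the original call, apart from the matching field.
    return layer.append(
        item_id=item_id,
        upload_format="geojson",
        upsert=True,
        upsert_matching_field=matching_field,
        rollback=True,
        return_messages=True,
    )
```

A call would then look like `upsert_on_field(item.layers[0], uploaded_item, "ID")`, with the rest of the upload and clean-up logic unchanged.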