
recent increase in timeouts from AGOL hosted feature service

a month ago
coryeicher
Frequent Contributor

Starting 2 days ago we began seeing an increase in timeouts from an AGOL hosted feature service.

Details:

We use an AGOL feature service to support project operations. We use the ArcGIS Python API method `FeatureService.ExtractChanges()` to return updates for 1 layer (so updates only, not adds or deletes).

Our process has been operational for about 1.5 years. Occasionally this API call times out (takes more than 3 minutes); in steady state this happens 1-2 times a week.

However, starting 2026-04-28, timeouts became much more frequent. We still see successes, but we now see timeouts 5-6 times a day.

Questions:

  • Who else is experiencing increased timeouts?
  • What recommendations are there for addressing this on our end?
  • What can Esri do to diagnose further?

Info: We obtain the input "server_gens" (start/end states) from feature layer webhooks. My one thought is to drop and re-create the webhooks, but those are upstream of the issue, so doing that is unlikely to help.
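For anyone who wants to reproduce the request outside the Python API (for example from Postman), here is a rough sketch of the request parameters for an updates-only call on one layer. The helper name `build_extract_changes_params` is hypothetical, and the parameter names (`layers`, `layerServerGens`, `returnInserts`, `returnUpdates`, `returnDeletes`, `dataFormat`) reflect my understanding of the REST `extractChanges` operation, so please verify them against the Esri documentation before relying on this.

```python
import json
from typing import Dict


def build_extract_changes_params(
    layer_id: int, min_server_gen: int, server_gen: int
) -> Dict[str, str]:
    """Build form parameters for an updates-only extractChanges request.

    Hypothetical helper: parameter names are assumed from the REST API docs.
    min_server_gen / server_gen are the start/end states from the webhook.
    """
    layer_server_gens = [
        {"id": layer_id, "minServerGen": min_server_gen, "serverGen": server_gen}
    ]
    return {
        "layers": json.dumps([layer_id]),
        "layerServerGens": json.dumps(layer_server_gens),
        "returnInserts": "false",   # no adds
        "returnUpdates": "true",    # updates only
        "returnDeletes": "false",   # no deletes
        "dataFormat": "json",
        "f": "json",
    }
```

The resulting dictionary would be sent as form data (together with a token) to the feature service's `extractChanges` endpoint.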

Thanks,

-Cory

 

CORY EICHER
www.eichcorp.com
cory@eichcorp.com
3 Replies
George_Thompson
Esri Notable Contributor

In these instances, I highly recommend reaching out to Technical Support to investigate further. They may be able to help understand and diagnose any issues with ArcGIS Online.

--- George T.
coryeicher
Frequent Contributor

Thanks! I am still hoping to hear from the community or Esri about any recently reported performance degradation for similar AGOL feature service requests.

 

 

coryeicher
Frequent Contributor

Update: We submitted an incident with Esri support.

We collected and submitted data from our GCP system logs (from which we call `/extract_changes`). The data showed that the timeout frequency increased after April 28: the failure rate was above 25% during the week or so following that date, and much lower before it. The Esri team has given no explanation for the increase.
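In case it helps others measure the same thing, this is roughly how a before/after failure rate can be computed from call logs. It is a minimal sketch: the record shape (a timestamp plus a timed-out flag) and the function name `failure_rates` are assumptions for illustration, not our actual log schema.

```python
from datetime import datetime
from typing import Iterable, Tuple


def failure_rates(
    records: Iterable[Tuple[datetime, bool]], cutoff: datetime
) -> Tuple[float, float]:
    """Return (rate_before, rate_after), each being timeouts / total calls.

    records: (call timestamp, timed_out) pairs; this shape is hypothetical.
    cutoff:  calls strictly before this instant count as "before".
    """
    counts = {"before": [0, 0], "after": [0, 0]}  # [timeouts, total]
    for timestamp, timed_out in records:
        bucket = "before" if timestamp < cutoff else "after"
        counts[bucket][0] += int(timed_out)
        counts[bucket][1] += 1

    def rate(timeouts: int, total: int) -> float:
        # An empty bucket reports 0.0 rather than dividing by zero.
        return timeouts / total if total else 0.0

    return rate(*counts["before"]), rate(*counts["after"])
```

Calling it with the parsed log records and `datetime(2026, 4, 28)` as the cutoff gives the two rates to compare.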

Our logic is triggered by Esri feature layer webhooks. Esri support suggested extending our webhook `recurrenceInfo.frequency` from 30 seconds to 60 seconds. After this change our failure rate dropped to ~3%. We are monitoring.

Esri subsequently recommended allowing `/extract_changes` to run longer than 3 minutes (we currently terminate our GCP process after 3 minutes). We plan to extend this limit and measure whether it reduces timeouts (I am not optimistic).

Lastly, we have also discussed with Esri support the possibility of implementing retries on our side. We may consider this at some point. We spot-checked some timed-out requests, and when re-run from Postman after the fact, they all succeeded, so failed requests do pass when retried later.
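If we do go the retry route, it would probably look something like the sketch below: a generic wrapper that retries a callable with exponential backoff and jitter. This is not what we run today. The name `call_with_retries`, the default delays, and the assumption that a timeout surfaces as `TimeoutError` are all placeholders; the real exception type depends on how the call is made.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 5.0,
    max_delay: float = 60.0,
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(), retrying on the given exceptions with backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff: base, 2*base, 4*base, ... capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Random jitter so parallel workers do not retry in lockstep.
            sleep(random.uniform(0, delay))
    raise ValueError("max_attempts must be at least 1")
```

Usage would be along the lines of `call_with_retries(lambda: do_extract_changes(...))`, where `do_extract_changes` stands in for whatever function performs the request. The `sleep` parameter is injectable only so the wrapper can be tested without waiting.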

Will update when I have more info.

-Cory
