How reliable are ArcGIS Online webhooks?
For example, a Survey123 webhook triggered by a feature being created, that uses Power Automate to send an email or create a feature in a different feature layer.
Can I rely on the webhooks to fire 100% of the time?
This isn’t strictly about reliability, but it talks about timing, which seems relevant:
Why is there such a big delay between a trigger and the action?
While some flows can be instant, there are many cases in which there is a delay between a trigger and an action and vice versa. For example, when using the ArcGIS connector in conjunction with a Microsoft SharePoint connector, there is a delay (one to three minutes) when processing information.
https://doc.arcgis.com/en/power-automate/latest/get-started/faqs.htm
A few related posts:
@Bud Thanks for your question.
All the triggers related to features/attachments being created, updated, or deleted are server-side triggers. Currently, the `when a survey response is submitted` trigger uses a client-side approach (which may change in the future) and is almost instant. The S123 trigger is reliable for the amount of usage it gets from all our customers. Yes, it has its shortcomings, and I think issue 5 mentioned above is related to having too many fields in the request, which Power Automate doesn't like.
In the past, we have had issues with server-side webhooks through ArcGIS Online being slow to trigger. Many of our users have previously complained that they didn't run reliably. Some of the issues you posted are associated with that, while some of them were related to outages in Azure and our connector back end last year. Others, in fact, were related to webhook performance.
That said, we have worked with the ArcGIS Online team, and they have resolved the performance issues in the recently released ArcGIS Online version 2026.1. This release internally overhauled the architecture of how the webhooks are triggered, reducing previous bottlenecks. From my testing, the current server-side webhooks trigger within 1-3 minutes. Please let us know if your experience is otherwise.
If this isn't an acceptable time frame, there are ways to bypass the webhooks altogether in automation. That would add some overhead in creating and maintaining the flows. It may or may not be for everyone.
So, can you rely on webhooks 100% of the time? I would say no technological system is flawless, and there will be bugs and errors. Just because we have self-driving cars doesn't mean we completely let go of the steering; well, at least not yet. I would recommend designing smart flows and thinking about how you can get notified and handle errors if and when they occur. This can include sending an email if the `Fetch updates, changes, deletion action` fails, or setting up retries on actions if they fail the first time, etc.
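To make the "retry, then notify" idea concrete outside of Power Automate, here is a minimal Python sketch. It is not how the connector or Power Automate implements retries; `run_with_retries`, `action`, and `notify` are hypothetical names introduced only for illustration, and the exponential backoff schedule is an assumption.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_retries(
    action: Callable[[], T],
    notify: Callable[[str], None],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `action`, retrying on failure; notify and re-raise if every attempt fails.

    `action` and `notify` are hypothetical callables supplied by the caller,
    e.g. "fetch changes" and "send an email".
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    last_error: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as exc:
            last_error = exc
            if attempt < attempts:
                # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
                sleep(base_delay * 2 ** (attempt - 1))

    # Every attempt failed: tell a human, then surface the original error.
    notify(f"Action failed after {attempts} attempts: {last_error!r}")
    assert last_error is not None
    raise last_error
```

The `sleep` parameter is only there so the delay can be swapped out in tests. In a real flow, the equivalent would be an action's retry policy plus a "run after has failed" branch that sends the email.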
Lastly, we are here to help and improve, so please keep posting your thoughts, questions and comments.
Here is a webhook troubleshooting guide our developer has put together, if you wish to narrow down some of the common problems we have seen in the past.
Thanks, Akshay.
On a side note, a contact had this take on webhooks:
I have used webhooks in the past, but I always used them slightly differently from how they were designed. I just use the webhook to trigger a workflow that uses a query; this way, you pick up things that misfired. I also run the workflow on a schedule to pick up things should it all go bad. I have had IT suddenly block the incoming, ESRI side broke, etc., and users never noticed. I had the workflow logging processed records and what fired. If you watch those, you quickly see that a webhook has broken.
I think they call what I do a tickler rather than a webhook. I just use it as a tickler even though it is designed to send the data in the hook.
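The "tickler" approach described above can be sketched roughly as follows. This is an illustration of the pattern, not the contact's actual workflow: `TicklerProcessor`, `query_records`, and `handle_record` are hypothetical names, and the in-memory set stands in for whatever persistent log of processed records a real workflow would keep.

```python
from typing import Callable, Hashable, Iterable, List, Set, Tuple


class TicklerProcessor:
    """Treat the webhook as a nudge: ignore its payload and query for unprocessed records."""

    def __init__(
        self,
        query_records: Callable[[], Iterable[Tuple[Hashable, object]]],
        handle_record: Callable[[object], None],
    ) -> None:
        # Hypothetical callable returning (record_id, record) pairs from the source layer.
        self.query_records = query_records
        # Hypothetical callable doing the real work (send an email, copy a feature, ...).
        self.handle_record = handle_record
        self.processed_ids: Set[Hashable] = set()
        # Each entry is (what fired the run, how many records it handled).
        self.log: List[Tuple[str, int]] = []

    def run(self, reason: str) -> int:
        """Process every record not yet handled, whatever triggered this run."""
        handled = 0
        for record_id, record in self.query_records():
            if record_id in self.processed_ids:
                continue
            self.handle_record(record)
            self.processed_ids.add(record_id)
            handled += 1
        self.log.append((reason, handled))
        return handled
```

Both the webhook endpoint and the scheduled job would call `run(...)`, with `"webhook"` or `"schedule"` as the reason. If the log shows scheduled runs regularly picking up records, that is a sign the webhook has stopped firing.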
A different contact said this:
I'm guessing (keyword) that the webhook is relatively reliable. I'd be more concerned about setting something up that depended on a connection to the outside world without some kind of client-side cache in place (e.g., what happens if an event occurs that would trigger an AGO webhook, but there isn't any network connection to fire off that event through?). Personally, when building mobile (or any disconnected sort of) systems, I always like to write new data to a local location and have a background thread that polls the data location and sends any required updates to the server system when a connection is available. The background thread's only job is to constantly try to get data from the local storage to the server, and no data is removed from the local storage until the successful transfer to the server is acknowledged. That may be overkill, but I don't like losing data.
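A rough Python sketch of that "write locally first, delete only after acknowledgement" idea, assuming a SQLite file as the local store. The `Outbox` class and the `send` callable are hypothetical; the background thread is reduced here to a `flush()` method that a thread or timer would call repeatedly.

```python
import sqlite3
from typing import Callable


class Outbox:
    """Local store of pending payloads; rows are deleted only after the server acknowledges them."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS outbox ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
        )
        self.db.commit()

    def save(self, payload: str) -> None:
        """Write new data locally first, regardless of connectivity."""
        self.db.execute("INSERT INTO outbox (payload) VALUES (?)", (payload,))
        self.db.commit()

    def pending(self) -> int:
        """Number of payloads still waiting to be delivered."""
        return self.db.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]

    def flush(self, send: Callable[[str], bool]) -> int:
        """Try to deliver pending payloads in order; return how many were acknowledged.

        `send` is a hypothetical callable that returns True once the server
        has acknowledged the payload, and may raise OSError when offline.
        """
        rows = self.db.execute("SELECT id, payload FROM outbox ORDER BY id").fetchall()
        sent = 0
        for row_id, payload in rows:
            try:
                acknowledged = send(payload)
            except OSError:
                acknowledged = False
            if not acknowledged:
                break  # Stop here, keep everything that is left, and try again later.
            self.db.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
            self.db.commit()
            sent += 1
        return sent
```

Stopping at the first failure keeps delivery in order, which may or may not matter for a given workflow. The destination should tolerate an occasional duplicate, since a payload can be delivered and then re-sent if the acknowledgement is lost.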
ArcGIS Architecture Center > Webhooks
Webhooks are generally considered an effective, but not foolproof integration method. As they are loosely coupled to the destination endpoint, there is no guarantee that a message from the source system will reach the destination. Issues like network outages, a failed endpoint, or a badly constructed message body could cause that message to fail.
This means that webhooks are not perfectly reliable, though not significantly different from any other REST API or endpoint which could suffer from the same reliability challenges.
Webhooks require line of sight network connectivity. When an event occurs, webhooks are sent immediately, and if any network access or outage interrupts this request, it can mean that the payload never reaches the final destination. While some systems support automatic retry logic to re-send a webhook payload, this cannot always be relied on, and may result in lost messages in inconsistent network conditions.
Webhooks do not guarantee delivery. While retry intervals and attempts can try to achieve a higher rate of success, webhooks do not guarantee that every event or trigger will result in data reaching the destination system. While this can be acceptable for many workflows, consider whether a post-process to validate data completeness may be required to guarantee that all events were properly processed.
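As an illustration of the "post-process to validate data completeness" suggestion, here is a minimal Python sketch of a reconciliation check. It assumes each event carries a stable identifier that can be listed on both sides; `reconcile` and `ReconciliationReport` are hypothetical names, not part of any ArcGIS or Power Automate API.

```python
from dataclasses import dataclass
from typing import Hashable, Iterable, List


@dataclass
class ReconciliationReport:
    missing: List[Hashable]     # present in the source, never arrived at the destination
    unexpected: List[Hashable]  # present at the destination, unknown to the source


def reconcile(
    source_ids: Iterable[Hashable], destination_ids: Iterable[Hashable]
) -> ReconciliationReport:
    """Compare the IDs on both sides and report what a webhook may have dropped."""
    source = set(source_ids)
    destination = set(destination_ids)
    return ReconciliationReport(
        missing=sorted(source - destination),
        unexpected=sorted(destination - source),
    )
```

Run on a schedule, a non-empty `missing` list is the cue to reprocess those records or to investigate whether webhook deliveries are failing.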
@Bud I am not sure what your contact means by misfires. Just to be clear, the ArcGIS Power Automate connectors have nothing to do with webhooks being slow, not triggering, or misfiring. We are open to investigating if they have a reproducible run.
I have seen many of our users successfully implement the use case you mentioned with our triggers. You can choose to use the methods that work best for you. Like I said, workflow design is crucial when it comes to Power Automate.
Please let me know if you have services that see a delay in webhook notifications. We have been scaling the webhook infrastructure to handle the load, but it would be good if you could share a URL of a service that sees a delay in webhook notifications. We would expect notifications to arrive within 1-5 minutes.
A related comment from @abureaux in Power Automate workflow using Survey123 Connector from 2023:
I have used Power Automate to automate S123 for a (newly) multinational company for the last ~2.5 years. I have ~300 automations set up for various purposes. A large number of those automations are rather complex, as they support high-level business operations.
I have encountered the odd issue/error that causes unexpected/unexplained failures, but some simple error handling auto-corrects those. The biggest hurdle I've encountered was when Esri released updates that broke things. They typically fix those mistakes within ~24 hours. Definitely the most stressful times...
Overall, I am quite happy with the level of redundancy, security, and reliability. While I haven't done an official uptime calculation, we are definitely way up there, since my record of downtime is almost nonexistent this year (beyond my regular EOM maintenance).
They really simplified the automation process for Enterprise this year as well, which is nice.