Yesterday (2021-07-29), beginning around 3pm EST, all the Survey123 submissions through Integromat that generate feature reports began failing with a timeout error. This was not a single survey, but multiple different surveys using different feature reports.
I went directly to the Survey123 web site and tried creating the reports manually from there. That failed too, and continued to fail all the way up until around 7:30pm EST when it began working again. There were 23 surveys backed up in the Integromat queue.
Clearly there was an issue with one of the hosting processes at ESRI that was failing to process these feature reports from the Survey123 web site since trying to generate any of them manually from the web site failed.
Three reports were skipped entirely: they never made it to Integromat because the trigger never executed, presumably because the Survey123 site was unreachable. I had to identify and process those manually.
Yet, the ArcGIS Online Health dashboard showed all green – no issues - and still does the following day.
That dashboard seems to be pretty much useless. It would be nice if it identified issues in real time as they happen, but it is even more frustrating when they never show up after the fact, as if there never was an issue.
I'm opening a case with ESRI on this issue to hopefully generate some action. We have many people depending on these reports and we hear about it immediately when they don't receive them. It is understood there will be outages and problems from time to time, but having no acknowledgement of the issue by ESRI so we can pass along the information while waiting for a resolution is unacceptable.
Just stopping by to say we had a similar issue (this morning I went into Integromat and was able to run it successfully). We have been pretty frustrated with Survey123/Integromat failing randomly.
I have several scenarios for different projects and I have issues almost weekly with each scenario. I also had scenarios stop yesterday afternoon. My most common errors are:
"Maximum number of repeats exceeded" - I think the solution to this (time will tell) is to increase the number of repeats in the scenario setting. I am under the impression that this happens when multiple surveys are submitted in short order, thus jamming up Integromat.
"The operation timed out" - I have not found a solution to this one. As the OP mentioned, it is likely due to a hosting issue.
"Missing value of required parameter 'URL'" - I have tried a few things to handle this error, but with limited success. My understanding is that the URL that I am requesting in an HTTP module to retrieve the PDF feature report (that gets uploaded to a OneDrive folder) is probably taking too long to generate and thus a URL is not being passed. I have tried adding a "Sleep" module to pause the scenario for 30-60 seconds to give the Create Report module time to process. That seems to work most of the time, but if paused too long, it will time out.
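For anyone scripting this step outside Integromat, one alternative to a fixed "Sleep" is to poll for the report URL with an overall deadline, so the wait ends as soon as the URL exists and never runs past the timeout. The following is only a minimal sketch of that idea: `wait_for_report_url` and the `get_url` callable are hypothetical names, not part of Integromat or the Survey123 API, and `get_url` is assumed to return the report URL once it is ready and `None` before that.

```python
import time
from typing import Callable, Optional


def wait_for_report_url(
    get_url: Callable[[], Optional[str]],
    timeout_s: float = 120.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Poll get_url until it returns a non-empty URL or timeout_s elapses.

    get_url is a hypothetical callable supplied by the caller; it should
    return the report URL when the report is ready, or None otherwise.
    Returns the URL, or None if the deadline passes first.
    """
    deadline = clock() + timeout_s
    while True:
        url = get_url()
        if url:
            return url
        remaining = deadline - clock()
        if remaining <= 0:
            # Give up so the caller can log, retry later, or alert someone.
            return None
        # Never sleep past the deadline.
        sleep(min(interval_s, remaining))
```

A caller would pass its own function that checks the report job and then treat a `None` result as "report not ready, handle the failure" rather than continuing with an empty URL. The default timeout and interval are arbitrary and would need tuning against how long the reports actually take.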
FYI - ESRI Support was helpful and responded to the case I opened with this reply:
"I can confirm that there were performance issues with generating reports in Survey123 yesterday from 3:15 PM to 7:00 PM EDT and I have attached this issue to our case.
In regards to the performance issue not appearing in the ArcGIS Online Health Dashboard, currently Survey123 is not a product listed in the Health Dashboard. However, I communicated your feedback with our internal resources and they have included your situation in a planning document for future improvements to the ArcGIS Online Health Dashboard.
We greatly appreciate your feedback and I completely understand your frustration with not seeing a public explanation for the issue your organization was experiencing. Your feedback will be a great help in improving our communication moving forward and I believe will help many users."
Hi @AdminAccount2 ,
Thank you for reaching out to us. I have also logged your feedback in the Survey123 team's backlog. We are considering providing the health status for the report service in a future release, and will keep you updated in this post.
In the meantime, would you mind sharing the Esri Support Service ticket number so that others can also attach their cases to prioritize this enhancement?
I appreciate you keeping me updated on any progress.
The ESRI Support Ticket information is:
Esri Case #02860695 - AGOL Survey123 Outage not reported on ESRI Health Dashboard