
Improve error response when an image service goes into a bad state

04-02-2024 12:23 PM
Status: Open
BradBerry_FMG
New Contributor II

Our organization uses image services as part of a geoenrichment API to fetch pixel attribute values for coordinate locations. Internal applications call the GetSamples operation with groups of coordinate point locations to obtain the corresponding pixel values. Since these applications are used to carry out critical business operations, the API must provide high reliability, performance, and accuracy.

We've published these image services from mosaic datasets that reside in file geodatabases stored on a Windows file server. Our Image Server architecture is Windows multi-machine, with 2 or 3 nodes depending on environment.

Over the last year we have encountered periodic, intermittent failures of the image service GetSamples calls. When the issue occurs, some responses succeed and others fail. We traced this to a single affected node: sending the same valid GetSamples request directly to each backend serverFQDN:6443/arcgis endpoint produced consistent failures from only one server and consistent successes from the healthy node(s). The failed response has HTTP status 200 with the following JSON message body:

		{
		  "error": {
		    "code": 400,
		    "extendedCode": -2147024809,
		    "message": "Invalid or missing input parameters.",
		    "details": [
		      "General function failure"
		    ]
		  }
		}
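The per-node check described above can be scripted. The following is a minimal sketch, assuming the standard REST path `/arcgis/rest/services/<service>/ImageServer/getSamples` on port 6443; the function names, node names, and service name are hypothetical and not part of our actual tooling. Because the failure arrives with HTTP 200, the script inspects the JSON body for an `error` object instead of relying on the status code.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_get_samples_url(node_fqdn, service_name, params):
    """Build a getSamples URL that bypasses the load balancer and targets one node."""
    base = (
        f"https://{node_fqdn}:6443/arcgis/rest/services/"
        f"{service_name}/ImageServer/getSamples"
    )
    return f"{base}?{urlencode(params)}"


def extract_error(body_text):
    """Return the "error" object from a response body, or None if the body has none."""
    try:
        body = json.loads(body_text)
    except ValueError:
        return {"code": None, "message": "response body is not valid JSON"}
    if isinstance(body, dict):
        return body.get("error")
    return None


def http_fetch(url):
    """Fetch a URL and return the body as text (the failure above still arrives as HTTP 200)."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def probe_nodes(node_fqdns, service_name, params, fetch=http_fetch):
    """Send the same request to every node; map each node to its error object or None."""
    results = {}
    for node in node_fqdns:
        url = build_get_samples_url(node, service_name, params)
        results[node] = extract_error(fetch(url))
    return results
```

Calling `probe_nodes` with the list of backend FQDNs and one known-valid set of GetSamples parameters returns a dictionary in which healthy nodes map to `None` and the affected node maps to the error object shown above. Nodes that use self-signed certificates on port 6443 would additionally need an SSL context passed to `urlopen`.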

The root cause of the issue has been difficult to trace, but we believe it may be caused by a brief disruption or corruption of the connection to the file share server that stores the FGDB mosaic dataset. When Tech Support logged a bug for this issue, the product development team closed it as a known limitation: a service's connection to a file-based data source is not self-healing (unlike EGDB-referenced services).

Ultimately, this Idea is to improve the error response returned by the image server when this problem occurs. The current error response is both inaccurate and ambiguous. First, the error should be a 500 code rather than 400, because it's a server failure, not an error with the request parameters. Second, it's difficult for our API and applications to distinguish this error from a genuine case of invalid input parameters.
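To illustrate the ambiguity, here is a rough sketch of the kind of workaround a calling API is forced into today. It assumes the caller can re-send a known-good "canary" GetSamples request to the same node whenever a request fails. The function name, the category labels, and the status mapping are hypothetical, and this is not an Esri-provided mechanism.

```python
def classify_failure(request_error, canary_error):
    """
    Classify a GetSamples outcome using a second, known-good canary request.

    request_error: the "error" object from the caller's request, or None on success.
    canary_error:  the "error" object from the canary request, or None on success.
    """
    if request_error is None:
        return "ok"
    if canary_error is not None:
        # A request known to be valid also failed, so the service itself is suspect.
        return "server_fault"
    # The canary succeeded, so the caller's parameters are the likely problem.
    return "invalid_input"


# Hypothetical mapping to the status the calling API reports to its own clients.
STATUS_FOR = {"ok": 200, "server_fault": 500, "invalid_input": 400}
```

This doubles the traffic for every failed call and still only infers the cause, which is why an accurate error code from the server itself would be preferable.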

This Idea, an improved error response for a failed service, is essential for our organization to provide reliable applications that rely on ArcGIS Image Services, and it would benefit many other organizations in the same position.

1 Comment
BradBerry_FMG

I'd like to add that an enhancement request corresponding to this Idea has been logged via Esri Tech Support: ENH-000166201 - The getSamples operation is performed on an image service should provide accurate request response when data source coming from filegeodatabase.