GetDataAsync stops at 100 seconds

08-24-2017 01:30 PM
JoeHershman
MVP Regular Contributor

We are downloading items from Portal (zip files of replicas).  From testing, it would seem that the download runs for 100 seconds and then just moves on.  A stream is returned; however, it is corrupt because it is not the complete file stream.  No exceptions are returned to the calling method until you attempt to use the stream.

We tried adjusting the httpRuntime tag in web.config in the portal folder in IIS, thinking that might have some control:

<httpRuntime maxUrlLength="2000" maxRequestLength="2097151" executionTimeout="100000" targetFramework="4.5" />

changing it to

<httpRuntime maxUrlLength="2000" maxRequestLength="2097151" executionTimeout="200000" targetFramework="4.5" />‍‍‍‍‍

This had no impact.  It seems like there is something in the internal HttpClient call in the API that stops at 100 seconds.  We have remote users downloading large files that we expect to take more than 100 seconds, so it is pretty critical to be able to extend this timeout.

Thoughts?

mnielsen-esristaff

akajanus-esristaff

Thanks,
-Joe
4 Replies
dotMorten_esri
Esri Notable Contributor

The timeout is most likely happening at the client and not the server (100 seconds is the default response timeout). However, if data has started to trickle down, it shouldn't time out. The timeout controls how long to wait before the server starts returning a response. Are you saying the server takes a very long time before starting to return the data?
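
For reference, here's a minimal sketch (plain HttpClient, nothing Runtime-specific) showing that default and how it can be raised on an instance you own:

using System;
using System.Net.Http;

class TimeoutDemo
{
    static void Main()
    {
        var client = new HttpClient();
        // HttpClient.Timeout defaults to 100 seconds:
        Console.WriteLine(client.Timeout);          // prints 00:01:40

        // Raising it (must be set before the first request is sent):
        client.Timeout = TimeSpan.FromMinutes(10);
    }
}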

JoeHershman
MVP Regular Contributor

I see some really odd behavior using HttpClient.  I wrote an application just using the HttpClient class with tons of logging.  Based on some dotPeek decompilation, it would seem I am basically doing what the API does.

What the logging shows is that

Stream stream = await httpClient.GetStreamAsync(url);

called with the https://..../sharing/rest/content/<itemId>/data URL returns before the stream is completely downloaded.  The method returned in about 700 ms.  However, as I copy the stream to a file, it times out at this mystery 100-second mark with the same error the API throws (this only occurs on remote, slow connections; internal connections all go fast enough):

System.IO.IOException: The read operation failed, see inner exception. ---> System.Net.WebException: The request was aborted: The request was canceled.
   at System.Net.ConnectStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   at System.Net.Cache.ForwardingReadStream.Read(Byte[] buffer, Int32 offset, Int32 count)
   at System.Net.Http.HttpClientHandler.WebExceptionWrapperStream.Read(Byte[] buffer, Int32 offset, Int32 count)
   --- End of inner exception stack trace ---
   at System.Net.Http.HttpClientHandler.WebExceptionWrapperStream.Read(Byte[] buffer, Int32 offset, Int32 count)
   at System.Net.Http.DelegatingStream.Read(Byte[] buffer, Int32 offset, Int32 count)
   at System.IO.Stream.InternalCopyTo(Stream destination, Int32 bufferSize)
   at System.IO.Stream.CopyTo(Stream destination)
   at HttpDownloader.Downloader.<DownloadData>d__2.MoveNext()

The only thing that makes sense, based on what I see, is that the stream really has not downloaded completely when HttpClient.GetStreamAsync() moves on to the next line, but is actually continuing to stream data across the wire as the file is being written.  This is contrary to my understanding of how the HttpClient async methods are supposed to work.  My understanding was that the method does not return to the caller until the stream has been downloaded to the end.
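
In outline, the failing test looks like this (a reconstruction from my test app; the names match the stack trace above, but the details are paraphrased):

using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

namespace HttpDownloader
{
    class Downloader
    {
        static readonly HttpClient httpClient = new HttpClient();

        public async Task DownloadData(string url, string path)
        {
            // Returns quickly (about 700 ms in my logs): only the headers are in.
            Stream stream = await httpClient.GetStreamAsync(url);

            using (var file = File.Create(path))
            {
                // The body is still streaming across the wire here;
                // on slow connections this read is aborted at 100 seconds.
                stream.CopyTo(file);
            }
        }
    }
}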

Just for yucks, I went old school and used WebClient to do a synchronous download.  In this case everything worked fine.  One file that took over two minutes to download completed successfully.
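
Roughly like this (a minimal sketch, assuming the same item URL):

using System.Net;

class SyncDownloader
{
    public static void Download(string url, string path)
    {
        using (var client = new WebClient())
        {
            // Blocks until the whole file is on disk; no 100-second
            // client timeout on this synchronous path in my testing.
            client.DownloadFile(url, path);
        }
    }
}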

My conclusion: I have absolutely no idea.  The logging I see does not match my understanding of how async/await is supposed to work, and what happens at this magical 100-second mark, I have no clue.  Since the download actually occurs at application startup (downloading offline replicas), we may just go with the synchronous download and let the UI be frozen.

Thanks,
-Joe
dotMorten_esri
Esri Notable Contributor

When you perform a request, you can choose whether you want the call to return when the headers have been read, or when the entire response has been read. We generally do the former - otherwise very large responses would have to be buffered in memory. Instead this gives you the option to stream the file straight to disk and significantly reduce memory consumption. You can actually see that GetStreamAsync explicitly returns once the header is read right here:

https://github.com/dotnet/corefx/blob/43c08c9e233647e1928a1758afddaf3161309f7e/src/System.Net.Http/s...

The 100s timeout you're finding matches the implementation:

https://github.com/dotnet/corefx/blob/43c08c9e233647e1928a1758afddaf3161309f7e/src/System.Net.Http/s...

Also, judging from this bit, the timeout should only apply until the response headers have been read:

https://github.com/dotnet/corefx/blob/43c08c9e233647e1928a1758afddaf3161309f7e/src/System.Net.Http/s...

I tried reproducing it in a simple console app. I'm definitely seeing the timeout having an effect when using ResponseContentRead, but it doesn't time out when using ResponseHeadersRead, and happily chugs along loading the data that trickles in.

See my test console app here (I used .NET 4.5.2):

https://gist.github.com/dotMorten/f195b4aea93d812574db10d1509c0fcb

It's a simple web server that serves 10 MB of data at 1 byte/ms, so the download takes far longer than the 100-second limit.
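
The download side of that test boils down to this pattern (a sketch, not the Runtime's actual code):

using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

class StreamingDownloader
{
    static readonly HttpClient client = new HttpClient(); // default 100 s timeout

    public static async Task DownloadToFileAsync(string url, string path)
    {
        // ResponseHeadersRead: GetAsync completes as soon as the headers
        // arrive, so the timeout only covers waiting for the headers.
        using (HttpResponseMessage response = await client.GetAsync(
            url, HttpCompletionOption.ResponseHeadersRead))
        {
            response.EnsureSuccessStatusCode();
            using (Stream body = await response.Content.ReadAsStreamAsync())
            using (FileStream file = File.Create(path))
            {
                // In my test (.NET 4.5.2) this copy happily runs well past
                // 100 seconds without being aborted.
                await body.CopyToAsync(file);
            }
        }
    }
}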

JoeHershman
MVP Regular Contributor

Wow... this is incredibly useful and gives me a much greater understanding of how this all works.  I appreciate you taking the time to go into so much detail.

Thanks,
-Joe