
As the title suggests, I have a highly unusual situation that I can't explain, so as a last resort I am posting it here.

It's best I illustrate it with code right away.

I have the following situation somewhere in my IHttpHandler implementation:

var requestData = /*prepare some request data for result service...*/;

//Call the service and block while waiting for the result
MemoryStream result = libraryClient.FetchResultFromService(requestData).Result;

//write the data back in the form of the arraybuffer
HttpContext.Current.Response.BinaryWrite(CreateBinaryResponse(result));

FetchResultFromService is an asynchronous library method that uses HttpClient to retrieve the result from a remote service. A simplified version looks like this:

public async Task<MemoryStream> FetchResultFromService<T>(T request)
{
    ... //resolve url, formatter and media type

    //fire the request using HttpClient
    HttpResponseMessage response = await this.client.PostAsync(
                                               url,
                                               request,
                                               formatter,
                                               mediaType);
    if (!response.IsSuccessStatusCode)
    {
        throw new Exception("Exception occurred!");
    }

    //return as memory stream
    var result = new MemoryStream();
    await response.Content.CopyToAsync(result);
    result.Position = 0;
    return result;
}

The issue happens when FetchResultFromService takes more than 2 minutes to respond. If that occurs, IIS aborts the blocked thread (the IIS timeout is set to 120 seconds) with a ThreadAbortException. The client (browser) gets the error response and everything seems okay for the moment, but a short while later the performance of the app drops and response times skyrocket.

Looking at the logs of the "Library service", it's clear that the anomaly on the web server occurs at the moment the service finally finishes (after 2+ minutes) and returns the response. The issue usually resolves itself a minute or two after it starts.

Hopefully, the title is now clearer. When a continuation of the task executes AFTER the calling thread has finished the request, the whole application suffers.

If I modify the library code to use ConfigureAwait(false), this issue doesn't occur.

HttpResponseMessage response = await this.client.PostAsync(
                                               url,
                                               request,
                                               formatter,
                                               mediaType).ConfigureAwait(false);//this fixes the issue

So now it looks like it's related to context synchronization. The application is using the old LegacyAspNetSynchronizationContext, which locks over HttpApplication objects, but I can't understand how this can affect all threads/requests.
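To make the role of the captured context concrete, here is a minimal console sketch. It is not the ASP.NET implementation: CountingContext and Demo are hypothetical names, and the context only counts Post calls and forwards them to the thread pool, standing in for the request's synchronization context.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a request-bound context: it counts how many
// continuations are posted back to it, then runs them on the thread pool.
sealed class CountingContext : SynchronizationContext
{
    private int posts;
    public int Posts { get { return Volatile.Read(ref posts); } }

    public override void Post(SendOrPostCallback d, object state)
    {
        Interlocked.Increment(ref posts);
        ThreadPool.QueueUserWorkItem(_ => d(state));
    }
}

static class Demo
{
    // The await captures SynchronizationContext.Current and resumes through it.
    static async Task ResumeOnContext()
    {
        await Task.Delay(50);
    }

    // ConfigureAwait(false) tells the await not to resume through the captured context.
    static async Task ResumeAnywhere()
    {
        await Task.Delay(50).ConfigureAwait(false);
    }

    public static int CountPosts()
    {
        var previous = SynchronizationContext.Current;
        var context = new CountingContext();
        SynchronizationContext.SetSynchronizationContext(context);
        Task a, b;
        try
        {
            // Both methods start while the counting context is current.
            a = ResumeOnContext();
            b = ResumeAnywhere();
        }
        finally
        {
            // The caller moves on, as a finished request would.
            SynchronizationContext.SetSynchronizationContext(previous);
        }
        Task.WaitAll(a, b);
        return context.Posts;
    }

    static void Main()
    {
        Console.WriteLine("Continuations posted to the captured context: " + CountPosts());
    }
}
```

In this sketch only the continuation of ResumeOnContext goes through Post. The ConfigureAwait(false) one resumes without touching the context, which would explain why that change keeps a late continuation away from a request context that has already been torn down.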

Now, I understand that there are a lot of bad practices illustrated above. I am aware of articles like Don't Block on Async Code, and while I am grateful for all the responses I get, I am looking more for an explanation of how a single async request can bring an entire application to its knees in this situation.

More information

  • We are not talking about deadlocks caused by the blocking. The above situation executes without an issue if the service manages to return the result within the 2 minutes that IIS allows. The issue can even be reproduced if you don't block and just invoke the library method without calling the Result getter.
  • Removing the second await (the MemoryStream copying) reduces the reproduction rate by a lot, but the issue is still reproducible with multiple FetchResultFromService calls fired at the same time. ConfigureAwait(false) is currently the only reliable "solution".
  • I've even caught IIS apparently dropping ACTIVE connections. I've been using JMeter to stress test the application, and when the anomaly happens, JMeter threads (fake users) report that the connection was reset. While manually testing with Chrome I've noticed that drop as well, with an ERR_CONNECTION_RESET response. These resets happen regularly when the application is under load and the anomaly occurs, but as mentioned above, the anomaly always manifests as a performance hit for most of the active users.
  • The application is a WebForms app which implements and uses a custom synchronous IHttpHandler for most of its requests. The handler implementation is declared as IRequiresSessionState, which enforces exclusive session access. I am mentioning this since I have a hunch that this might be related to session state, although I can't confirm it. Or maybe it's related to calling async methods from sync handlers, although I would still like to know how.
  • The IIS version is 7, and the IIS logs don't show anything useful (or anything at all).

I am out of ideas for testing; hopefully someone can at least point me in some direction. I would really love to understand this. Thanks!

Smiki
  • the moment you have a sync-over-async, all bets are off; can you not just change the calling code to use `await` rather than `.Result`? – Marc Gravell Mar 28 '19 at 17:01
  • As @MarcGravell pointed out chewing up all available threads with synchronous waiting of async operation is expected behavior. To confirm you should look at "threads" view (or list threads in minidump if you can't do live debugging) to see all of them waiting and there are no free threads to handle continuation. – Alexei Levenkov Mar 28 '19 at 17:06
  • @MarcGravell Yea, I was afraid this is some undefined behaviour stuff. I am just looking into this as more of a research subject atm. I understand that `.Result` here is far from safe. – Smiki Mar 28 '19 at 17:08
  • The issue here is that the bug also occurs even if I don't call `.Result` (see the More information part). Technically, then, there is no one waiting, right? – Smiki Mar 28 '19 at 17:14

2 Answers


The semantics of using async/await with LegacyAspNetSynchronizationContext is literally undefined. In other words, there are tons of edge cases and problems here, to the extent that you can't expect anything to work at all.

In particular, the thread abort also tears down that SynchronizationContext, and then when that await resumes, it resumes on that SynchronizationContext. So, not only is it a synchronization context that doesn't work appropriately with await, but it's even in a partially disposed and/or reused state. Interfering with other request contexts is entirely within the scope of possibility.

Stephen Cleary

I unfortunately completely missed an unhandled exception that was responsible for bringing the application down and I wasn't aware of it:

Exception Info: System.NullReferenceException
   at System.Web.ThreadContext.AssociateWithCurrentThread(Boolean)
   at System.Web.HttpApplication.OnThreadEnterPrivate(Boolean)
   at System.Web.LegacyAspNetSynchronizationContext.CallCallbackPossiblyUnderLock(System.Threading.SendOrPostCallback, System.Object)
   at System.Web.LegacyAspNetSynchronizationContext.CallCallback(System.Threading.SendOrPostCallback, System.Object)
   at System.Web.LegacyAspNetSynchronizationContext.Post(System.Threading.SendOrPostCallback, System.Object)
   at System.Threading.Tasks.SynchronizationContextAwaitTaskContinuation.PostAction(System.Object)
   at System.Threading.Tasks.AwaitTaskContinuation.RunCallback(System.Threading.ContextCallback, System.Object, System.Threading.Tasks.Task ByRef)
   at System.Threading.Tasks.AwaitTaskContinuation+<>c.b__18_0(System.Object)
   at System.Threading.QueueUserWorkItemCallback.WaitCallback_Context(System.Object)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback()

The application wasn't capable of logging it, and IIS automatically restarted it, so I missed it. Essentially this became a "fire and forget" issue mentioned here, since the thread was aborted.

Hopefully this unhandled exception can't happen with the proper use of async-await and the task friendly synchronization context.
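For reference, a sketch of how the task-friendly context is usually switched on in web.config, assuming the application can target .NET Framework 4.5 or later (an assumption; the original setup may not allow it). Either setting should be enough on its own, and the handler would still need to become asynchronous (for example by deriving from HttpTaskAsyncHandler) instead of blocking on .Result.

```xml
<configuration>
  <appSettings>
    <!-- Explicit opt-in to the task-friendly AspNetSynchronizationContext -->
    <add key="aspnet:UseTaskFriendlySynchronizationContext" value="true" />
  </appSettings>
  <system.web>
    <!-- Targeting 4.5 or later enables the same behaviour by default -->
    <httpRuntime targetFramework="4.5" />
  </system.web>
</configuration>
```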

Smiki