6

I am getting intermittent deadlocks when using HttpClient to send http requests and sometimes they are never returning back to await SendAsync in my code. I was able to figure out the thread handling the request internally in HttpClient/HttpClientHandler for some reason has a SynchronizationContext during the times it is deadlocking. I would like to figure out how the thread getting used ends up with a SynchronizationContext, when normally they don't have one. I would assume that whatever object is causing this SynchronizationContext to be set is also blocking on the Thread, which is causing the deadlock.

Would I be able to see anything relevant in the TPL ETW events?

How can I troubleshoot this?



Edit 2: The place that I have been noticing these deadlocks is in a wcf ServiceContract(see code below) inside of a windows service. The SynchronizationContext that is causing an issue is actually a WindowsFormsSynchronizationContext, which I assume is caused by some control getting created and not cleaned up properly (or something similar). I realize there almost certainly shouldn't be any windows forms stuff going on inside of a windows service, and I'm not saying I agree with how it's being used. However, I didn't write any of the code using it, and I can't just trivially go change all of the references.

Edit: here is an example of the general idea of the wcf service I was having a problem with. It's a simplified version, not the exact code:

[ServiceContract]
[ServiceBehavior(InstanceContextMode = InstanceContextMode.Single, ConcurrencyMode = ConcurrencyMode.Multiple)]
internal class SampleWcfService
{
    private readonly HttpMessageInvoker _invoker;

    public SampleWcfService(HttpMessageInvoker invoker)
    {
        _invoker = invoker;
    }
    
    [WebGet(UriTemplate = "*")]
    [OperationContract(AsyncPattern = true)]
    public async Task<Message> GetAsync()
    {
        var context = WebOperationContext.Current;
        using (var request = CreateNewRequestFromContext(context))
        {
            var response = await _invoker.SendAsync(request, CancellationToken.None).ConfigureAwait(false);
            var stream = response.Content != null ? await response.Content.ReadAsStreamAsync().ConfigureAwait(false) : null;
            return StreamMessageHelper.CreateMessage(MessageVersion.None, "GETRESPONSE", stream ?? new MemoryStream());
        }
    }
}

Adding ConfigureAwait(false) to the 2 places above didn't completely fix my problem because a threadpool thread used to service a wcf request coming into here may already have a SynchronizationContext. In that case the request makes it all the way through this whole GetAsync method and returns. However, it still ends up deadlocked in System.ServiceModel.Dispatcher.TaskMethodInvoker, because in that microsoft code, it doesn't use ConfigureAwait(false) and I want to assume there is a good reason for that (for reference):

var returnValueTask = returnValue as Task;

if (returnValueTask != null)
{
    // Only return once the task has completed                        
    await returnValueTask;
}

It feels really wrong, but would converting this to using APM (Begin/End) instead of using Tasks fix this? Or, is the only fix to just correct the code that is not cleaning up its SynchronizationContext properly?

Community
  • 1
  • 1
CShark
  • 1,562
  • 1
  • 16
  • 27
  • Do you have any threads? How come you posted no code? – CodingYoshi Mar 16 '18 at 00:17
  • @CodingYoshi There's not much I can add that I haven't already said in the question. I was able to tell that the `SynchronizationContext` was set in the problem thread based on a process dump, but the thread doesn't have a managed stacktrace, only unmanaged stuff. – CShark Mar 16 '18 at 00:25
  • The Theory that something is initiating a SynchronizationContext and then waiting on it finishing (wich never happens) is solid. But without you showing us the code, there is only so much we can tell you. – Christopher Mar 16 '18 at 00:38
  • Are you awaiting all your Tasks all the way up or blocking at some point? If the latter, either change stop blocking and await or use ConfigureAwait(false) on your Tasks: https://blog.stephencleary.com/2012/07/dont-block-on-async-code.html – ejohnson Mar 16 '18 at 04:14
  • @ejohnson i was able to get it to stop deadlocking by using ConfigureAwait(false). But I'm still trying to figure out how the `SynchronizationContext` is getting set in the first place, and then i'll decide based on that whether it makes sense to use the `ConfigureAwait(false)`. Also I wanna make sure adding the `ConfigureAwait(false)` won't add any unintended side effects, even though it appears to help this case. – CShark Mar 16 '18 at 17:56
  • What kind of an application is this? As I mentioned, I think somewhere upstream of your `HttpClient` code you are not awaiting a task, which until you added `ConfigureAwait(false)` was causing the deadlock when the async resumption attempted to grab the same context that was already held by the upstream thread. Here's some more reading: https://medium.com/bynder-tech/c-why-you-should-use-configureawait-false-in-your-library-code-d7837dce3d7f – ejohnson Mar 16 '18 at 18:11
  • @ejohnson It's a windows service. And I kind of figured what you said might be the case. And that's why in my question I mentioned ETW events, because there is a lot of code, and it's not easy to figure out where in the code this `SynchronizationContext` is being set at runtime. So I was hoping there was some way I could capture more info from the process to see the callstack before it's getting created, or something of that nature. – CShark Mar 16 '18 at 23:01
  • How do you call the SampleWcfService.GetAsync method? Do you call Task.Run or anything similar using this GetAsync method? – Grantly May 16 '18 at 23:30
  • @Grantly I don't specifically call it anywhere, it gets called internally by .Net code in `System.ServiceModel.Dispatcher.TaskMethodInvoker`. The code I posted isn't where the method gets called, [but it's right above that](https://github.com/Microsoft/referencesource/blob/master/System.ServiceModel/System/ServiceModel/Dispatcher/TaskMethodInvoker.cs#L216). And then in the code I posted it `await`s the `Task` that gets returned from invoking `GetAsync`. – CShark May 17 '18 at 02:37
  • Also, how do I know what to fix in this question from downvotes w/out comments or suggested edits? – CShark May 17 '18 at 02:59
  • Perhaps you're hitting the max of 2 concurrent connections per host limit. See https://stackoverflow.com/questions/4092206/limit-of-concurrent-httpwebrequests and https://stackoverflow.com/questions/10403944/does-httpwebrequests-limit-of-2-connections-per-host-apply-to-httpclient – David Moore May 17 '18 at 03:39
  • I'm manually setting that value to higher than 2 and there were no warnings in my wcf traces related to hitting the max connections limit. So, I don't think that is related, but thank you for the suggestion. Even if it was lower than it should be, I'm pretty sure that wouldn't cause it to deadlock like this. I would think it would just make each request take longer to return back to the client. – CShark May 17 '18 at 06:54

2 Answers2

4

Update: we now know we're dealing with a WindowsFormsSynchronizationContext (see comments), for whatever reason in a WCF application. It's no surprise then to see deadlocks since the point of that SyncContext is to run all continuations on the same thread.

You could try to to set WindowsFormsSynchronizationContext.AutoInstall to false. According to its docs, what it does is:

Gets or sets a value indicating whether the WindowsFormsSynchronizationContext is installed when a control is created

Assuming someone creates a WindowsForms control somewhere in your app, then that might be your issue and would potentially be solved by disabling this setting.

An alternative to get rid of an existing SynchronizationContext would be to just overwrite it with null, and later restoring it (if you're nice). This article describes this approach and provides a convenient SynchronizationContextRemover implementation you could use.

However, this probably won't work if the SyncContext is created by some library methods you use. I'm not aware of a way to prevent a SyncContext from being overwritten, so setting a dummy context won't help either.


Are you sure the SynchronizationContext is actually at fault here?

From this MSDN magazine article:

Default (ThreadPool) SynchronizationContext (mscorlib.dll: System.Threading)
The default SynchronizationContext is a default-constructed SynchronizationContext object. By convention, if a thread’s current SynchronizationContext is null, then it implicitly has a default SynchronizationContext.

The default SynchronizationContext queues its asynchronous delegates to the ThreadPool but executes its synchronous delegates directly on the calling thread. Therefore, its context covers all ThreadPool threads as well as any thread that calls Send. The context “borrows” threads that call Send, bringing them into its context until the delegate completes. In this sense, the default context may include any thread in the process.

The default SynchronizationContext is applied to ThreadPool threads unless the code is hosted by ASP.NET. The default SynchronizationContext is also implicitly applied to explicit child threads (instances of the Thread class) unless the child thread sets its own SynchronizationContext.

If the SynchronizationContext you are seeing is the default one, it should be fine (or rather, you will have a very hard time to avoid it being used).

Can't you provide more details / code about what's involved?

One thing that looks immediately suspicious to me in your code (though it may be completely fine) is that you have a using block that captures a static WebOperationContext.Current in request, which will both be captured by the generated async state machine. Again, might be fine, but there's a lot of potential for deadlocks here if something waits on WebOperationContext

enzi
  • 4,057
  • 3
  • 35
  • 53
  • As far as the `using` in relation to be `WebOperationContext`, all it's doing is copying some stuff from that to a newly created `HttpRequestMessage`, I don't think that should cause a problem. – CShark May 11 '18 at 17:01
  • Also, I believe I have good reason for thinking that the `SynchronizationContext` is the problem. based on tracing I added, normally threads come through there with a null `SynchronizationContext`, and only the ones that have `SynchronizationContext.Current` not `null` are hanging at the place I mentioned in my question: `await returnValueTask;` – CShark May 11 '18 at 17:03
  • And I'm not saying that there isn't a bug in our code somewhere that is causing this `SynchronizationContext` to come up intermittently, because there probably is. I'm trying to figure out if I can adjust this wcf service and adjacent code to workaround the issue (or figure out if there is something inherently wrong with the implementation), since none of this code I posted is actually using the thread's `SynchronizationContext` anyways. – CShark May 11 '18 at 17:35
  • What do you mean by "waits on `WebOperationContext`"? – CShark May 11 '18 at 23:58
  • All I'm saying is try to figure out if the added `SynchronizationContext` is the default one (it's just called "SynchronizationContext", not e.g. `AspNetSynchronizationContext`). As mentioned in the article, this context can be set by various things if none is already present. Which means it will be hard to track down where it is introduced and also hard to avoid it being added. – enzi May 13 '18 at 11:30
  • My final thought on `WebOperationContext.Current` is just that it is static and thus likely shared by other operations, as well as being captured by the async state machine. This is not a problem in itself, but if there are critical resources in it that can cause deadlocks or race conditions e.g. by locking on them, then it can be difficult to track down where exactly that happens. Again, this is just guesswork because we can't see the full picture. I'm merely pointing out that there are other areas that can cause your problem. – enzi May 13 '18 at 11:38
  • It's actually a `WindowsFormsSynchronizationContext`. – CShark May 14 '18 at 00:05
  • In other words, it's not the default one. I have process/heap dumps to prove this and some other tracing info, I just don't think I necessarily want to directly post the files here, because it may have sensitive data. But, I can take screenshots. – CShark May 14 '18 at 18:37
  • @CShark how does a WinForms SyncContext get in there? I updated my answer with some suggestions. – enzi May 15 '18 at 07:11
  • Yeah, that's a good question. I can see the places where it is being referenced with `using System.Windows.Forms`, but this is a big codebase, and it's non-trivial to try to figure out which of those places might be creating a `WindowsFormsSynchronizationContext`, and more importantly when. And that's the struggle of this whole issue. I don't think that windows forms should be referenced anywhere in the codebase for this service, but that's a whole other problem that I don't want to get into. – CShark May 15 '18 at 15:55
  • TL;DR The main issue is the winforms assembly is referenced in a lot of files. – CShark May 15 '18 at 16:07
  • Have you tried turning off that [AutoInstall](https://learn.microsoft.com/en-ie/dotnet/api/system.windows.forms.windowsformssynchronizationcontext.autoinstall?view=netframework-4.7.2) property? According to its docs, simply creating a control is enough to create a WinForms SyncContext, which the property prevents. – enzi May 15 '18 at 18:57
  • Thank you for the suggestion. I have not tried turning it off yet, but I will when I get a chance. – CShark May 15 '18 at 21:09
  • I'm also trying to figure out if that would cause any other side effects – CShark May 16 '18 at 00:51
  • It appears setting `WindowsFormsSynchronizationContext.AutoInstall` to `false` seems to work in the testing I've done. I'm a little hesitant though, because I'm not sure what effect it might have on whatever windows forms stuff is happening in the background at runtime. And this issue is not the easiest to reproduce. – CShark May 17 '18 at 02:48
2

Try below; I have found success in similar cases getting into the async rabbit hole.

var responsebytes = await response.Content.ReadAsByteArrayAsync();
MemoryStream stream = new MemoryStream(filebytes);

Response the stream variable.

Hope it helps.

Jonny Boy
  • 192
  • 1
  • 1
  • 10
  • I'll try it when I get a chance – CShark May 10 '18 at 21:52
  • 1
    Do you mean `new MemoryStream(responsebytes)`? – CShark May 11 '18 at 01:14
  • This doesn't change anything; I mean `ReadAsStreamAsync` is essentially doing the same thing internally, with some buffering added. – CShark May 11 '18 at 17:07
  • @CShark Yes sorry, new MemoryStream(responsebytes). I miss-typed. In your code sample you are using two awaits, which is right but try: var stream = response.Content != null ? await response.Content.ReadAsStreamAsync().Result : null; You will still have one await and which should be sufficient for the async method. If this does not work - I am done and sorry. Thank you for trying though. – Jonny Boy May 11 '18 at 17:20
  • First off, this doesn't compile: `var stream = response.Content != null ? await response.Content.ReadAsStreamAsync().Result : null;`. You are trying to `await` the Result in this code, instead of a `Task`. Also, making it synchronous won't even avoid the deadlock that I mentioned, and could even introduce more problems. – CShark May 17 '18 at 02:40