4

I have a very strange problem. My WebClient.DownloadDataCompleted doesn't fire most of the time.

I am using this class:

public class ParallelFilesDownloader
{
    public Task DownloadFilesAsync(IEnumerable<Tuple<Uri, Stream>> files, CancellationToken cancellationToken)
    {
        var localFiles = files.ToArray();
        var tcs = new TaskCompletionSource<object>();
        var clients = new List<WebClient>();

        cancellationToken.Register(
            () =>
            {
                // Break point here
                foreach (var wc in clients.Where(x => x != null))
                    wc.CancelAsync();
            });

        var syncRoot = new object();
        var count = 0;
        foreach (var file in localFiles)
        {
            var client = new WebClient();

            client.DownloadDataCompleted += (s, args) =>
            {
                // Break point here
                if (args.Cancelled)
                    tcs.TrySetCanceled();
                else if (args.Error != null)
                    tcs.TrySetException(args.Error);
                else
                {
                    var stream = (Stream)args.UserState;
                    stream.Write(args.Result, 0, args.Result.Length);
                    lock (syncRoot)
                    {
                        count++;
                        if (count == localFiles.Length)
                            tcs.TrySetResult(null);
                    }
                }
            };
            clients.Add(client);

            client.DownloadDataAsync(file.Item1, file.Item2);
        }

        return tcs.Task;
    }
}

And when I am calling DownloadFilesAsync in LINQPad in isolation, DownloadDataCompleted is called after half a second or so, as expected.

However, in my real application, it simply doesn't fire and the code that waits for it to finish simply is stuck. I have two break points as indicated by the comments. None of them is hit.
Ah, but sometimes, it does fire. Same URL, same code, just a new debug session. No pattern at all.

I checked the available threads from the thread pool: workerThreads > 30k, completionPortThreads = 999.

I added a sleep of 10 seconds before the return and checked after the sleep that my web clients haven't been garbage collected and that my event handler is still attached.

Now, I ran out of ideas to trouble shoot this.
What else could cause this strange behaviour?

Daniel Hilgarth
  • 171,043
  • 40
  • 335
  • 443
  • I was going to say that with the code above, `clients` will go out of scope and likely be hit by the GC. However you pointed out that when you sleep in the method, `clients` is obviously still around. If you add the sleep, does the download work though? Or you still get the same behaviour? Also how long would you expect all the downloads to take? – steve cook Sep 01 '14 at 14:22
  • do you get the same behavior when you start your project in VS without the debugger attached? – Eren Ersönmez Sep 01 '14 at 14:23
  • @steve: Adding the sleep doesn't change the behavior, the download still doesn't work. The download takes less than a second and the URL is for localhost, so connection problems are out of the question, too. And `clients` doesn't go out of scope because of the action passed to `cancellationToken.Register`. – Daniel Hilgarth Sep 01 '14 at 14:24
  • @ErenErsönmez: Yes. That's how I noticed that something was amiss - the application simply hung. – Daniel Hilgarth Sep 01 '14 at 14:24
  • Hoping you don't call as `DownloadFilesAsync(......).Wait()` – L.B Sep 01 '14 at 14:29
  • @L.B Somewhere later, there is a `Task.WaitAll` which waits for this and other tasks. However, (1) I fail to see why this would affect the async download - please elaborate - and (2) the problem doesn't disappear, when I add the sleep and as such, `Task.WaitAll` will not be called. – Daniel Hilgarth Sep 01 '14 at 14:32
  • @DanielHilgarth Try to call it with await (`await Task.WhenAll`) without blocking the calling thread (I suspect from a deadlock). – L.B Sep 01 '14 at 14:33
  • Did you try moving this to a small complete example? Something else in your code is doing someting wacky? Like you said, if the code above works fine in another environment it could point to something elsewhere.. – steve cook Sep 01 '14 at 14:33
  • @steve: No I didn't as I didn't had the time yet that would be necessary to do this. – Daniel Hilgarth Sep 01 '14 at 14:35
  • @DanielHilgarth What kind of application is your production app? – Yuval Itzchakov Sep 01 '14 at 14:40
  • @DanielHilgarth A small sample showing a blocking call of an async method can cause a deadlock `var html = new WebClient().DownloadStringTaskAsync("http://stackoverflow.com").Result;` (Winforms app) – L.B Sep 01 '14 at 14:40
  • @L.B: Thanks for this info. I wasn't aware of this, but I am unsure if that is actually the problem here: I am not using `DownloadStringTaskAsync` or `DownloadDataTaskAsync`. I am using the variant that doesn't return a task. The task that I return from my method is TaskCompletionSource task that is not associated with the WebClient. – Daniel Hilgarth Sep 01 '14 at 14:58
  • 1
    @DanielHilgarth if `DownloadDataCompleted` runs in calling thread's sync-context then you have the same problem. – L.B Sep 01 '14 at 14:59
  • @L.B ... yes, that might very well be the case. Let me check that. – Daniel Hilgarth Sep 01 '14 at 15:01
  • @DanielHilgarth I tested it and can reproduce your problem. It will not be an answer but can post it if you want. – L.B Sep 01 '14 at 15:02
  • @L.B: So why can't I reproduce it in LINQPad? I even have the Task.WaitAll in there. – Daniel Hilgarth Sep 01 '14 at 15:04
  • @DanielHilgarth Most possible you will also not be able to reproduce it in a console application. You need a Sync-Context like in winforms app. – L.B Sep 01 '14 at 15:04
  • @L.B: OK, that makes sense. So, what solution do you propose for .NET 4? no await, no `Task.WhenAll`. Should I emulate Task.WhenAll with a thread I start myself? I tried using Task.Factory.ContinueWhenAll but this didn't solve the problem. I verified that there is no Task.WaitAll left in my code. – Daniel Hilgarth Sep 01 '14 at 15:11
  • @DanielHilgarth Not use Async, Create a task and use sync methods :( – L.B Sep 01 '14 at 15:21
  • @L.B Not sure what you mean. The problem is that `DownloadDataCompleted` isn't called. `ContinueWith` on what? Please note, I am *not* using `DownloadDataTaskAsync`. I am using `DownloadDataAsync`. It returns `void`, not a `Task`. – Daniel Hilgarth Sep 01 '14 at 15:24
  • @L.B: Yeah, right now, my workaround is a thread that uses the sync methods :/ – Daniel Hilgarth Sep 01 '14 at 15:24

2 Answers2

1

From the comments:

Somewhere later, there is a Task.WaitAll which waits for this and other tasks. However, (1) I fail to see why this would affect the async download - please elaborate - and (2) the problem doesn't disappear, when I add the sleep and as such, Task.WaitAll will not be called

It seems that you have a deadlock caused by Task.WaitAll. I can explain it throughly here:

When you await an async method that returns a Task or a Task<T>, there is an implicit capture of the SynchronizationContext by the TaskAwaitable being generated by the Task.GetAwaiter method.

Once that sync context is in place and the async method call completes, the TaskAwaitable attempts to marshal the continuation (which is basically the rest of the method calls after the first await keyword) onto the SynchronizationContext (using SynchronizationContext.Post) which was previously captured. If the calling thread is blocked, waiting on that same method to finish, you have a deadlock.

When you call Task.WaitAll, you block until all Tasks are complete, this will make marshaling back to the original context impossible, and basically deadlock.

Instead of using Task.WaitAll, use await Task.WhenAll.

Community
  • 1
  • 1
Yuval Itzchakov
  • 146,575
  • 32
  • 257
  • 321
  • 1
    the OP notes that even `WebClient.DownloadDataCompleted` doesn't fire, which runs on the thread pool and no `await` involved. – Eren Ersönmez Sep 01 '14 at 14:55
  • @ErenErsönmez Exactly. I am not using the methods from WebClient that return a task! – Daniel Hilgarth Sep 01 '14 at 14:59
  • But you are awaiting on the `Task` returned from the `tcs` – Yuval Itzchakov Sep 01 '14 at 16:16
  • @DanielHilgarth I know you're not using the overload that returns a `Task`. But your `OnCompleted` event is probably triggering on the same synchronizationcontext and causing that deadlock. You said you're on .NET 4.0, i suggest you install [Microsoft.Bcl.Async](https://www.nuget.org/packages/Microsoft.Bcl.Async) to get the `async-await` feature. – Yuval Itzchakov Sep 01 '14 at 18:08
1

Based on the comments, not an ideal answer but you can temporarily change the sync context before and after the foreach:

var syncContext = SynchronizationContext.Current;
SynchronizationContext.SetSynchronizationContext(null);

foreach (var file in localFiles)
{
    ...
}

SynchronizationContext.SetSynchronizationContext(syncContext);
Eren Ersönmez
  • 38,383
  • 7
  • 71
  • 92