4

I have a question regarding Task.WaitAll. At first I tried to use async/await to get something like this:

private async Task ReadImagesAsync(string HTMLtag)
{
    await Task.Run(() =>
    {
        ReadImages(HTMLtag);
    });
}

Content of this function doesn't matter, it works synchronously and is completely independent from outside world.

I use it like this:

private void Execute()
{
    string tags = ConfigurationManager.AppSettings["HTMLTags"];

    var cursor = Mouse.OverrideCursor;
    Mouse.OverrideCursor = System.Windows.Input.Cursors.Wait;
    List<Task> tasks = new List<Task>();
    foreach (string tag in tags.Split(';'))
    {
         tasks.Add(ReadImagesAsync(tag));
         //tasks.Add(Task.Run(() => ReadImages(tag)));
    }

    Task.WaitAll(tasks.ToArray());
    Mouse.OverrideCursor = cursor;
}

Unfortunately I get deadlock on Task.WaitAll if I use it that way (with async/await). My functions do their jobs (so they are executed properly), but Task.WaitAll just stays here forever because apparently ReadImagesAsync doesn't return to the caller.

The commented line is approach that actually works correctly. If I comment the tasks.Add(ReadImagesAsync(tag)); line and use tasks.Add(Task.Run(() => ReadImages(tag))); - everything works well.

What am I missing here?

ReadImages method looks like that:

private void ReadImages (string HTMLtag)
{
    string section = HTMLtag.Split(':')[0];
    string tag = HTMLtag.Split(':')[1];

    List<string> UsedAdresses = new List<string>();
    var webClient = new WebClient();
    string page = webClient.DownloadString(Link);

    var siteParsed = Link.Split('/');

    string site = $"{siteParsed[0]} + // + {siteParsed[1]} + {siteParsed[2]}";

    int.TryParse(MinHeight, out int minHeight);
    int.TryParse(MinWidth, out int minWidth);

    int index = 0;

    while (index < page.Length)
    {
        int startSection = page.IndexOf("<" + section, index);
        if (startSection < 0)
            break;

        int endSection = page.IndexOf(">", startSection) + 1;
        index = endSection;

        string imgSection = page.Substring(startSection, endSection - startSection);

        int imgLinkStart = imgSection.IndexOf(tag + "=\"") + tag.Length + 2;
        if (imgLinkStart < 0 || imgLinkStart > imgSection.Length)
            continue;

        int imgLinkEnd = imgSection.IndexOf("\"", imgLinkStart);
        if (imgLinkEnd < 0)
            continue;

        string imgAdress = imgSection.Substring(imgLinkStart, imgLinkEnd - imgLinkStart);

        string format = null;
        foreach (var imgFormat in ConfigurationManager.AppSettings["ImgFormats"].Split(';'))
        {
            if (imgAdress.IndexOf(imgFormat) > 0)
            {
                format = imgFormat;
                break;
            }
        }

        // not an image
        if (format == null)
            continue;

        // some internal resource, but we can try to get it anyways
        if (!imgAdress.StartsWith("http"))
            imgAdress = site + imgAdress;

        string imgName = imgAdress.Split('/').Last();

        if (!UsedAdresses.Contains(imgAdress))
        {
            try
            {
                Bitmap pic = new Bitmap(webClient.OpenRead(imgAdress));
                if (pic.Width > minHeight && pic.Height > minWidth)
                    webClient.DownloadFile(imgAdress, SaveAdress + "\\" + imgName);
            }
            catch { }
            finally
            {
                UsedAdresses.Add(imgAdress);
            }
        }

    }
}
Khaine
  • 295
  • 5
  • 18
  • It's hard to debug this without seeing ReadImages(), but consider a step-through debug to get an overview of what's happening. Visual Studio does this pretty well even with multi-threading. – Viggo Lundén Aug 11 '18 at 16:09
  • you probably shouldn't block on asynchronous code. See [here](https://blog.stephencleary.com/2012/07/dont-block-on-async-code.html) for more info. What would happen if you change `Task.WaitAll` to `await Task.WhenAll` instead? – default Aug 11 '18 at 16:10
  • 1
    Sidenote: [dont wrap synchronous methods](https://blogs.msdn.microsoft.com/pfxteam/2012/03/24/should-i-expose-asynchronous-wrappers-for-synchronous-methods/) – default Aug 11 '18 at 16:10
  • Is ReadImage perhaps throwing an exception somewhere? – default Aug 11 '18 at 16:12
  • "It's hard to debug this without seeing ReadImages()" These functions work and do all of their jobs, because I can see effects on my disk (it downloads images from the site). So synchronous ReadImages() is certainly not a problem. BUT it is a void function, maybe that's the problem? "await Task.WhenAll" It wouldn't be the same, because this Execute function is not supposed to be async. And I would have to make it async, so I can use "await" keyword. – Khaine Aug 11 '18 at 16:15
  • 1
    This might also be related: https://stackoverflow.com/questions/38642798/mixing-async-await-with-result – default Aug 11 '18 at 16:27
  • If `ReadImages` is doing I/O work, why isn't it asynchronous to begin with? async and await are mostly for I/O work – Camilo Terevinto Aug 11 '18 at 16:35
  • 1
    You should also add what framework this is running on (ASP.NET, WinForms, WPF, Console application...) so we know what synchronization context (if any) is used – Camilo Terevinto Aug 11 '18 at 16:36
  • I use WPF. I also checked and this method works well too: `Parallel.ForEach(parsedTags, currentTag => ReadImages(currentTag));` Also making ReadImages fully async, even in primitive way works too. By primitive way I mean just making it async and `await Task.CompletedTask;` at the end. – Khaine Aug 11 '18 at 16:41
  • Can you [edit] your question and include `ReadImages`? We can help you make it actually asynchronous so you don't need a bad wrapper – Camilo Terevinto Aug 11 '18 at 16:51
  • I added ReadImages, but I don't think it's related to the problem in any way other than being synchronous method. After making it primitive async method (whole body is still synchronous, but it has `await Task.CompletedTask;` at the end) it started to work. So wrapping this method with `Task.Run` was probably source of the problem. – Khaine Aug 11 '18 at 17:08

4 Answers4

7

You are synchronously waiting for tasks to finish. This is not gonna work for WPF without a little bit of ConfigureAwait(false) magic. Here is a better solution:

private async Task Execute()
{
    string tags = ConfigurationManager.AppSettings["HTMLTags"];

    var cursor = Mouse.OverrideCursor;
    Mouse.OverrideCursor = System.Windows.Input.Cursors.Wait;
    List<Task> tasks = new List<Task>();
    foreach (string tag in tags.Split(';'))
    {
         tasks.Add(ReadImagesAsync(tag));
         //tasks.Add(Task.Run(() => ReadImages(tag)));
    }

    await Task.WhenAll(tasks.ToArray());
    Mouse.OverrideCursor = cursor;
}

If this is WPF, then I'm sure you would call it when some kind of event happens. The way you should call this method is from event handler, e.g.:

private async void OnWindowOpened(object sender, EventArgs args)
{
    await Execute();
}

Looking at the edited version of your question I can see that in fact you can make it all very nice and pretty by using async version of DownloadStringAsync:

private async Task ReadImages (string HTMLtag)
{
    string section = HTMLtag.Split(':')[0];
    string tag = HTMLtag.Split(':')[1];

    List<string> UsedAdresses = new List<string>();
    var webClient = new WebClient();
    string page = await webClient.DownloadStringAsync(Link);

    //...
}

Now, what's the deal with tasks.Add(Task.Run(() => ReadImages(tag)));?

This requires knowledge of SynchronizationContext. When you create a task, you copy the state of thread that scheduled the task, so you can come back to it when you are finished with await. When you call method without Task.Run, you say "I want to come back to UI thread". This is not possible, because UI thread is already waiting for the task and so they are both waiting for themselves. When you add another task to the mix, you are saying: "UI thread must schedule an 'outer' task that will schedule another, 'inner' task, that I will come back to."

FCin
  • 3,804
  • 4
  • 20
  • 49
  • `tasks.Add(ReadImagesAsync(tag));` Yes, with asynchronous method it works like that. Even with WaitAll. I didn't have to make Execute async. – Khaine Aug 11 '18 at 17:02
  • @Khaine I updated my answer, so you can see how you can utilize async to make it asynchronous all the way. This is prefered way of writing async code "all the way". – FCin Aug 11 '18 at 17:04
  • There is also async version of `WebClient.DownloadFile` But `DownloadStringAsync` is void. It says I cannot await for it. Same thing with `WebClient.DownloadFileAsync` That's why I left them in that state. Also this: `tasks.Add(Task.Run(() => ReadImages(tag)));` together with WaitAll actually works. It was only problematic with `Task.Run` wrapping. – Khaine Aug 11 '18 at 17:12
  • @Khaine You are right. Sorry, I didn't check the method declaration. Yes, adding `Task.Run` and keeping `Task.WaitAll` will work, but it is not a good way to write async code. – FCin Aug 11 '18 at 17:19
1

Use WhenAll instead of WaitAll, Turn your Execute into async Task and await the task returned by Task.WhenAll.

This way it never blocks on an asynchronous code.

Bizhan
  • 16,157
  • 9
  • 63
  • 101
V0ldek
  • 9,623
  • 1
  • 26
  • 57
1

I found some more detailed articles explaining why actually deadlock happened here:

https://medium.com/bynder-tech/c-why-you-should-use-configureawait-false-in-your-library-code-d7837dce3d7f

https://blog.stephencleary.com/2012/07/dont-block-on-async-code.html

Short answer would be making a small change in my async method so it looks like that:

private async Task ReadImagesAsync(string HTMLtag)
{
    await Task.Run(() =>
    {
        ReadImages(HTMLtag);
    }).ConfigureAwait(false);
}

Yup. That's it. Suddenly it doesn't deadlock. But these two articles + @FCin response explain WHY it actually happened.

Khaine
  • 295
  • 5
  • 18
-1

It's like you are saying i do not care when ReadImagesAsync() finishes but you have to wait for it .... Here is a definition

The Task.WaitAll blocks the current thread until all other tasks have completed execution.

The Task.WhenAll method is used to create a task that will complete if and only if all the other tasks are complete.

So, if you are using Task.WhenAll you would get a task object that isn't complete. However, it will not block and would allow the program to execute. On the contrary, the Task.WaitAll method call actually blocks and waits for all other tasks to complete.

Essentially, Task.WhenAll would provide you a task that isn't complete but you can use ContinueWith as soon as the specified tasks have completed their execution. Note that neither the Task.WhenAll method nor the Task.WaitAll would run the tasks, i.e., no tasks are started by any of these methods.

Task.WhenAll(taskList).ContinueWith(t => {

  // write your code here

});
Community
  • 1
  • 1
Sadbrute
  • 30
  • 7
  • referencing an external link is frowned upon in SO. If the link has useful information you can write here, otherwise you can leave it as a comment. – Bizhan Aug 11 '18 at 16:27