I have a listOfFilesToDownload, and I want to download all the files in the list in parallel.

.........

Parallel.ForEach(listOfFilesToDownload, (file) =>
{
    SaveFile(file, myModel);
});

private static void SaveFile(string file, MyType myModel)
{
    string filePath = "...";
    try
    {
        using (WebClient webClient = new WebClient())
        {
            webClient.DownloadFileTaskAsync(file, filePath);
        }
        // some time-consuming processing of the downloaded file
    }
    catch (Exception ex)
    {   
    }
}

In the SaveFile method I download the file, then I want to wait until the download has finished, process the file, and wait until that processing is finished. Each iteration has to be: download the file, then process it. So, the questions are:

  1. How do I wait until the file is downloaded in the best way, so that nothing is blocked and performance is maximal? (If I just used DownloadFile, it would block the thread until the download finished, which I don't think is good.)
  2. How do I ensure that the file is fully downloaded before I start processing it? (If I start processing a file that doesn't exist or isn't fully downloaded, I will get an error or wrong data.)
  3. How can I be sure that the processing of the file has finished? (I tried using the webClient.DownloadFileCompleted event and processing the file there, but I couldn't ensure that the processing had finished; see the example below.)

In short, the question is how to wait for a file to download asynchronously AND wait until it has been processed.

            using (WebClient webClient = new WebClient())
            {
                webClient.DownloadFileCompleted += DownloadFileCompleted(filePath, myModel);
                webClient.DownloadFileTaskAsync(file, filePath);
            }

DownloadFileCompleted returns AsyncCompletedEventHandler:

public static AsyncCompletedEventHandler DownloadFileCompleted(string filePath, MyType myModel)
{
    Action<object, AsyncCompletedEventArgs> action = (sender, e) =>
    {
        if (e.Error != null)
            return;
        // some time-consuming processing of the downloaded file
    };
    return new AsyncCompletedEventHandler(action);
}

Many thanks in advance!

Julissa DC

2 Answers

Have you considered Task.WhenAll? Something like:

var tasks = listOfFilesToDownload
    .AsParallel()
    .Select(f => SaveFile(f, myModel))
    .ToList();
await Task.WhenAll(tasks);

private static async Task SaveFile(string file, MyType myModel)
{
    string filePath = "...";
    using (WebClient webClient = new WebClient())
    {
        // Execution resumes here only after the download has completed.
        await webClient.DownloadFileTaskAsync(file, filePath);
        // process downloaded file
    }
}

The .AsParallel() call is helpful if you have CPU-bound work you're doing after downloading the file. Otherwise you're better off without it.
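To see the ordering guarantee without touching the network, here is a minimal sketch of the same pattern. FakeDownloadAsync, DownloadDemo, SaveFileAsync, RunAsync and the log queue are hypothetical names used only for illustration; the delay stands in for webClient.DownloadFileTaskAsync and the empty Task.Run body stands in for the real processing.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class DownloadDemo
{
    // Hypothetical stand-in for webClient.DownloadFileTaskAsync (no network access).
    private static Task FakeDownloadAsync(string file) => Task.Delay(50);

    public static async Task SaveFileAsync(string file, ConcurrentQueue<string> log)
    {
        // The method resumes past this line only once the "download" task has completed.
        await FakeDownloadAsync(file);
        log.Enqueue("downloaded " + file);

        // Placeholder for CPU-bound processing, pushed to the thread pool.
        await Task.Run(() => { });
        log.Enqueue("processed " + file);
    }

    public static async Task<List<string>> RunAsync(IEnumerable<string> files)
    {
        var log = new ConcurrentQueue<string>();

        // Start every download; WhenAll completes after each file is downloaded AND processed.
        await Task.WhenAll(files.Select(f => SaveFileAsync(f, log)));
        return log.ToList();
    }
}
```

Calling `await DownloadDemo.RunAsync(new[] { "a", "b", "c" })` should return six log entries, with each file's "downloaded" entry ahead of its "processed" entry. The order across different files is not deterministic.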

Lee Richardson
  • If you want to use `await` you're missing an `async` keyword on the method – Jamiec Feb 17 '21 at 14:41
  • Thanks a lot for the answer, it seems to work just fine! However, i just want to clarify, do you mean that if I use GPU bound processing I would better avoid .AsParallel()? – Viktor Husiev Feb 17 '21 at 14:52
  • No, if you do GPU or CPU operations after downloading then the AsParallel should help with performance. Just saying that it adds overhead if all you're doing is downloading and then saving to disk or inserting to a database or something. – Lee Richardson Feb 17 '21 at 15:06

As stated in this answer, the whole idea behind Parallel.ForEach() is that you have a set of threads and each thread processes part of the collection, so you can't await the saving part to finish. What you could do is use Dataflow instead of Parallel.ForEach, since it supports asynchronous Tasks well, or start the tasks yourself and await them all.

Like this:

// Assumes SaveFile is an async method that returns a Task.
var downloadTasks = listOfFilesToDownload.Select(file => SaveFile(file, myModel));

await Task.WhenAll(downloadTasks);

This awaits until all the files are saved.
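For the Dataflow option mentioned above, a rough sketch could look like the following. It assumes the System.Threading.Tasks.Dataflow library is available (on .NET Framework it comes from a NuGet package of the same name). DataflowDemo, ProcessAllAsync, saveFileAsync and maxParallel are hypothetical names for illustration, and saveFileAsync is expected to be an async method that downloads and then processes one file.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class DataflowDemo
{
    public static async Task ProcessAllAsync(
        IEnumerable<string> files,
        Func<string, Task> saveFileAsync,   // hypothetical: downloads and processes one file
        int maxParallel)
    {
        // The block runs up to maxParallel invocations of saveFileAsync at the same time.
        var block = new ActionBlock<string>(
            saveFileAsync,
            new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = maxParallel });

        foreach (var file in files)
        {
            block.Post(file);   // queue the file for processing
        }

        block.Complete();        // signal that no more items will be posted
        await block.Completion;  // finishes once every queued file has been handled
    }
}
```

Compared with Task.WhenAll, this caps how many downloads run at once, which may matter for long lists. If saveFileAsync throws, the block faults and awaiting Completion rethrows the exception.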

Other answers on the same question might be useful to you.

Julissa DC
  • Another useful question would be https://stackoverflow.com/questions/59659887/wait-till-the-last-file-is-downloaded?rq=1 in case the `Task.WhenAll` doesnt work for you – Julissa DC Feb 17 '21 at 14:43