
I have a question about async programming and Task.WhenAll(). I have a code snippet that downloads a folder from Google Drive, and it works as expected while I am debugging. However, when I run the application without the debugger, the download function takes at least 4x as long as it does with the debugger attached. I also get crash logs with TaskCanceledException (System.Threading.Tasks.TaskCanceledException: A task was canceled.), which does not happen with the debugger attached. What needs to be changed for the code to work as expected without the debugger attached? NB: this snippet downloads roughly 1000 files in about 22-25 seconds with the debugger and 2+ minutes without it.

public static async Task<bool> DownloadFolder(CloudDataModel.File file, string path, params string[] exclude)
    {
        try
        {
            if (file != null && !string.IsNullOrEmpty(file.id))
            {
                List<string> toExclude = new List<string>();
                if(exclude != null)
                {
                    toExclude = exclude.ToList();
                }
               
                List<Task> downloadFilesTask = new List<Task>();
                var files = await file.GetFiles();
                foreach (var f in files)
                {
                    var task = f.Download(path);
                    downloadFilesTask.Add(task);
                }
                
                var folders = await file.GetFoldersAsync();
                foreach (var folder in folders)
                {
                    if (toExclude.Contains(folder.name))
                    {
                        continue;
                    }
                    Task task = null;
                    if (path.Equals(Statics.ProjectFolderName))
                    {
                        task = DownloadFolder(folder, folder.name);
                    }
                    else
                    {
                        task = DownloadFolder(folder, Path.Combine(path, folder.name));
                    }
                  
                    downloadFilesTask.Add(task);
                }
                var array = downloadFilesTask.ToArray();
                await Task.WhenAll(array);
                return true;
            }                
        }
        catch (Exception e)
        {
            Crashes.TrackError(e);
        }
        return false;
    }

Edit

After some more trial and error, the fault has been identified: the file download itself was the cause of the unexpected behaviour.

public static async Task<StorageFile> DownloadFile(CloudDataModel.File file, string destinationFolder)
    {
        try
        {
            
            if (file != null)
            {
                Debug.WriteLine($"start download {file.name}");
                if (file.mimeType == Statics.GoogleDriveFolderMimeType)
                {
                    Debug.WriteLine($"did not download resource,  resource was folder instead of file. mimeType: {file.mimeType}");
                    return null;
                }

                var endpoint = Path.Combine(DownloadFileEndpoint, $"{file.id}?alt=media");
                // this call was the source of the unexpected behaviour
                HttpResponseMessage response = await client.GetAsync(endpoint);
               
                
         
                StorageFile downloadedFile;
                using (Stream streamToReadFrom = await response.Content.ReadAsStreamAsync())
                {
                    var memstream = new MemoryStream();
                    StreamReader reader = new StreamReader(streamToReadFrom);
                    streamToReadFrom.Position = 0;
                    await streamToReadFrom.CopyToAsync(memstream);
                    downloadedFile = await fileHandler.SaveDownloadedCloudFile(memstream, file, destinationFolder);
                    Debug.WriteLine($"download finished {file.name}");
                }

                return downloadedFile;
            }

            return null;
        }

        catch (Exception e)
        {
            Crashes.TrackError(e);
            return null;
        }
    }

After setting a timeout on the client (System.Net.Http.HttpClient), the code executed as expected.

client.Timeout = new TimeSpan(0,0,5);
Rick Bieze
  • Please post a minimal, reproducible example. I expect that during the construction of the minimal reproducible example, you will find that the problem causing this issue is somewhere completely different. – Stephen Cleary Oct 19 '22 at 21:00
  • Just a side note: Why are you converting your `downloadFilesTask` to an array? That shouldn't be necessary when using `Task.WhenAll()` – Julian Oct 19 '22 at 21:01
  • I would first remove all parallel operations, add logging, and measure performance without anything happening in parallel. Do you still see the difference in that configuration? Then put the parallelism back and pay attention to how things happen: do all your downloads start at roughly the same time, or are there occasional bursts of activity? Does the download progress smoothly, or again in bursts? Check the documentation: does your server throttle you? – n0rd Oct 20 '22 at 00:28
  • @ewerspej I had used the list beforehand, but I had read in other possible solutions that Lists are not thread-safe. It did not make any difference in my case, though. – Rick Bieze Oct 20 '22 at 06:51
  • @RickBieze The thread safety of Lists doesn't make any difference in this scenario, AFAIK. I've always used Lists for this so far and never had any problems. Of course, this only works as long as the Tasks don't access the same objects (that's when thread safety would indeed play a role). `Task.WhenAll()` takes an `IEnumerable`, so it can be an Array but also a List and since you have a List already, just use that. – Julian Oct 20 '22 at 07:33
  • @StephenCleary Thank you for your suggestion. I have found the error in the download function of the file (see my answer for details). – Rick Bieze Oct 20 '22 at 09:03

2 Answers


So after some trial and error the downloading of the file was identified as the cause of the problem.

var task = f.Download(path);

The Download function uses a System.Net.Http.HttpClient which was initialized without an explicit timeout, so the default of 100 seconds applied. When one of the tasks timed out, all the other tasks would also not execute as expected. To fix this issue, a timeout of 5 seconds was set on the HttpClient object.

    private static void InitializeClient()
    {
        HttpClientHandler handler = new HttpClientHandler { AllowAutoRedirect = true };
        client = new HttpClient(handler);
        client.DefaultRequestHeaders.Authorization = new System.Net.Http.Headers.AuthenticationHeaderValue("Bearer", AccountManager.LoggedInAccount.Token);
        client.Timeout = new TimeSpan(0,0,5);
    }

After this change the code would execute as expected.
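
A client-wide `Timeout` of 5 seconds also applies to every large file. As an alternative, a per-request timeout can be sketched with a `CancellationTokenSource`. The `RequestTimeout` helper below is a hypothetical name introduced for illustration, and the sketch assumes the wrapped operation honours the token it is given.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of the original code): runs one operation
// with its own timeout instead of relying on a client-wide setting.
public static class RequestTimeout
{
    public static async Task<T> RunAsync<T>(
        Func<CancellationToken, Task<T>> operation,
        TimeSpan timeout)
    {
        // The token is cancelled automatically once `timeout` has elapsed.
        using (var cts = new CancellationTokenSource(timeout))
        {
            // Awaiting inside the using block keeps the source alive
            // until the operation has finished or been cancelled.
            return await operation(cts.Token);
        }
    }
}
```

For example, `await RequestTimeout.RunAsync(ct => client.GetAsync(endpoint, ct), TimeSpan.FromSeconds(30))` would cancel only that one request; the 30 seconds is an arbitrary illustrative value. A request cancelled this way typically surfaces as an `OperationCanceledException` (usually its subclass `TaskCanceledException`, the type seen in the crash logs), and `client.Timeout` still applies as an upper bound.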

Rick Bieze
  • Be aware that the `HttpClient` class is intended to be instantiated [once](https://docs.microsoft.com/en-us/aspnet/web-api/overview/advanced/calling-a-web-api-from-a-net-client#create-and-initialize-httpclient), and reused throughout the life of an application. – Theodor Zoulias Oct 20 '22 at 09:37

My guess is that without the debugger the degree of parallelism increases, so the remote server is overwhelmed by requests and can't perform optimally. Your code has no provision for limiting the degree of concurrency. One possible way to solve this problem is to use a SemaphoreSlim, as shown in this question (among many others).
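
The throttling idea could be sketched roughly as follows. This is a minimal sketch, not the code from the linked question: the `ConcurrencyGate` class and its `RunAsync` method are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: lets at most `maxConcurrency` operations run at once.
public sealed class ConcurrencyGate
{
    private readonly SemaphoreSlim _semaphore;

    public ConcurrencyGate(int maxConcurrency)
    {
        _semaphore = new SemaphoreSlim(maxConcurrency, maxConcurrency);
    }

    public async Task<T> RunAsync<T>(Func<Task<T>> operation)
    {
        // Wait asynchronously for a free slot; no thread is blocked here.
        await _semaphore.WaitAsync();
        try
        {
            return await operation();
        }
        finally
        {
            // Always give the slot back, even if the operation failed.
            _semaphore.Release();
        }
    }
}
```

Assuming `f.Download(path)` returns a `Task<T>`, the loop in `DownloadFolder` could add `gate.RunAsync(() => f.Download(path))` to the list instead of `f.Download(path)`, using one shared `gate` such as `new ConcurrencyGate(8)` (the limit of 8 is an arbitrary illustrative value). Only the individual file downloads should go through the gate: wrapping the recursive `DownloadFolder` calls as well could deadlock, because a parent folder would hold a slot while waiting for its children.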

Theodor Zoulias
  • Is default connection [limit](https://learn.microsoft.com/en-us/dotnet/api/system.net.servicepointmanager.defaultconnectionlimit?view=net-7.0) still set to 2 nowadays? Could it be that it is different when running with debugger? – n0rd Oct 20 '22 at 00:24
  • @n0rd yep, that's a possibility too. – Theodor Zoulias Oct 20 '22 at 00:37
  • @n0rd I have tried setting it to int.MaxValue, but it did not make any difference. – Rick Bieze Oct 20 '22 at 06:52