I have a question about running tasks concurrently in Azure Functions on the Consumption plan.

One part of our application lets users connect their mail accounts and then downloads their messages every 15 minutes. We have a single Azure Function that does this for all users. The trouble is that as the number of users grows, the function needs more time to execute.

To avoid timeouts, I changed the function's logic; the code is below. It now creates a separate task for each user and then waits for all of them to finish. There is also some exception handling, but that's not the topic for today.

The problem is that when I check the logs, the per-user calls appear to run one after another rather than simultaneously. Now I wonder whether I made a mistake in my code, or whether Azure Functions simply cannot run tasks this way (I haven't found anything suggesting that on the Microsoft sites; quite the opposite, actually).

P.S. I do know about Durable Functions; however, I'd like to resolve this issue without them.

My code:

            List<Task<List<MailMessage>>> tasks = new List<Task<List<MailMessage>>>();
            foreach (var account in accounts)
            {
                using (var cancellationTokenSource = new CancellationTokenSource(TimeSpan.FromMinutes(6)))
                {
                    try
                    {
                        tasks.Add(GetMailsForUser(account, cancellationTokenSource.Token, log));
                    }
                    catch (TaskCanceledException)
                    {
                        log.LogInformation("Task was cancelled");
                    }
                }
            }
            
            try
            {
                await Task.WhenAll(tasks.ToArray());
            }
            catch(AggregateException aex)
            {
                aex.Handle(ex =>
                {
                    TaskCanceledException tcex = ex as TaskCanceledException;
                    if (tcex != null)
                    {
                        log.LogInformation("Handling cancellation of task {0}", tcex.Task.Id);
                        return true;
                    }
                    return false;
                });
            }

            log.LogInformation($"Finished downloading messages.");

    private async Task<List<MailMessage>> GetMailsForUser(MailAccount account, CancellationToken cancellationToken, ILogger log)
    {
        log.LogInformation($"[{account.UserID}] Started downloading data for account {account.EmailAddress}");

        IEnumerable<MailMessage> mails;

        try
        {
            using (var client = _mailClientFactory.GetIncomingMailClient(account))
            {
                mails = client.GetNewest(false);
            }
            log.LogInformation($"[{account.UserID}] Downloaded {mails.Count()} messages for account {account.EmailAddress}.");
            return mails.ToList();
        }
        catch (Exception ex)
        {
            log.LogWarning($"[{account.UserID}] Failed to download messages for account {account.EmailAddress}");
            log.LogError($"[{account.UserID}] {ex.Message} {ex.StackTrace}");
            return new List<MailMessage>();
        }
    }

  • I would guess that _mailClientFactory.GetIncomingMailClient is returning the same mail client, which either has a lock in it, or your mail server only allows 1 request per connection (and there is one connection) – Neil May 09 '21 at 16:26
  • Interesting thought. However, each account is different and possibly hosted on a different server. The MailAccount model contains information about the server, ports, protocols, etc., so each time it is a different connection. – Maciek Nawrocki May 09 '21 at 16:29
  • What's more interesting, I have 3 other functions rewritten with logic like this, and all of them seem to face this problem. – Maciek Nawrocki May 09 '21 at 16:30
  • So the problem is probably not with the code you have posted then :-( – Neil May 09 '21 at 16:32
  • I think it may be with Azure itself; however, this part of the code is basically the same in each function, so if there is a problem here, it also appears in the other functions. Do you see any misconception in the code I've posted? Does it seem proper, with reasonable logic? – Maciek Nawrocki May 09 '21 at 16:36
  • Actually, although GetMailsForUser has the `async` modifier, I can't see any `async` functions or any `await`s. Is that just a C+P error, or is it correct? – Neil May 09 '21 at 16:38
  • Honestly, I do not know how else to get a Task from a function and add it to the list without using the async modifier on the function itself. If I remove it, the return type is wrong. I changed the function, adding Thread.Sleep and commenting out the rest. The effect is the same: https://imgur.com/a/zsVf3I5 – Maciek Nawrocki May 09 '21 at 16:51
  • But none of your code is async! Just returning a task doesn't make it async. – Neil May 09 '21 at 17:52
  • Could you send me some materials, links, etc. on how to make those functions run in parallel? I think my understanding of async functions may not be correct :) Would this help? https://imgur.com/a/dce1xtv – Maciek Nawrocki May 09 '21 at 17:59
  • The code in that imgur looks correct, it uses await GetNewestAsync. – Neil May 09 '21 at 18:02
  • OK, I've changed one line to this: tasks.Add(Task.Run(() => GetMailsForUser(account, cancellationTokenSource.Token, log))); and now it works :) Thank you very much for the help, Neil! :) – Maciek Nawrocki May 09 '21 at 18:03
  • And now I can also remove the async modifier from the second function, since it doesn't need it – Maciek Nawrocki May 09 '21 at 18:09

3 Answers

Azure Functions on the Consumption plan scale out automatically. The problem is that the load needs to be high enough to trigger the scale-out.

What is probably happening is that scaling is not being triggered, so everything runs on the same instance and the calls run sequentially.

There is a discussion on this with some code to test it here: https://learn.microsoft.com/en-us/answers/questions/51368/http-triggered-azure-function-not-scaling-to-extra.html

Shiraz Bhaiji

The compiler will give you a warning for GetMailsForUser:

CS1998: This async method lacks 'await' operators and will run synchronously. Consider using the 'await' operator to await non-blocking API calls, or 'await Task.Run(…)' to do CPU-bound work on a background thread.

It's telling you it will run synchronously, which is the behaviour you're seeing. The warning message contains a couple of recommendations:

  1. Use await. This would be the ideal solution, since it reduces the resources your Azure Function uses. However, it means your _mailClientFactory will need to support asynchronous APIs, which may be too much work to take on right now (many SMTP libraries still do not support async).
  2. Use thread pool threads. Task.Run is one option, or you could use PLINQ or Parallel. This solution will consume one thread per account, and you'll eventually hit scaling issues there.
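
As a rough illustration of both options, the sketch below times three ways of starting four downloads. Everything in it is hypothetical: `BlockingDownload` stands in for a blocking call such as `client.GetNewest`, and `Task.Delay` stands in for an awaitable mail API, which the library in the question may not offer.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ConcurrencyDemo
{
    // Hypothetical stand-in for a blocking mail download.
    public static int BlockingDownload(int account)
    {
        Thread.Sleep(300);
        return account;
    }

    // Marked async but contains no await (warning CS1998): the whole body runs
    // on the caller's thread, and only then is an already-completed Task returned.
    public static async Task<int> FakeAsync(int account)
    {
        return BlockingDownload(account);
    }

    // Option 1: genuinely asynchronous. Task.Delay stands in for awaitable I/O.
    public static async Task<int> TrulyAsync(int account)
    {
        await Task.Delay(300);
        return account;
    }

    // Option 2: push the blocking call onto a thread pool thread.
    public static Task<int> OnThreadPool(int account)
    {
        return Task.Run(() => BlockingDownload(account));
    }

    // Starts four downloads, waits for all of them, returns elapsed milliseconds.
    public static async Task<long> MeasureAsync(Func<int, Task<int>> download)
    {
        var stopwatch = Stopwatch.StartNew();
        var tasks = Enumerable.Range(1, 4).Select(download).ToList();
        await Task.WhenAll(tasks);
        return stopwatch.ElapsedMilliseconds;
    }

    public static async Task Main()
    {
        long fake = await MeasureAsync(FakeAsync);
        long pooled = await MeasureAsync(OnThreadPool);
        long awaited = await MeasureAsync(TrulyAsync);

        Console.WriteLine($"async without await: {fake} ms");
        Console.WriteLine($"Task.Run:            {pooled} ms");
        Console.WriteLine($"await (true async):  {awaited} ms");
    }
}
```

On a typical multi-core machine the first timing is roughly the sum of the four delays, while the other two are close to a single delay; exact numbers depend on the thread pool and the host.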

Stephen Cleary

  1. If you want to identify which Task is running in which Function instance, use the invocation ID, ctx.InvocationId.ToString(). Consider prefixing all your logs with this ID.

  2. Your code isn't written such that it can be run in parallel by the runtime. See this: Executing tasks in parallel

  3. You can also get more information about the trigger from the trigger metadata; what is available depends on the trigger. This is just to get more insight into which function is handling which message, etc.
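
The log-prefix idea from point 1 might look something like the sketch below. It is only an illustration under stated assumptions: `Console.WriteLine` stands in for the function's `ILogger`, a fresh `Guid` stands in for `ctx.InvocationId`, the account names are made up, and `Format` and `ProcessAsync` are hypothetical helpers. Adding the managed thread ID and a timestamp makes it easier to see from the logs whether the per-account work actually overlaps.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class LogPrefixDemo
{
    // Builds one log line tagged with the invocation, the account, and the current thread.
    public static string Format(Guid invocationId, string account, string message) =>
        $"[{invocationId}] [{account}] [thread {Environment.CurrentManagedThreadId}] " +
        $"{DateTime.UtcNow:HH:mm:ss.fff} {message}";

    // Hypothetical per-account work; Task.Delay stands in for the real download.
    private static async Task ProcessAsync(Guid invocationId, string account)
    {
        Console.WriteLine(Format(invocationId, account, "start"));
        await Task.Delay(200);
        Console.WriteLine(Format(invocationId, account, "end"));
    }

    public static async Task Main()
    {
        // Assumption: in a real function this value would come from ctx.InvocationId.
        var invocationId = Guid.NewGuid();
        var accounts = new[] { "a@example.com", "b@example.com", "c@example.com" };

        await Task.WhenAll(accounts.Select(a => ProcessAsync(invocationId, a)));
    }
}
```

If the work overlaps, all "start" lines should appear before the first "end" line; if it runs sequentially, each "start" is followed by its own "end".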

Kashyap