1

I'm using the following logic, which seems to run in parallel locally, but when deployed on Azure Functions it runs sequentially:

var allRecordTasks = new List<Task<DataRecord>>();
for (int i = 0; i < 100; i++)
{
    allRecordTasks.Add(Task.Run(() => MyTask(cancellationToken)));
}

await Task.WhenAll(allRecordTasks);

I'm running on an S1 plan and I was under the assumption that a single core could run multiple threads.

Is there some setting to make this work? Is it possible on a plan with multiple cores, or is it simply not possible without using Durable Functions?

private async Task<DataRecord> MyTask(CancellationToken cancellationToken)
{
    var timeSeriesType = await GetTimeSeriesTypeAsync();
    var dataRecord = new DataRecord(timeSeriesType);
    return dataRecord;
}

Update: Simply using

allRecordTasks.Add(MyTask(cancellationToken));

ran in parallel. Other issues in my code kept the CPU core busy, which cost little locally (quad-core) but hurt performance on a single core. Thanks to Peter Bons and Stephen Cleary for clearing things up and pointing me in the right direction.
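
A minimal sketch of why adding the tasks directly is enough for I/O-bound work. `FakeQueryAsync` is a hypothetical stand-in for the real storage query (it only simulates latency with `Task.Delay`), and `ConcurrentIoSketch` and `RunAllAsync` are illustrative names, not code from the question:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ConcurrentIoSketch
{
    // Hypothetical stand-in for an async storage query (an assumption for this sketch):
    // it waits asynchronously, so no thread is blocked while the "query" is in flight.
    private static async Task<int> FakeQueryAsync(int id, int delayMs, CancellationToken cancellationToken)
    {
        await Task.Delay(delayMs, cancellationToken);
        return id;
    }

    // Starts every query first, then awaits them all together.
    public static async Task<(int[] Results, TimeSpan Elapsed)> RunAllAsync(
        int count, int delayMs, CancellationToken cancellationToken = default)
    {
        var stopwatch = Stopwatch.StartNew();
        var tasks = new List<Task<int>>();
        for (int i = 0; i < count; i++)
        {
            // No Task.Run: calling the async method already starts the operation.
            tasks.Add(FakeQueryAsync(i, delayMs, cancellationToken));
        }

        // Task.WhenAll returns the results in the same order as the tasks.
        int[] results = await Task.WhenAll(tasks);
        stopwatch.Stop();
        return (results, stopwatch.Elapsed);
    }
}
```

Calling `await ConcurrentIoSketch.RunAllAsync(100, 200)` should take roughly one delay period rather than 100 × 200 ms, because the waits overlap. This only holds when the awaited method is truly asynchronous; a blocking call inside it would still occupy a thread for its whole duration.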

Norbert Huurnink
  • How did you find out it is running sequentially? What does `MyTask` do? – Peter Bons Nov 22 '22 at 08:54
  • MyTask retrieves data from Azure Table Storage. Running this in parallel gives a big performance gain locally, but no gain in Azure Functions. – Norbert Huurnink Nov 22 '22 at 08:56
  • *[..] multiple threads.[..]* if it is awaiting multiple I/O calls to storage accounts no threads are involved. Can you post the code of `MyTask`? You shouldn't use `Task.Run` if `MyTask` is task based. – Peter Bons Nov 22 '22 at 09:33
  • I've updated the post with a simplified version of MyTask. GetTimeSeriesTypeAsync() runs some buildup code for the query and calls TableClient.ExecuteQuery. When using simply tasks.Add(MyTask()), localhost also didn't run in parallel. That's why I started to use Task.Run() – Norbert Huurnink Nov 22 '22 at 09:38
  • I'm not using TableClient.QueryAsync, but TableClient.Query (non-async). Could that be the reason? I would expect Task.Run() to just run on separate threads, no matter what code it runs. – Norbert Huurnink Nov 22 '22 at 09:44
  • It is best practice to always use async methods when provided for I/O-bound work and to avoid using Task.Run for that. – Peter Bons Nov 22 '22 at 11:54
  • I understand, but async/await didn't create parallelism for me, which is what I was looking for, and Task.Run did (locally). I'm not sure what the best approach would be or whether I can use multithreading on my Azure Functions app plan. – Norbert Huurnink Nov 22 '22 at 14:07

2 Answers

2

I'm running on an S1 plan and I was under the assumption that a single core could run multiple threads.

Well, kinda. Any core can "run" any number of threads. But of course each core is only one core and only executes one CPU instruction at a time. So if you're talking about threads doing CPU work, then it would only be one at a time.

(Most likely, the CPU is actually switching between the tasks periodically, but the overall time will be essentially the same as if it just ran them sequentially).

Is there some setting to make this work? Is it possible on a plan with multiple cores, or is it simply not possible without using Durable Functions?

It should parallelize nicely with multiple cores.

Pro tip: You can use Process Explorer to set the Processor Affinity on your locally-running instance to simulate one (or two, or ...) cores.
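
As an alternative to a GUI tool, the same restriction can be sketched in code, assuming the process runs on Windows or Linux (the only platforms where `Process.ProcessorAffinity` is supported). `AffinitySketch` and `PinToFirstCore` are hypothetical names used only for illustration:

```csharp
using System;
using System.Diagnostics;

public static class AffinitySketch
{
    // Restricts the current process to the first logical core (bit 0 of the affinity mask).
    // Returns false on platforms where Process.ProcessorAffinity is not supported.
    public static bool PinToFirstCore()
    {
        if (OperatingSystem.IsWindows() || OperatingSystem.IsLinux())
        {
            Process.GetCurrentProcess().ProcessorAffinity = (IntPtr)1;
            return true;
        }

        return false;
    }
}
```

Calling `AffinitySketch.PinToFirstCore()` at the start of a local test run should make CPU-bound tasks compete for a single core, which is closer to a single-core plan. The host may reject the change, for example in a container whose allowed CPU set excludes the first core.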

Stephen Cleary
  • Thanks, that tool is actually quite useful. I've found an issue in my code that kept that single core busy. Just adding the tasks (without Task.Run) actually runs in parallel nicely. Since I'm querying Azure Table Storage in the tasks and not using much CPU, it's very beneficial to await these queries in parallel. Thanks! – Norbert Huurnink Nov 24 '22 at 08:48
0

async/await is not the same thing as multithreading. The two are somewhat related, but putting things in tasks in no way constitutes multithreading.

Use parallel tasks instead; see, for example, the 2nd answer to this question, which is similar to yours.

Specifically for Azure Functions, I would not implement parallelism like this, since the ecosystem offers a proper way of doing that by using queues. You have one function with your current trigger that retrieves the list of DataRecords and puts each one on a queue, then a 2nd function with a QueueTrigger that handles items on the queue. The 2nd function will then execute several times in parallel, as far as the App Service plan allows. Note that I'm not talking about 2 separate function apps, but about 2 methods with different triggers.

Using custom multithreading or parallelism could also cause issues on dynamic hosting plans (Y1) where the function execution could be halted as soon as the function returns a result.

Simmetric