328

I need to modify an existing program and it contains following code:

var inputs = events.Select(async ev => await ProcessEventAsync(ev))
                   .Select(t => t.Result)
                   .Where(i => i != null)
                   .ToList();

But this seems very weird to me, first of all the use of async and awaitin the select. According to this answer by Stephen Cleary I should be able to drop those.

Then the second Select which selects the result. Doesn't this mean the task isn't async at all and is performed synchronously (so much effort for nothing), or will the task be performed asynchronously and when it's done the rest of the query is executed?

Should I write the above code like following according to another answer by Stephen Cleary:

var tasks = await Task.WhenAll(events.Select(ev => ProcessEventAsync(ev)));
var inputs = tasks.Where(result => result != null).ToList();

and is it completely the same like this?

var inputs = (await Task.WhenAll(events.Select(ev => ProcessEventAsync(ev))))
                                       .Where(result => result != null).ToList();

While i'm working on this project I'd like to change the first code sample but I'm not too keen on changing (apparantly working) async code. Maybe I'm just worrying for nothing and all 3 code samples do exactly the same thing?

ProcessEventsAsync looks like this:

async Task<InputResult> ProcessEventAsync(InputEvent ev) {...}
Community
  • 1
  • 1
Alexander Derck
  • 13,818
  • 5
  • 54
  • 76
  • What's the return type of ProceesEventAsync? – tede24 Jan 26 '16 at 10:33
  • @tede24 It's `Task` with `InputResult` being a custom class. – Alexander Derck Jan 26 '16 at 10:37
  • Your versions are much easier to read in my opinion. However, you have forgotten to `Select` the results from the tasks before your `Where`. – Max Jan 26 '16 at 10:38
  • And InputResult has a Result property right? – tede24 Jan 26 '16 at 10:40
  • @tede24 Result is property of task not my class. And @Max the await should make sure I get the results without accessing `Result` property of task – Alexander Derck Jan 26 '16 at 10:43
  • Yes. Sorry. I was just updating my comment but it got to stale to edit in the process. I found the name `tasks` slightly misleading, fyi. It makes me expect an enumerable of task. – Max Jan 26 '16 at 10:52
  • 1
    There is also a way for a lazy developer to make this code async. Just add `ToList()` to create all tasks before waiting for results like so `events.Select(async ev => await ProcessEventAsync(ev)).ToList().Select(t => t.Result)...`. This has a slight performance impact compared to `WaitAll()` but is negligible in most cases. – Poma Aug 29 '16 at 08:26
  • @Poma Interesting, going to test that out. When you write it like that I find it harder to see what the code is supposed to be doing though, the example from the accepted answer is more clear to me. – Alexander Derck Aug 29 '16 at 08:38
  • `ToList()` walks through all elements of collection and forces immediate query evaluation before passing control further. Thus launching all async tasks before `.Select(t => t.Result)` starts evaluating – Poma Aug 29 '16 at 14:13
  • Well, having this against a database is troublesome : I ended up having all my connection pool full because of a connection was created for almost every elements. – Larry Nov 04 '19 at 16:00
  • The original question would be easier to read if the terms "parallel" and "sequentially" were used instead of "asynchronous" and "synchronous". – Alex Dresko Jan 08 '20 at 19:56

8 Answers8

338
var inputs = events.Select(async ev => await ProcessEventAsync(ev))
                   .Select(t => t.Result)
                   .Where(i => i != null)
                   .ToList();

But this seems very weird to me, first of all the use of async and await in the select. According to this answer by Stephen Cleary I should be able to drop those.

The call to Select is valid. These two lines are essentially identical:

events.Select(async ev => await ProcessEventAsync(ev))
events.Select(ev => ProcessEventAsync(ev))

(There's a minor difference regarding how a synchronous exception would be thrown from ProcessEventAsync, but in the context of this code it doesn't matter at all.)

Then the second Select which selects the result. Doesn't this mean the task isn't async at all and is performed synchronously (so much effort for nothing), or will the task be performed asynchronously and when it's done the rest of the query is executed?

It means that the query is blocking. So it is not really asynchronous.

Breaking it down:

var inputs = events.Select(async ev => await ProcessEventAsync(ev))

will first start an asynchronous operation for each event. Then this line:

                   .Select(t => t.Result)

will wait for those operations to complete one at a time (first it waits for the first event's operation, then the next, then the next, etc).

This is the part I don't care for, because it blocks and also would wrap any exceptions in AggregateException.

and is it completely the same like this?

var tasks = await Task.WhenAll(events.Select(ev => ProcessEventAsync(ev)));
var inputs = tasks.Where(result => result != null).ToList();

var inputs = (await Task.WhenAll(events.Select(ev => ProcessEventAsync(ev))))
                                       .Where(result => result != null).ToList();

Yes, those two examples are equivalent. They both start all asynchronous operations (events.Select(...)), then asynchronously wait for all the operations to complete in any order (await Task.WhenAll(...)), then proceed with the rest of the work (Where...).

Both of these examples are different from the original code. The original code is blocking and will wrap exceptions in AggregateException.

Ehsan Sajjad
  • 61,834
  • 16
  • 105
  • 160
Stephen Cleary
  • 437,863
  • 77
  • 675
  • 810
  • Cheers for clearing that up! So instead of the exceptions wrapped up into an `AggregateException` I would get multiple separate exceptions in the second code? – Alexander Derck Jan 26 '16 at 15:49
  • 1
    @AlexanderDerck: No, in both the old and new code, only the first exception would be raised. But with `Result` it would be wrapped in `AggregateException`. – Stephen Cleary Jan 26 '16 at 16:32
  • I'm getting a deadlock in my ASP.NET MVC Controller using this code. I solved it using Task.Run( … ). I don't have a good feeling about it. However, it finished just right when running into an async xUnit test. What's going on? – SuperJMN Mar 19 '19 at 11:42
  • @SuperJMN: I suspect you're [blocking on async code](https://blog.stephencleary.com/2012/07/dont-block-on-async-code.html). – Stephen Cleary Mar 19 '19 at 15:39
  • @StephenCleary I have read your post briefly, but I can't seem to find a fix for it, since you state that using ConfigureAwait(false) is a hack :) My code currently looks like: `var models = dtos.Select(async x => new CuentaModel { CodigoCuenta = x.Codigo, DescripcionCuenta = x.Descripcion, ContratosList = await GetContratos(x.Id), }).Select(x => x.Result);` The problem here is that some properties in the initializer are calculated asynchronously :( – SuperJMN Mar 20 '19 at 07:56
  • 4
    @SuperJMN: Replace `stuff.Select(x => x.Result);` with `await Task.WhenAll(stuff)` – Stephen Cleary Mar 20 '19 at 16:05
  • @StephenCleary, could you elaborate why 2 lines below behave the same? wouldn't async/await syntax create a bunch of state machines that could be avoided if Task syntax was used? `events.Select(async ev => await ProcessEventAsync(ev)) events.Select(ev => ProcessEventAsync(ev))` – DanielS Oct 24 '19 at 19:03
  • 1
    @DanielS: They're *essentially* the same. There are some differences such as state machines, capturing context, behavior of synchronous exceptions. More info at https://blog.stephencleary.com/2016/12/eliding-async-await.html – Stephen Cleary Oct 24 '19 at 21:50
  • @StephenCleary is there a way initialize all the tasks and to not wait until all the tasks are completed, but to work with the results as soon as they are ready? – user33276346 Jun 01 '20 at 03:04
  • 1
    @user33276346: I usually recommend refactoring by adding an asynchronous method. E.g., if the original tasks represent "get" and you want to "process" as they complete, then make a new method that is "get-and-process" and call that one instead of "get" when starting all the tasks. – Stephen Cleary Jun 01 '20 at 13:02
48

I used this code:

public static async Task<IEnumerable<TResult>> SelectAsync<TSource,TResult>(
    this IEnumerable<TSource> source, Func<TSource, Task<TResult>> method)
{
  return await Task.WhenAll(source.Select(async s => await method(s)));
}

like this:

var result = await sourceEnumerable.SelectAsync(async s=>await someFunction(s,other params));

Edit:

Some people have raised the issue of concurrency, like when you are accessing a database and you can't run two tasks at the same time. So here is a more complex version that also allows for a specific concurrency level:

public static async Task<IEnumerable<TResult>> SelectAsync<TSource, TResult>(
    this IEnumerable<TSource> source, Func<TSource, Task<TResult>> method,
    int concurrency = int.MaxValue)
{
    var semaphore = new SemaphoreSlim(concurrency);
    try
    {
        return await Task.WhenAll(source.Select(async s =>
        {
            try
            {
                await semaphore.WaitAsync();
                return await method(s);
            }
            finally
            {
                semaphore.Release();
            }
        }));
    } finally
    {
        semaphore.Dispose();
    }
}

Without a parameter it behaves exactly as the simpler version above. With a parameter of 1 it will execute all tasks sequentially:

var result = await sourceEnumerable.SelectAsync(async s=>await someFunction(s,other params),1);

Note: Executing the tasks sequentially doesn't mean the execution will stop on error!

Just like with a larger value for concurrency or no parameter specified, all the tasks will be executed and if any of them fail, the resulting AggregateException will contain the thrown exceptions.

If you want to execute tasks one after the other and fail at the first one, try another solution, like the one suggested by xhafan (https://stackoverflow.com/a/64363463/379279)

Theodor Zoulias
  • 34,835
  • 7
  • 69
  • 104
Siderite Zackwehdex
  • 6,293
  • 3
  • 30
  • 46
  • 8
    This just wraps the existing functionality in a more obscure way imo – Alexander Derck Aug 22 '18 at 09:58
  • The alternative is var result = await Task.WhenAll(sourceEnumerable.Select(async s=>await someFunction(s,other params)). It works, too, but it's not LINQy – Siderite Zackwehdex Aug 23 '18 at 18:39
  • Shouldn't `Func> method` contain the `other params` mentioned on the second bit of code? – matramos Sep 12 '18 at 11:23
  • 2
    The extra parameters are external, depending on the function that I want to execute, they are irrelevant in the context of the extension method. – Siderite Zackwehdex Sep 13 '18 at 12:05
  • 10
    That's a lovely extension method. Not sure why it was deemed "more obscure" - it's semantically analogous to the synchronous `Select()`, so is an elegant drop-in. – nullPainter Dec 02 '19 at 02:53
  • 6
    The `async` and `await` inside the first lambda is redundant. The SelectAsync method can simply be written as: `return await Task.WhenAll(source.Select(method));` – Nathan Apr 22 '20 at 19:43
  • 4
    Indeed @Nathan, why have the ```await``` at all? - ```public static Task SelectAsync(this IEnumerable source, Func> method) { return Task.WhenAll(source.Select(x => method(x))); }``` – Martin Connell Jul 21 '20 at 18:48
  • 1
    Beware that when this implementation of .SelectAsync() is used by a code using NHibernate session, it might produce `A command is already in progress` exception as it uses the NHibernate session concurrently. For this reason, I used this https://stackoverflow.com/a/64363463/379279 solution. – xhafan Apr 19 '21 at 16:57
  • As a side note, the `SelectAsync` method that includes a `concurrency` parameter is most commonly named [`ForEachAsync`](https://stackoverflow.com/questions/11564506/nesting-await-in-parallel-foreach). It can be implemented in various ways, differing usually on how they handle exceptions, and most probably a built-in implementation will be added soon ([`Parallel.ForEachAsync`](https://github.com/dotnet/runtime/issues/1946)). – Theodor Zoulias Apr 20 '21 at 08:23
  • @nrofis [eliding async/await](https://blog.stephencleary.com/2016/12/eliding-async-await.html) is not a neutral modification of the OP's code. It could result in leaked tasks, in case the `method` fails synchronously *sometimes*. The `using var semaphore` is OK though IMHO. – Theodor Zoulias Jun 06 '21 at 12:46
  • @TheodorZoulias I don't think this is the case here, but thanks for your awareness. I saw you've totally reverted my edit, the code formatting, and the `using`... – nrofis Jun 06 '21 at 13:29
  • @TheodorZoulias, from your link: "**Do** consider eliding when the method is just a passthrough or overload." This is the case here. This is the reason I removed them. For `Select`, especially in this simple implementation, you can elide the redundant `async`/`await` safely. – nrofis Jun 06 '21 at 13:44
  • Guys, fighting over eliding is pointless, as it is not in the scope of the question or my answer. I prefer not to "elide" because you can always add more code later without changing the method signature, but in this case it is not important to the way one uses the extension method. – Siderite Zackwehdex Jun 06 '21 at 13:51
  • No one fighting :) I was just wanted to format the code a little bit... Not a big deal... – nrofis Jun 06 '21 at 13:56
  • @nrofis "passthrough or overload" means that a method includes just a call to another method, and nothing else. But inside the `SelectAsync` lots of other things happening. If for example the first invocation of the `method` succeeds and the second fails, and you elid async/await inside the `Select` delegates, the `SelectAsync` will return a faulted `Task` without awaiting the completion of the task created by the first successful `method` invocation. This task will be a leaked fire-and-forget task. A library method is not supposed to do things like that. – Theodor Zoulias Jun 06 '21 at 14:49
  • @nrofis as for the `using` edit, sorry for revoking it. I was too lazy to do a selective revoke. Please consider redoing this edit. – Theodor Zoulias Jun 06 '21 at 14:49
43

Existing code is working, but is blocking the thread.

.Select(async ev => await ProcessEventAsync(ev))

creates a new Task for every event, but

.Select(t => t.Result)

blocks the thread waiting for each new task to end.

In the other hand your code produce the same result but keeps asynchronous.

Just one comment on your first code. This line

var tasks = await Task.WhenAll(events...

will produce a single Task<TResult[]> so the variable should be named in singular.

Finally your last code make the same but is more succinct.

For reference: Task.Wait / Task.WhenAll

Pang
  • 9,564
  • 146
  • 81
  • 122
tede24
  • 2,304
  • 11
  • 14
  • So the first code block is in fact executed synchronously? – Alexander Derck Jan 26 '16 at 11:46
  • 1
    Yes, because accessing Result produces a Wait which blocks the thread. In the other hand When produces a new Task you can await for. – tede24 Jan 26 '16 at 11:54
  • 1
    Coming back to this question and looking at your remark about the name of the `tasks` variable, you're completely right. Horrible choice, they're not even tasks as they get awaited right away. I'll just leave the question as is though – Alexander Derck Feb 22 '17 at 14:56
  • Just came by this thread. @AlexanderDerck - why not edit the answer? It gave me confusion for a while before getting to this answer. Also using var usually gets you to this point when it matters. – Kia Kaha May 01 '21 at 09:04
23

I prefer this as an extension method:

public static async Task<IEnumerable<T>> WhenAll<T>(this IEnumerable<Task<T>> tasks)
{
    return await Task.WhenAll(tasks);
}

So that it is usable with method chaining:

var inputs = await events
  .Select(async ev => await ProcessEventAsync(ev))
  .WhenAll()
Daryl
  • 3,253
  • 4
  • 29
  • 39
  • 1
    You shouldn't call the method `Wait` when it's not actually waiting. It's creating a task that is complete when all of the tasks are complete. Call it `WhenAll`, like the `Task` method it emulates. It's also pointless for the method to be `async`. Just call `WhenAll` and be done with it. – Servy May 10 '18 at 15:57
  • A bit of a useless wrapper in my opinion when it just calls the original method – Alexander Derck May 10 '18 at 20:40
  • @Servy fair point, but I don't particularly like any of the name options. WhenAll makes it sounds like an event which it is not quite. – Daryl May 11 '18 at 16:17
  • 4
    @AlexanderDerck the advantage is that you can use it in method chaining. – Daryl May 11 '18 at 16:17
  • 1
    @Servy, actually you can't remove the async and await from the extension method. You get this error: ``` Cannot implicitly convert type 'System.Threading.Tasks.Task' to 'System.Threading.Tasks.Task>' ``` Because it doesn't know Task is covariant with T – Daryl May 11 '18 at 16:37
  • 2
    @Daryl because `WhenAll` returns an evaluated list (it's not evaluated lazily), an argument can be made to use the `Task` return type to signify that. When awaited, this will still be able to use Linq, but also communicates that it is not lazy. – JAD Apr 15 '19 at 08:46
  • 2
    Extending @Daryl's good point further, we can reduce further to: ```public static Task WhenAll(this IEnumerable > tasks) { return Task.WhenAll(tasks); }``` – Martin Connell Jul 21 '20 at 18:46
22

I have the same problem as @KTCheek in that I need it to execute sequentially. However I figured I would try using IAsyncEnumerable (introduced in .NET Core 3) and await foreach (introduced in C# 8). Here's what I have come up with:

public static class IEnumerableExtensions {
    public static async IAsyncEnumerable<TResult> SelectAsync<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, Task<TResult>> selector) {
        foreach (var item in source) {
            yield return await selector(item);
        }
    }
}

public static class IAsyncEnumerableExtensions {
    public static async Task<List<TSource>> ToListAsync<TSource>(this IAsyncEnumerable<TSource> source) {
        var list = new List<TSource>();

        await foreach (var item in source) {
            list.Add(item);
        }

        return list;
    }
}

This can be consumed by saying:

var inputs = await events.SelectAsync(ev => ProcessEventAsync(ev)).ToListAsync();

Update: Alternatively you can add a reference to System.Linq.Async and then you can say:

var inputs = await events
    .ToAsyncEnumerable()
    .SelectAwait(async ev => await ProcessEventAsync(ev))
    .ToListAsync();
nfplee
  • 7,643
  • 12
  • 63
  • 124
  • 3
    These two operators are included in the [System.Linq.Async](https://www.nuget.org/packages/System.Linq.Async) package, with the names `SelectAwait` and `ToListAsync`, along with lots of other LINQ-style operators. – Theodor Zoulias Feb 23 '21 at 14:13
  • 1
    Actually no, your `SelectAsync` operates on `IEnumerable`s. The aforementioned `SelectAwait` operates on `IAsyncEnumerable`s. You would need to convert it first, by calling the `ToAsyncEnumerable` extension method. – Theodor Zoulias Feb 23 '21 at 14:29
  • 1
    Thanks @TheodorZoulias, I've updated my answer with the alternative solution. – nfplee Feb 24 '21 at 09:12
17

With current methods available in Linq it looks quite ugly:

var tasks = items.Select(
    async item => new
    {
        Item = item,
        IsValid = await IsValid(item)
    });
var tuples = await Task.WhenAll(tasks);
var validItems = tuples
    .Where(p => p.IsValid)
    .Select(p => p.Item)
    .ToList();

Hopefully following versions of .NET will come up with more elegant tooling to handle collections of tasks and tasks of collections.

johnny 5
  • 19,893
  • 50
  • 121
  • 195
Vitaliy Ulantikov
  • 10,157
  • 3
  • 61
  • 54
8

I wanted to call Select(...) but ensure it ran in sequence because running in parallel would cause some other concurrency problems, so I ended up with this. I cannot call .Result because it will block the UI thread.

public static class TaskExtensions
{
    public static async Task<IEnumerable<TResult>> SelectInSequenceAsync<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, Task<TResult>> asyncSelector)
    {
        var result = new List<TResult>();
        foreach (var s in source)
        {
            result.Add(await asyncSelector(s));
        }
        
        return result;
    }
}

Usage:

var inputs = events.SelectInSequenceAsync(ev => ProcessEventAsync(ev))
                   .Where(i => i != null)
                   .ToList();

I am aware that Task.WhenAll is the way to go when we can run in parallel.

KTCheek
  • 309
  • 2
  • 4
  • 2
    Upvoted. I would prefer a return type of `Task>` (or even better `Task`) instead of `Task>`. The later conveys the notion of [deferred execution](https://stackoverflow.com/questions/7324033/what-are-the-benefits-of-a-deferred-execution-in-linq), which is not applicable in this case. After the completion of the `Task`, the resulting `IEnumerable` is fully materialized, since it is based on a `List`. – Theodor Zoulias Apr 19 '21 at 17:05
2

"Just because you can doesn't mean you should."

You can probably use async/await in LINQ expressions such that it will behave exactly as you want it to, but will any other developer reading your code still understand its behavior and intent?

(In particular: Should the async operations be run in parallel or are they intentionally sequential? Did the original developer even think about it?)

This is also shown clearly by the question, which seems to have been asked by a developer trying to understand someone else's code, without knowing its intent. To make sure this does not happen again, it may be best to rewrite the LINQ expression as a loop statement, if possible.

Florian Winter
  • 4,750
  • 1
  • 44
  • 69