
Is there any official or stable third-party library that supports using AsParallel over an IAsyncEnumerable<T> (.NET Standard 2.1)?

I don't want to wrap an IAsyncEnumerable<T> in an IEnumerable<Task<T>> using async methods, TaskCompletionSource, or anything similar, because of the extra cost.

Alsein

1 Answer


Combining parallelism and asynchrony doesn't make much sense. The purpose of parallelism is to use more than one processor/core simultaneously, while the purpose of asynchrony is to use no CPU resources at all while waiting.

AsParallel (PLINQ) is used with IEnumerables when enumerating the IEnumerable is CPU-intensive, in other words when a great many CPU instructions have to be executed between one MoveNext invocation and the next. With IAsyncEnumerables the delay is (normally) not caused by the invocation of the MoveNextAsync method itself, but by awaiting the returned ValueTask. Awaiting an awaitable object consumes zero CPU resources, and you have no control over when it is going to complete. Take for example a Task.Delay(1000). It will not complete sooner than a second later, and you can't force it to complete in half a second unless you find a way to bend spacetime somehow.
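To illustrate the point about awaiting, here is a minimal sketch. The class and method names (DelayDemo, AwaitManyDelaysAsync) are hypothetical and chosen only for this example. It starts many Task.Delay operations at once and measures how long it takes to await all of them.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class DelayDemo
{
    // Starts `count` timer-based delays at once and awaits them all.
    // No thread is occupied while the delays are pending.
    public static async Task<TimeSpan> AwaitManyDelaysAsync(int count, int milliseconds)
    {
        var stopwatch = Stopwatch.StartNew();
        Task[] delays = Enumerable.Range(0, count)
            .Select(_ => Task.Delay(milliseconds))
            .ToArray();
        await Task.WhenAll(delays);
        return stopwatch.Elapsed;
    }
}
```

Calling `await DelayDemo.AwaitManyDelaysAsync(1000, 1000)` should take roughly one second in total, not a thousand seconds. Adding more cores would not make any single delay finish earlier.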

Theodor Zoulias
  • Actually, I think parallel and async make a lot of sense together if your data source is any kind of asynchronous stream, like a paged database query, and the operation you are applying is I/O- or CPU-bound. E.g., you want to retrieve a long list of millions of URIs, download the files, and do some processing: you can start downloading the first ones while you are still retrieving parts of the list, you want to download a handful at the same time to avoid losing time on I/O, and you can't cache all the files before starting the processing... that's exactly what an async PLINQ would do – Jaime Silva Aug 31 '21 at 23:52
  • @JaimeSilva I agree. Multidimensional/heterogeneous workloads do exist, and cannot be processed efficiently with naive parallel-only or asynchronous-only approaches. Currently the best tool that we have available for solving this type of problem is the TPL Dataflow library, which feels a bit verbose and clumsy compared to, for example, the elegant PLINQ queries. – Theodor Zoulias Sep 01 '21 at 00:55
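
The TPL Dataflow approach mentioned in the comments could look roughly like the sketch below. It is not a definitive implementation. It assumes the System.Threading.Tasks.Dataflow NuGet package and C# 8 or later. The names AsyncParallelSketch, GetIdsAsync, and ProcessAsync are hypothetical, and GetIdsAsync merely stands in for a real asynchronous source such as a paged query.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow; // NuGet: System.Threading.Tasks.Dataflow

public static class AsyncParallelSketch
{
    // Hypothetical async source standing in for a paged database query.
    public static async IAsyncEnumerable<int> GetIdsAsync(int count)
    {
        for (int i = 0; i < count; i++)
        {
            await Task.Delay(10); // simulated latency per item
            yield return i;
        }
    }

    // Enumerates the source while processing up to maxParallelism items
    // concurrently. The order of the results is NOT preserved.
    public static async Task<List<TResult>> ProcessAsync<T, TResult>(
        IAsyncEnumerable<T> source,
        Func<T, Task<TResult>> selector,
        int maxParallelism)
    {
        var results = new ConcurrentBag<TResult>();
        var block = new ActionBlock<T>(
            async item => results.Add(await selector(item)),
            new ExecutionDataflowBlockOptions
            {
                MaxDegreeOfParallelism = maxParallelism,
                // A bounded buffer gives back-pressure, so the source is
                // not read far ahead of the processing.
                BoundedCapacity = maxParallelism * 2
            });

        await foreach (var item in source)
        {
            // Waits asynchronously while the block's buffer is full.
            await block.SendAsync(item);
        }

        block.Complete();
        await block.Completion; // rethrows if the selector failed
        return results.ToList();
    }
}
```

For example, `await AsyncParallelSketch.ProcessAsync(AsyncParallelSketch.GetIdsAsync(100), DownloadAsync, 4)` would keep at most four downloads in flight while the list is still being retrieved, where DownloadAsync is a placeholder for your own method. On .NET 6 and later, Parallel.ForEachAsync also accepts an IAsyncEnumerable<T> directly, but it is not available on .NET Standard 2.1.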