Implement async "yielding" the proper way

Question

An async method runs synchronously on the caller's context/thread until its execution path runs into an I/O or similar task that involves waiting. Then, instead of waiting, it returns to the original caller and resumes its continuation later. The question is: what is the preferred way of implementing that "wait" method? How do the File/Network/etc. async methods do it?

Let's assume I have a method that involves some waiting which is not covered by the existing I/O APIs out of the box. I do not want to block the calling thread, and I do not want to force my caller to do a Task.Run() to offload me. I want a clean async/await pattern so that my callers can seamlessly integrate my library and I can run on their context until such time as I need to yield. Let's, for the sake of argument, assume that I want to make a new I/O library which is not covered, and I need a way to make all the glue that keeps async together.

Do I Task.Yield and continue? Do I have to do my own Task.Run/Task.Wait, etc.? Both seem like more of the same abstractions (which raises the question of how Yield yields). I am curious, because there is a lot of talk about how async/await continuations work for the consumer and how all I/O libraries come already prepped, but there is very little about how the actual "breaking" point works and how process makers should implement it. How does the code at the end of a sync path actually release control, and how does the method operate at that point and after?

You might want to read [There is no thread](https://blog.stephencleary.com/2013/11/there-is-no-thread.html) and realise it's *async all the way down*. — Damien_The_Unbeliever, Oct 08 '21 at 09:36
Does this answer your question? [How do I implement an async I/O bound operation from scratch?](https://stackoverflow.com/questions/50985593/how-do-i-implement-an-async-i-o-bound-operation-from-scratch) — GSerg, Oct 08 '21 at 09:40
Does this answer your question? [Write your own async method](https://stackoverflow.com/q/24953808/11683) — GSerg, Oct 08 '21 at 09:45
Would be interesting to know what is that waiting not covered by current IO out of the box. But in general I think you'll have to just use native OS api anyway. — Evk, Oct 08 '21 at 09:45
On Windows I think they use `IOCP` for `IO` bound work (I think). I also think other asynchronous operations use task schedulers backed by the thread pool to efficiently schedule multiple tasks on a single thread. — WBuck, Oct 08 '21 at 09:47
Sorry, I should say, on Windows I think they use `IOCP` in conjunction with `OVERLAPPED` `IO`. — WBuck, Oct 08 '21 at 09:54
If you think about what an async operation is, what the CPU is doing is asking another device to do something on its behalf. So, for network bound work, the CPU will copy your outbound buffer into some known memory location (usually synchronously) and then ask the NIC to send that data across the wire. The NIC will then notify the CPU when the data has been sent (the CPU doesn't wait). This process specifically would use DMA for the transfer of data from main memory to the NICs internal buffer — WBuck, Oct 08 '21 at 10:05
@Evk, a custom hardware chip on an embedded platform. We are playing with potential software solutions for clients now that .net core runs properly on Linux. — mmix, Oct 08 '21 at 10:42
@GSerg, no, not really, this is still "high level" abstraction falling on lower async — mmix, Oct 08 '21 at 10:42
@Damien_The_Unbeliever, this is just a Kool-Aid article; whenever work is being done there is always a thread, but that's not the issue here, it's about custom async signaling. And it's not async all the way down, it's async all the way down to the BCL, which covers 99.999% of scenarios. But if you are the 0.001%... — mmix, Oct 08 '21 at 10:42
If you are interested in a `ValueTask`-based example, you could check out [this](https://stackoverflow.com/questions/69147931/avoiding-allocations-and-maintaining-concurrency-when-wrapping-a-callback-based "Avoiding allocations and maintaining concurrency when wrapping a callback-based API with an async API on a hot path") question. — Theodor Zoulias, Oct 08 '21 at 11:06
No, there isn't a thread. If you read the article, you'd have seen that at the lowest level, it's about handing off to hardware and then letting a later interrupt cause resumption to happen. So if you're in a situation where you're going down to the hardware, you do the same. Only in the odd circumstances that hardware interrupts aren't an option do you need to do different. And that's such an "out there" requirement that you should already be aware that you need to run polling loops, etc. And if you're not directly dealing with hardware, you use your OS's abstraction, that tend to be async. — Damien_The_Unbeliever, Oct 09 '21 at 17:37
@Damien_The_Unbeliever let's agree to disagree on this. There is always a thread, be it an application thread, kernel-level IO polling, a thread responding to an interrupt from a device, or for that matter a userland thread transitioning into Ring0 via a syscall. No unit of work of any kind can happen without some thread working on it; the article just tries to sensationalize things from a very abstract position. Inside the userland async app, too, it might not be the same thread, but it's always A thread. — mmix, Oct 11 '21 at 06:42

Marc Gravell · Accepted Answer · 2021-10-08T09:58:05.457

If you're the bottom of the async pile, with no inbuilt async downstream calls to defer to, then it falls to you. The simple way to do this is to allocate a `TaskCompletionSource<T>` (TCS) for some `T`, hook up the async work (that isn't `Task<T>` based) in whatever way you need to, stick the TCS somewhere you can get at it later, and hand back the `.Task` from the TCS to the caller. When the async work completes (possibly via some kind of callback, or whatever is suitable for that API), fetch the TCS from wherever you stuffed it and signal completion there, via `TrySetResult` etc.
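As a rough illustration of the paragraph above, here is a minimal sketch of wrapping a callback-based API in a `TaskCompletionSource<int>`. `FakeDevice`, `BeginRead`, and `ReadAsync` are hypothetical names invented for this example (a background thread stands in for the hardware callback), and here the TCS is simply "stored" in the callback's closure. It is one possible shape under those assumptions, not the only way to do it.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a callback-based driver; not a real library.
public sealed class FakeDevice
{
    // Starts a "read" and reports the value later from a different thread,
    // roughly the way an interrupt handler or native callback would.
    public void BeginRead(Action<int> onDone)
    {
        var worker = new Thread(() =>
        {
            Thread.Sleep(50); // simulated hardware latency
            onDone(42);       // simulated payload
        });
        worker.IsBackground = true;
        worker.Start();
    }
}

public static class FakeDeviceExtensions
{
    // Task-returning wrapper: no thread is blocked while the read is pending.
    public static Task<int> ReadAsync(this FakeDevice device)
    {
        // RunContinuationsAsynchronously keeps the awaiting code from
        // running inline on the callback thread.
        var tcs = new TaskCompletionSource<int>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        try
        {
            // The lambda captures the TCS, so the callback can find it later.
            device.BeginRead(value => tcs.TrySetResult(value));
        }
        catch (Exception ex)
        {
            // A failure to start the operation is surfaced through the task.
            tcs.TrySetException(ex);
        }
        return tcs.Task;
    }
}
```

A caller would then simply write `int value = await new FakeDevice().ReadAsync();` and resume once the callback has fired.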

There are various things to consider, though:

- in many cases, you may want to ensure that you pass `TaskCreationOptions.RunContinuationsAsynchronously` to the TCS constructor, if "thread theft" would be a huge concern (otherwise, the `await` steals the thread of whatever calls `.TrySetResult`)
- there are ways of creating and managing `Task[<T>]` instances without the additional allocation of a `TaskCompletionSource<T>`, but they're more advanced
- or at the extreme end, if this is high throughput: `ValueTask[<T>]` has a token-based API (via `IValueTaskSource[<T>]`) that allows the same object model to be used many times (as different `ValueTask[<T>]` values), to avoid any additional allocations - again, this is an advanced scenario
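To make the "thread theft" point in the first consideration concrete, the following sketch records which thread runs the code after an `await`, with and without `RunContinuationsAsynchronously`. `ThreadTheftDemo` and its method are hypothetical names. The sketch assumes a plain console-style environment in which the completing thread has no `SynchronizationContext` or custom `TaskScheduler`.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadTheftDemo
{
    // Returns true when the code after the await ran on the very thread
    // that called TrySetResult (i.e. the continuation "stole" that thread).
    public static bool ContinuationRanOnCompleter(TaskCreationOptions options)
    {
        var tcs = new TaskCompletionSource<int>(options);
        int continuationThreadId = 0;

        async Task Consumer()
        {
            await tcs.Task.ConfigureAwait(false);
            continuationThreadId = Environment.CurrentManagedThreadId;
        }

        // Runs synchronously up to the await, then returns an incomplete task.
        Task consumer = Consumer();

        int completerThreadId = 0;
        var completer = new Thread(() =>
        {
            completerThreadId = Environment.CurrentManagedThreadId;
            tcs.TrySetResult(1); // plays the role of the I/O callback
        });
        completer.Start();
        completer.Join();
        consumer.Wait(); // blocking is acceptable here: this is only a demo

        return continuationThreadId == completerThreadId;
    }
}
```

With `TaskCreationOptions.None` the continuation is normally inlined on the completing thread, so the method is expected to return `true`. With `RunContinuationsAsynchronously` it is queued elsewhere and the method returns `false`. For a library whose callbacks arrive on a thread it must not lose (a device or I/O loop thread, say), the second behaviour is usually the safer default.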

Cool, this is actually what I have been looking for. Thanks for pointing me in the right direction, now I have to see how all this fares on Linux, but other than hardware specific P/I this looks like a solution... I'll ask here if I run into trouble :D — mmix, Oct 08 '21 at 10:39
Btw, you said this is the simple way? What would the complicated way be (I presume with creating and managing Task instances you mentioned)? Can you point me towards some reading material? — mmix, Oct 08 '21 at 10:51
@mmix there are a few APIs you can use; `AsyncMethodBuilder` can do this for you, although how that happens depends on the framework version; or there are some related tools in the PooledAwait library on NuGet - but if allocations *matter* (often they don't), then moving to `ValueTask[<T>]` with `IValueTaskSource[<T>]` might be a more practical direction — Marc Gravell, Oct 08 '21 at 12:05
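For readers curious what the `IValueTaskSource<T>` direction mentioned above might look like, here is a minimal sketch built on `ManualResetValueTaskSourceCore<T>` (available in .NET Core 3.0+ / .NET Standard 2.1). `ReusableRead`, `StartAsync`, and `Complete` are hypothetical names. The sketch assumes only one operation is in flight per instance and that each returned `ValueTask<int>` is awaited exactly once.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Sources;

// Hypothetical reusable operation object: one instance backs many
// sequential awaits without allocating a new Task each time.
public sealed class ReusableRead : IValueTaskSource<int>
{
    // Mutable struct, so the field must not be readonly.
    private ManualResetValueTaskSourceCore<int> _core;

    public ReusableRead()
    {
        // Avoid running the awaiter's continuation inline on the completer.
        _core.RunContinuationsAsynchronously = true;
    }

    // Hands out a ValueTask tied to the current version (the "token").
    public ValueTask<int> StartAsync() => new ValueTask<int>(this, _core.Version);

    // Would be called from the (hypothetical) completion callback.
    public void Complete(int value) => _core.SetResult(value);

    public int GetResult(short token)
    {
        try
        {
            return _core.GetResult(token); // throws if the token is stale
        }
        finally
        {
            _core.Reset(); // bumps the version so the instance can be reused
        }
    }

    public ValueTaskSourceStatus GetStatus(short token) => _core.GetStatus(token);

    public void OnCompleted(Action<object> continuation, object state,
        short token, ValueTaskSourceOnCompletedFlags flags)
        => _core.OnCompleted(continuation, state, token, flags);
}
```

The token check is what makes reuse safe: a `ValueTask<int>` left over from an earlier operation carries an old version number and fails loudly instead of observing the next operation's result.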
