
I am aware of how async/await works. I know that when execution reaches an await, it releases the thread, and after the IO completes, it fetches a thread from the thread pool and runs the remaining code. This way threads are utilized efficiently. But I am confused about some use cases:

  1. Should we use async methods for very fast IO operations, like cache read/write methods? Wouldn't they result in unnecessary context switches? If we use a sync method, execution will complete on the same thread and a context switch may not happen.
  2. Does async/await save only memory (by creating fewer threads), or does it save CPU as well? As far as I know, in the case of sync IO, the thread goes to sleep while the IO takes place. That means it does not consume CPU. Is this understanding correct?
Pragmatic
  • async/await does not necessarily create new threads all the time. There are a lot of questions about this on SO; you need to start searching. – Vignesh.N Sep 30 '16 at 15:46
  • True, it pulls a thread from the ThreadPool to execute the remaining part of the method. This is what I mentioned. – Pragmatic Sep 30 '16 at 15:48
  • Also, async/await is more about operation chaining and letting the framework manage context switching, if it is required at all. – Vignesh.N Sep 30 '16 at 15:50
  • @Pragmatic It doesn't even necessarily use the thread pool. For IO-bound tasks there may not be a major performance advantage to using separate threads just to wait for a result. – EJoshuaS - Stand with Ukraine Sep 30 '16 at 15:55
  • @EJoshuaS, It can be the same thread (not necessarily a new thread). But if there are a few more statements after the "await" keyword, those would be executed on a thread from the pool. – Pragmatic Sep 30 '16 at 16:02
  • @Pragmatic Only if there isn't a synchronization context, the synchronization context uses the thread pool, or `ConfigureAwait(false)` is used. Granted, that's *often* the case in many applications, but certainly not *always*. – Servy Sep 30 '16 at 16:04

2 Answers


I am aware of how async/await works.

You are not.

I know that when execution reaches an await, it releases the thread

It does not. When execution reaches an await, the awaitable operand is evaluated, and then it is checked to see if the operation is complete. If it is not, then the remainder of the method is signed up as the continuation of the awaitable, and a task representing the work of the current method is returned to the caller.

None of that is "releasing the thread". Rather, control returns to the caller, and the caller keeps executing on the current thread. Of course, if the current caller was the only thing on this thread, then the thread is done. But there is no requirement that an async method be the only call on a thread!
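
To make that concrete, here is a rough sketch of the shape of that check. This is an illustration only, not the state machine the compiler actually generates (the real rewrite also handles exceptions, contexts, and more):

```csharp
using System.IO;
using System.Threading.Tasks;

static class AwaitSketch
{
    // What you write:
    static async Task<int> ReadSomeAsync(Stream stream, byte[] buffer)
    {
        int n = await stream.ReadAsync(buffer, 0, buffer.Length);
        return n;
    }

    // Roughly what happens at the await site:
    static Task<int> ReadSomeExpanded(Stream stream, byte[] buffer)
    {
        var awaiter = stream.ReadAsync(buffer, 0, buffer.Length).GetAwaiter();
        if (awaiter.IsCompleted)
        {
            // Already finished: keep running synchronously on this thread;
            // no continuation is scheduled anywhere.
            return Task.FromResult(awaiter.GetResult());
        }
        // Not finished: sign up the remainder of the method as the
        // continuation, and immediately return a task representing this
        // method to the caller, which keeps executing on the current thread.
        var tcs = new TaskCompletionSource<int>();
        awaiter.OnCompleted(() => tcs.SetResult(awaiter.GetResult()));
        return tcs.Task;
    }
}
```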

after the IO completes

An awaitable need not be an IO operation, but let's suppose that it is.

it fetches a thread from the thread pool and runs the remaining code.

No. It schedules the remaining code to run on the correct context. That context might be a threadpool thread. It might be the UI thread. It might be the current thread. It might be any number of things.
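
For example, in a sketch of a UI event handler (the helper, label, and file names here are hypothetical):

```csharp
using System.IO;
using System.Threading.Tasks;

class ContextSketch
{
    static async Task OnClickAsync()
    {
        // In a UI app, the current SynchronizationContext (the UI context)
        // is captured at the await, so the continuation is posted back to
        // the UI thread.
        string text = await ReadFileAsync("data.txt");
        // label.Text = text;   // safe here: we are back on the UI thread

        // With ConfigureAwait(false) the captured context is ignored; the
        // continuation runs wherever the completion happens -- often, but
        // not necessarily, a thread pool thread.
        string more = await ReadFileAsync("more.txt").ConfigureAwait(false);
        // Do not touch UI elements here.
    }

    static async Task<string> ReadFileAsync(string path)
    {
        using (var reader = new StreamReader(path))
            return await reader.ReadToEndAsync();
    }
}
```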

Should we use async methods for very fast IO operations, like cache read/write methods?

The awaitable is evaluated. If the awaitable knows that it can complete the operation in a reasonable amount of time then it is perfectly within its rights to do the operation and return a completed task. In which case there is no penalty; you're just checking a boolean to see if the task is completed.
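
A cache is a good example of this. Here is a sketch (the cache and its backing store are hypothetical) of the fast path handing back an already-completed task:

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

class StringCache
{
    private readonly ConcurrentDictionary<string, string> _map =
        new ConcurrentDictionary<string, string>();

    // If the value is already in memory the method returns a completed
    // task, and the caller's await just checks a flag and keeps running
    // synchronously -- no continuation is scheduled, no thread switch.
    public Task<string> GetAsync(string key)
    {
        string value;
        if (_map.TryGetValue(key, out value))
            return Task.FromResult(value);  // already-completed task
        return LoadAndCacheAsync(key);      // genuinely asynchronous path
    }

    private async Task<string> LoadAndCacheAsync(string key)
    {
        string value = await LoadFromBackingStoreAsync(key);
        _map[key] = value;
        return value;
    }

    // Stand-in for the real IO; hypothetical.
    private Task<string> LoadFromBackingStoreAsync(string key) =>
        Task.FromResult("value-for-" + key);
}
```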

Wouldn't they result in unnecessary context switches?

Not necessarily.

If we use a sync method, execution will complete on the same thread and a context switch may not happen.

I am confused as to why you think a context switch happens on an IO operation. IO operations run on hardware, below the level of OS threads. There's no thread sitting there servicing IO tasks.

Does async/await save only memory (by creating fewer threads)

The purpose of await is to (1) make more efficient use of expensive worker threads by allowing workflows to become more asynchronous, and thereby freeing up threads to do work while waiting for high-latency results, and (2) to make the source code for asynchronous workflows resemble the source code for synchronous workflows.
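
Compare the two shapes in this minimal sketch; the workflows read the same, but the asynchronous one does not hold a thread hostage during the read:

```csharp
using System.IO;
using System.Threading.Tasks;

class WorkflowSketch
{
    // Synchronous workflow: the calling thread is blocked during the read.
    static string LoadSync(string path)
    {
        using (var reader = new StreamReader(path))
            return reader.ReadToEnd();
    }

    // Asynchronous workflow: same shape as the code above, but the thread
    // is free to do other work while the read is in flight.
    static async Task<string> LoadAsync(string path)
    {
        using (var reader = new StreamReader(path))
            return await reader.ReadToEndAsync();
    }
}
```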

As far as I know, in the case of sync IO, the thread goes to sleep while the IO takes place. That means it does not consume CPU. Is this understanding correct?

Sure, but you have this completely backwards. YOU WANT TO CONSUME CPU. You want to be consuming as much CPU as possible, all the time! The CPU is doing work on behalf of the user, and if it is idle then it's not getting its work done as fast as it could. Don't hire a worker and then pay them to sleep! Hire a worker, and as soon as they are blocked on a high-latency task, put them to work doing something else so the CPU stays as hot as possible all the time. The owner of that machine paid good money for that CPU; it should be running at 100% all the time that there is work to be done!

So let's come back to your fundamental question:

Does async/await increase context switching?

I know a great way to find out. Write a program using await, write another one without, run them both, and measure the number of context switches per second. Then you'll know.
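
One possible harness for that measurement (a sketch; the `System` / `Context Switches/sec` performance counter is Windows-only and machine-wide, so run it on an otherwise idle box, and the file name is hypothetical):

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Threading.Tasks;

class MeasureSketch
{
    static async Task Main()
    {
        // Machine-wide counter; the first sample is always zero, so
        // prime it before running the workload.
        using (var switches = new PerformanceCounter("System", "Context Switches/sec"))
        {
            switches.NextValue();

            var sw = Stopwatch.StartNew();
            for (int i = 0; i < 1000; i++)
            {
                // Swap in File.ReadAllText for the synchronous variant
                // and compare the two runs.
                using (var reader = new StreamReader("data.txt"))
                    await reader.ReadToEndAsync();
            }
            sw.Stop();

            Console.WriteLine(
                $"{sw.ElapsedMilliseconds} ms, ~{switches.NextValue():F0} context switches/sec");
        }
    }
}
```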

But I don't see why context switches per second is a relevant metric. Let's consider two banks with lots of customers and lots of employees. At bank #1 the employees work on one task until it is complete; they never switch context. If an employee is blocked waiting for a result from another, they go to sleep. At bank #2, employees switch from one task to another when they are blocked, and are constantly servicing customer requests. Which bank do you think has faster customer service?

Eric Lippert
  • Thanks Eric for your answer! So we can say that we should always use the async API (if available), no matter whether it runs very fast or takes time. I need clarification on one point: if there are 100 concurrent threads running (on a 4-core machine), will the CPU switch from one thread to another when thread one starts IO? Or is this purely based on time slicing, with every thread getting its equal share of CPU (no matter whether it is CPU bound or IO bound)? Assuming a sync IO API is used in this case. – Pragmatic Sep 30 '16 at 17:56
  • So is CPU usage reduced? Unclear from this answer, but this was the question. "Just measure it" is not an answer since a) it's not possible to measure all relevant scenarios and b) it does not provide any insight. – usr Sep 30 '16 at 18:02
  • @usr: "just measure it" is indeed an answer and is by far the *most* useful answer. If you have some horses and you wish to know which is the faster, asking a bunch of strangers on the internet who have never seen the horses is far less useful than racing the horses. – Eric Lippert Sep 30 '16 at 18:03
  • @Pragmatic: You should read "there is no thread" by Stephen Cleary. It will help you understand what is really going on when you start an IO operation. The key takeaway is: there is no such thing as a synchronous IO operation. It's all asynchronous; if it looks synchronous, someone is building a synchronous mechanism on top of an asynchronous one. If you need the details of that mechanism, examine that mechanism. – Eric Lippert Sep 30 '16 at 18:05
  • I provided two arguments for why I disagree with that stance. I feel your reply should address them. You can't race all possible horses. You have to find out which horse race has what properties so that you can make predictions based on that understanding. – usr Sep 30 '16 at 18:05
  • @usr: And if it is truly *impossible* to measure *relevant scenarios* then by your supposition the question is impossible to answer correctly. Suppose I said "well, performance is improved in all situations, but in most of them you can't measure it to see whether performance is actually improved." Does that make any sense? Is a performance improvement that cannot be measured actually an improvement? How could you distinguish it from a regression? – Eric Lippert Sep 30 '16 at 18:07
  • If you claimed that performance was generally improved or equal in most cases (which I think you did not) I would not have made my comment. I would have disagreed with this opinion but at least there would have been a tangible statement for what can be expected. – usr Sep 30 '16 at 18:09
  • @usr: The only blanket statements that I'm making regarding performance are (1) leaving threads idle when they could be doing work degrades performance; "await" is intended to provide a tool for mitigating this problem, and (2) attaining high performance is an engineering discipline that requires empirical observations of real programs run under controlled conditions. – Eric Lippert Sep 30 '16 at 18:13
  • "it is perfectly within its rights to do the operation and return a completed task. In which case there is no penalty; you're just checking a boolean to see if the task is completed." I have to disagree with that. As soon as you mark a method as async, the compiler has to rewrite the method just to be on the safe side, even if all the tasks that are awaited are already finished; that transformation is far from free. The returned task requires a heap allocation, which is also not free. Then there's the issue of blocks in leaf methods, which cause multiple allocations, and so on. – Voo Sep 30 '16 at 21:13
  • Now all those things are relatively minor, but they do add up in some situations. The Midori team didn't redesign large parts of the C# async infrastructure because of not-invented-here syndrome, but because there was a real need. – Voo Sep 30 '16 at 21:15
  • @Voo: You (and of course usr) are right that there is some overhead in terms of more objects being created, more code running, and so on. (Whether that cost is significant to the user of the product requires consultation with the user and careful testing, of course.) I was trying to draw a more narrow conclusion: that awaiting an already-completed task does *not* introduce a point at which the continuation is scheduled to an arbitrarily distant time in the future. The continuation runs synchronously in this case. – Eric Lippert Sep 30 '16 at 21:17
  • @Eric Fair point, on which I'm sure there's no contention among people who know the infrastructure. I'm doing some performance-sensitive work these days and just had to give up some beliefs I held for a very long time, among them "short-lived small allocations are basically free" and "async is always better if there's blocking involved". Both of those are very widespread, so I'm trying to show the limits of these assumptions a bit in questions such as these. – Voo Sep 30 '16 at 21:22
  • @Voo: Indeed, stuff is cheap *right up until it isn't*. As you note correctly, the Midori people were up against situations where the costs of things I routinely ignore -- a single allocation, a single context switch, a single uncontended lock -- were actually the gating factor in their applications. When a hundred nanoseconds of fussing around with a monitor is the *most expensive thing*, you've got a pretty fast program! – Eric Lippert Sep 30 '16 at 21:41
  • @EricLippert If possible please add this video link by Jeffrey Ritcher about the reason for async await - https://www.youtube.com/watch?v=hB0K1JWFoqs This might also help many people in understanding async await usage – Rudresha Parameshappa Apr 29 '18 at 22:12
  • Hi, I have a question about thread releasing. The official documentation (https://learn.microsoft.com/en-us/dotnet/standard/async-in-depth) says: "The server with async code running on it still queues up the 6th request, but because it uses async and await, each of its threads are freed up when the I/O-bound work starts, rather than when it finishes. By the time the 20th request comes in, the queue for incoming requests will be far smaller (if it has anything in it at all), and the server won't slow down." What's the difference between its "freed up" and "released"? – user3003238 Jun 21 '18 at 05:58
  • @user3003238: This is a question and answer site; if you have a question, then post it as a question! But I note that I spent a fair amount of this answer saying that there's no such thing as "thread releasing". What the original poster meant by "releasing" a thread is simply *control returns to the caller*. When you have method `bar()` that calls `foo()` and foo returns, has foo "released the thread back to bar"? – Eric Lippert Jun 21 '18 at 18:44
  • @user3003238: Now, if the caller is the thread pool, then in that sense a thread is "freed up" or "released", since the thread goes back in the pool and the continuation of the task is scheduled onto a new thread out of the pool. – Eric Lippert Jun 21 '18 at 18:47

Should we use async methods for very fast IO operations, like cache read/write methods?

Such an IO would not block in the classical sense. "Blocking" is a loosely defined term. Normally it means that the CPU must wait for the hardware.

This type of IO is purely CPU work and there are no context switches. This would typically happen if the app reads a file or socket slower than data can be provided. Here, async IO does not help performance at all. I'm not even sure it would be suitable to unblock the UI thread since all tasks might complete synchronously.

Or does it save CPU as well?

It generally increases CPU usage in real-world loads. This is because the async machinery adds processing, allocations and synchronization. Also, we need to transition to kernel mode twice instead of once (first to initiate the IO, then to dequeue the IO completion notification).

Typical workloads run with <<100% CPU. A production server with >60% CPU would worry me since there is no margin for error. In such cases the thread pool work queues are almost always empty. Therefore, there are no context switching savings caused by processing multiple IO completions on one context switch.

That's why CPU usage generally increases (slightly), except if the machine is very high on CPU load and the work queues are often capable of delivering a new item immediately.

On the server async IO is mainly useful for saving threads. If you have ample threads available you will realize zero or negative gains. In particular any single IO will not become one bit faster.

That means it does not consume CPU.

It would be a waste to leave the CPU unavailable while an IO is in progress. To the kernel an IO is just a data structure. While it's in progress there is no CPU work to be done.

An anonymous person said:

For IO-bound tasks there may not be a major performance advantage to using separate threads just to wait for a result.

Pushing the same work to a different thread certainly does not help with throughput. This is added work, not reduced work. It's a shell game. (And async IO does not use a thread while it's running so all of this is based on a false assumption.)

A simple way to convince yourself that async IO generally costs more CPU than sync IO is to run a simple TCP ping/pong benchmark sync and async. Sync is faster. This is kind of an artificial load so it's just a hint at what's going on and not a comprehensive measurement.
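
Here is a minimal sketch of such a benchmark: one byte bounces over loopback for a fixed number of round trips; swap the awaited calls for the blocking `Read`/`Write` pair in the client loop to get the sync variant:

```csharp
using System;
using System.Diagnostics;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

class PingPongBenchmark
{
    const int Rounds = 100000;

    static async Task Main()
    {
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        int port = ((IPEndPoint)listener.LocalEndpoint).Port;

        // Echo server: reads one byte, writes it back, Rounds times.
        Task serverTask = Task.Run(async () =>
        {
            using (var server = await listener.AcceptTcpClientAsync())
            {
                var stream = server.GetStream();
                var buf = new byte[1];
                for (int i = 0; i < Rounds; i++)
                {
                    await stream.ReadAsync(buf, 0, 1);
                    await stream.WriteAsync(buf, 0, 1);
                }
            }
        });

        using (var client = new TcpClient())
        {
            await client.ConnectAsync(IPAddress.Loopback, port);
            var stream = client.GetStream();
            var buf = new byte[1];

            var sw = Stopwatch.StartNew();
            for (int i = 0; i < Rounds; i++)
            {
                // Sync variant for comparison:
                //   stream.Write(buf, 0, 1); stream.Read(buf, 0, 1);
                await stream.WriteAsync(buf, 0, 1);
                await stream.ReadAsync(buf, 0, 1);
            }
            sw.Stop();

            Console.WriteLine($"{Rounds} round trips in {sw.ElapsedMilliseconds} ms");
        }
        await serverTask;
        listener.Stop();
    }
}
```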

usr
  • Thanks usr for the answer. What do you mean by "Therefore, there are no context switching savings caused by processing multiple IO completions on one context switch"? – Pragmatic Sep 30 '16 at 17:43
  • Context switches can be saved if multiple IOs are initiated without blocking and later multiple IOs are completed without blocking. Then, you need 2 context switches in total for *many* IOs. That's what saves switching. But if IOs are infrequent, which I claim is the case quite often, then it's not possible to coalesce the initiation and completion work into one time quantum. The code will have to block and switch away until new work arrives. – usr Sep 30 '16 at 17:47
  • Thanks, that makes sense! – Pragmatic Sep 30 '16 at 18:19