
While using async and await, I sometimes come to a spot where it bugs me to use it because I sense it's pointless. I haven't been able to prove that this is the case (and, admittedly, keeping it doesn't hurt performance). How can I validate (or reject) my claim in the following example?

bool empty = !await Context.Stuff.AnyAsync();
if (empty)
  throw new Exception();

My claim is this: since we use the result of the check immediately to decide whether to leave the method, the call has to complete before anything else can happen, so it is effectively executed synchronously. Hence, I believe the following performs no worse.

bool empty = !Context.Stuff.Any();
if (empty)
  throw new Exception();

How can I verify my claim (other than empirically)?

Konrad Viltersten
  • Wouldn't performance depend on the implementation of `AnyAsync` and `Any`, and the state machine caused by `async` in the containing method, and the current thread usage, and a bunch of other things? – gunr2171 May 21 '22 at 04:15
  • Well, this all depends on what is happening under the hood in the `Any` call itself. Is it a DB call or an API call, and do you need it to complete before moving on? Async calls are generally meant to WAIT for something to finish or to return something that might have some kind of side effect, when you don't want your code to just not deal with that side effect. So, to your question: it makes no sense to use async if there is no need for it, as the next dev would have to figure out why the await is there, especially if it does the same thing as the non-awaited version. – David B May 21 '22 at 04:16
  • @gunr2171 It would, of course. What I'm asking is whether it's obviously inferrable that **in this particular code**, such differences are irrelevant to the impact of the performance (since we're using the awaited value immediately and can't optimize). I'm comparing it to using `IEnumerable` immediately followed by e.g. `Count()`, which will iterate through the whole shabang anyway. In that case, we may go `T[]` right away with no deteriorated performance. What's your thought on that? – Konrad Viltersten May 21 '22 at 04:28
  • @DavidB So, to verify your point - in **this particular case**, when I know for sure that I will immediately throw an exception for an empty set (hence being forced to evaluate it immediately, not awaitedly), it makes no sense to use an asynchronous call. Correct? – Konrad Viltersten May 21 '22 at 04:30
  • Presumably the immediate LINQ `Any()` function might take a long time to evaluate, locking up your UI while it is happening. The point of awaiting `AnyAsync()` is to keep the UI responsive while waiting for the result. What you do afterwards is irrelevant. – Joe May 21 '22 at 04:40
  • https://learn.microsoft.com/en-us/windows/uwp/cpp-and-winrt-apis/concurrency#asynchronous-operations-and-windows-runtime-async-functions "Any Windows Runtime API that has the potential to take more than 50 milliseconds to complete is implemented as an asynchronous function (with a name ending in "Async")" If you follow that rule of thumb, there shouldn't be any confusion. Async or not, tell us how long that function takes to return a result. – Lex Li May 21 '22 at 04:49
  • @KonradViltersten - so the safer bet is the await. For example, if you need the integrity to be very reliable, then await it. I've also had a situation where data was being cached from a stream and we didn't care whether we caught it in the current loop cycle, so I skipped the await altogether. That was a listener for a video stream that looked for the video time on a network node that sent data to an in-memory DB context; not stopping the application flow was most important. Does it matter in your case? I would think it would. So this is becoming opinionated :) – David B May 21 '22 at 05:02
  • @LexLi The operation `Any()` can take more than the stated threshold, for sure. So, as you suggest, we should use the asynchronous version. Got it. Now, the question is whether **in this particular case** (i.e. when we'll need to evaluate the call to `Any()` anyway, to be able to determine the value of the boolean that controls the execution flow, throwing an exception or not) it actually makes a difference. NB. The question is not *should it be used* but *how to prove/refute whether it matters*. – Konrad Viltersten May 21 '22 at 05:15
  • @Joe There's no GUI. Just a simple API. I understand that using asynchronous version will prevent locking the process (GUI or not doesn't matter, of course). What I don't see is how to prove or refute that it actually affects anything in this particular case. I mean - if we await the call for `Any()`, we still need to "lock" the execution flow until the evaluation is finished so we know if we are up for throws or returns etc. What am I missing? – Konrad Viltersten May 21 '22 at 05:19
  • _"since we're using the result of the check immediately"_ is totally irrelevant for how `AnyAsync()` operates. It does not affect the "why use async" either. – H H May 21 '22 at 06:43

2 Answers


I agree with all the comments; it's not about what you do with the result and when, it's about what the thread that was executing your code is allowed to go off and do elsewhere while the async operation is in progress. If Stuff is a complex view in the DB based on a query that takes 5 minutes to run, then Any will block your thread for 5 minutes. AnyAsync could let that thread serve tens of thousands of requests to your webserver in that time. If you've blocked one thread, the webserver will have to spin up another to serve the other people, and threads are expensive.

Async isn't about "better performance" in the sense of "make it async and it runs faster" - the code executes at the same rate. Async is about "better use of resources" - you need fewer threads, and they're busier, spending less time sitting around doing nothing while waiting for e.g. IO to complete.

If it were an office, it's analogous to making a coffee while you're on hold on the phone. Imagine you get put on hold by the gas company and your boss shouts that he wants a coffee. If you're async, you'll put the call on speaker, get up while you're on hold and make the coffee, waiting to be called back by the sound of the hold music stopping and the gas company saying "hello". If you're sync, you'll sit there ignoring the boss' request while someone else makes the coffee (which means the boss has to employ someone else). It's more expensive to have you sitting around doing nothing, just waiting, and to have to hire someone else, than to have you reach a certain point with job x and then go do something else. If you're async, you'll go and refill the printer while you're waiting for the kettle to boil. If you're sync on hold and the office junior is sync waiting for the kettle to boil, the boss will have to employ yet another person to fill the printer.

Whether it's you or someone else that picks up the call to the gas company when they finally take you off hold depends on whether you're done making the coffee and available, and/or whether you've ConfigureAwait'd to indicate that it has to be you that picks up the call (true) or that anyone in the office can continue it (false).
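To make the "thread goes off and does something else" point concrete, here is a minimal sketch, not taken from the original code. `SlowAnyAsync` is a hypothetical stand-in for `Context.Stuff.AnyAsync()` that simulates 200 ms of IO with `Task.Delay`, and `Log` exists only so the ordering can be inspected. It illustrates that even though the result is used immediately after the `await`, control returns to the caller while the operation is pending.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AwaitReleasesThread
{
    // Records the order in which things happen (illustration only).
    public static readonly List<string> Log = new List<string>();

    // Hypothetical stand-in for Context.Stuff.AnyAsync(): pretends the
    // database needs 200 ms and then reports that no rows exist.
    static async Task<bool> SlowAnyAsync()
    {
        await Task.Delay(200);
        return false;
    }

    // Mirrors the question: await the check, then throw if the set is empty.
    public static async Task EnsureNotEmptyAsync()
    {
        Log.Add("check started");
        bool any = await SlowAnyAsync(); // method is suspended here; the thread is handed back to the caller
        Log.Add("check finished");
        if (!any)
            throw new InvalidOperationException("Stuff is empty.");
    }

    public static async Task RunAsync()
    {
        // Runs synchronously only up to the first incomplete await, then returns.
        Task check = EnsureNotEmptyAsync();
        Log.Add("caller is free"); // executes while the check is still pending
        try
        {
            await check;
        }
        catch (InvalidOperationException)
        {
            Log.Add("empty set detected");
        }
    }

    public static async Task Main()
    {
        await RunAsync();
        Console.WriteLine(string.Join(" -> ", Log));
    }
}
```

In a web server the "caller" is ultimately the request pipeline, so the freed thread goes back to the pool and can pick up another request instead of idling.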

comments: I'm comparing it to using IEnumerable immediately followed by e.g. Count(), which will iterate through the whole shabang anyway. In that case, we may go T[] right away with no deteriorated performance. What's your thought on that?

It depends on what else you will do with the result. If you need to repeatedly ask your result for its length and access it randomly, then sure, use ToArrayAsync to turn it into an array and then do all your work with it as locally cached data (unless the query result is two terabytes big).

If you literally only need the count once, then it doesn't make sense to spend all that memory allocating an array just to get its length; just use CountAsync.

Neither of these seems entirely relevant to the question of "Async or no?" - if your IEnumerable is coming over a slow network and is some huge slow query, it still goes back to "let the thread go off and keep busy doing something else so you don't have to spin up more threads". Note that "slow" here could mean even tens of milliseconds. We don't have to be talking about minute-long operations to see a benefit from async.

For very fast operations, sure, you can do them sync to save the minuscule cost of setting up the state machine, but be certain of the tipping point between the cost of setting up the state machine (so the thread can do something else) and the cost of making the thread wait for that amount of time; the machine costs very little. Faced with the choice, I'd generally choose async if available, especially if any IO is involved.

how to prove/refute whether it matters.

You'll have to race the horses for every case: how quickly does the op complete sync, and how long does it take to do the async state management? It'd probably be quite wearisome to do for an entire codebase, which is why I tend to proceed on an "if async is available and isn't just available for async's sake, then probably someone has reasoned that using async is sensible, so we should use it" basis. Async spreading all the way up through a codebase is perhaps a good thing if you use its presence in a library as an indicator that you should leverage it in your code (which then indicates to users of your code that they should, too).
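As a rough illustration of what "racing the horses" could look like, here is a crude `Stopwatch` sketch. Everything in it is an assumption made for illustration: `Stuff` is an in-memory array rather than a DbContext set, and the local `AnyAsync` returns an already-completed task, so the loop measures only the overhead of the async machinery, not real IO. For numbers you would act on, a proper benchmarking tool such as BenchmarkDotNet would be a better choice than a hand-rolled loop.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class RaceTheHorses
{
    // Hypothetical in-memory stand-in for Context.Stuff; no database involved.
    static readonly int[] Stuff = { 1, 2, 3 };

    static bool AnySync() => Stuff.Length > 0;

    // Returns an already-completed task, so awaiting it costs only the
    // async bookkeeping and never actually suspends.
    static Task<bool> AnyAsync() => Task.FromResult(Stuff.Length > 0);

    public static (TimeSpan Elapsed, int Hits) TimeSync(int iterations)
    {
        int hits = 0;
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            if (AnySync()) hits++;
        sw.Stop();
        return (sw.Elapsed, hits); // hits is returned so the loop has an observable result
    }

    public static async Task<(TimeSpan Elapsed, int Hits)> TimeAsync(int iterations)
    {
        int hits = 0;
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            if (await AnyAsync()) hits++;
        sw.Stop();
        return (sw.Elapsed, hits);
    }

    public static async Task Main()
    {
        const int n = 1_000_000;
        // Warm-up pass so JIT compilation is not included in the measurement.
        TimeSync(1_000);
        await TimeAsync(1_000);

        var syncResult = TimeSync(n);
        var asyncResult = await TimeAsync(n);
        Console.WriteLine($"sync:  {syncResult.Elapsed.TotalMilliseconds:F2} ms");
        Console.WriteLine($"async: {asyncResult.Elapsed.TotalMilliseconds:F2} ms");
    }
}
```

Dividing the difference by the iteration count gives a per-call overhead to compare against the latency of the real operation; once any IO is involved, that latency is usually far larger.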

Caius Jard
  • Awesome. I think I got it now. The answer to my question is in short - *no, it does not matter to use async but...*, I guess. By that I mean that **given certain conditions**, the stuff won't be more efficient. However (and that's the point of your great answer), the set of conditions is rare and hard to establish in practice, so the answer is *no, it does not matter in a case that we can not determine if we have, hence it does matter due to uncertainty*. In short - it matters. :) – Konrad Viltersten May 21 '22 at 05:26
  • Yes, I'd say a reasonable assumption is "if it's available, it's been made available for a reason, so use it" - and then if there is something weird like "it's actively causing a performance issue in this real-time video streaming blah blah" and it becomes obvious, then the particular use case can be looked at with a microscope. But it's generally better for human developer performance to just assume to use it, and then they're free to go off and do other coding rather than agonize - it helps us develop asynchronously rather than get blocked worrying. – Caius Jard May 21 '22 at 05:31
  • I entirely and wholeheartedly agree. I meant in no way to question that. The reason for my asking is that we had a discussion in the team, and while everybody agreed that we should await, nobody could explain **for this specific case** why, or what the particular gain was. An experienced dev knows how to prevent problems by proper conduct even if they can't see precisely why. Like I don't see the bacteria but still wash my hands after visiting the toilet. :) – Konrad Viltersten May 21 '22 at 05:47

Hence, the following has no worse performance, I believe.

How can I verify my claim (other than empirically)?

There is no other way to verify a claim, other than empirically. Anything else is just words. You have to do an experiment and see the difference with your own eyes, or see a screenshot with the results of an experiment that was conducted by someone else. At the end of the day in order to verify something, an experiment has to be made by someone.

My guess is that if you do the experiment, you'll find that the synchronous Context.Stuff.Any() has equal or better performance than the asynchronous await Context.Stuff.AnyAsync(). If it's better, the difference might be significant. Asynchronous APIs have been shown to be slower than synchronous APIs on more than one occasion. Personally, I am not aware of any API that has both a synchronous and an asynchronous version where the asynchronous one is faster.

You haven't asked which version is more scalable though, so you might not be interested in this aspect of the equation. In case you are interested, conducting an experiment that compares the scalability of the two options is much more involved. You can't just use a Stopwatch, and measure the duration of a single operation. You'll have to launch a large number of operations concurrently, and observe how the system behaves as a whole. You could obtain metrics like CPU utilization, memory consumption, throughput etc. My expectation is that under heavy load the asynchronous version should give better metrics than the synchronous, and the difference might be substantial.

For what it's worth, you can see here a somewhat silly experiment of mine, which shows that the asynchronous await Task.Delay() is vastly more scalable than the synchronous Thread.Sleep(). The latter requires one thread per operation. The former requires a handful of threads for 100,000 operations.
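The linked experiment is not reproduced in this text. The following is only a rough sketch of a comparison in the same spirit, with operation counts and delays chosen arbitrarily for illustration. It assumes .NET Core 3.0 or later, because it reads `ThreadPool.ThreadCount`. The blocking run uses far fewer operations on purpose: each one occupies a pool thread for the whole delay, and the pool adds threads slowly.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ScalabilityDemo
{
    // Starts `count` timer-based waits; no thread is held while they are pending.
    public static async Task<TimeSpan> RunAsyncWaits(int count, int delayMs)
    {
        var sw = Stopwatch.StartNew();
        Task[] tasks = Enumerable.Range(0, count)
            .Select(_ => Task.Delay(delayMs))
            .ToArray();
        await Task.WhenAll(tasks);
        return sw.Elapsed;
    }

    // Starts `count` blocking waits; each one ties up a thread pool thread
    // for the full duration of the sleep.
    public static async Task<TimeSpan> RunBlockingWaits(int count, int delayMs)
    {
        var sw = Stopwatch.StartNew();
        Task[] tasks = Enumerable.Range(0, count)
            .Select(_ => Task.Run(() => Thread.Sleep(delayMs)))
            .ToArray();
        await Task.WhenAll(tasks);
        return sw.Elapsed;
    }

    public static async Task Main()
    {
        // Illustrative sizes only: 100,000 async waits versus 200 blocking waits.
        TimeSpan asyncTime = await RunAsyncWaits(100_000, 1000);
        Console.WriteLine($"async:    {asyncTime.TotalSeconds:F1} s, pool threads: {ThreadPool.ThreadCount}");

        TimeSpan blockingTime = await RunBlockingWaits(200, 1000);
        Console.WriteLine($"blocking: {blockingTime.TotalSeconds:F1} s, pool threads: {ThreadPool.ThreadCount}");
    }
}
```

Comparing the elapsed times and the pool thread counts of the two runs, ideally alongside CPU and memory metrics, is one way to observe the scalability difference described above; the exact numbers depend on the machine and runtime.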

Theodor Zoulias