
When I write asynchronous code with async/await, usually with ConfigureAwait(false) to avoid capturing the context, my code is jumping from one thread-pool thread to the next after each await. This raises concerns about thread safety. Is this code safe?

static async Task Main()
{
    int count = 0;
    for (int i = 0; i < 1_000_000; i++)
    {
        Interlocked.Increment(ref count);
        await Task.Yield();
    }
    Console.WriteLine(count == 1_000_000 ? "OK" : "Error");
}

The variable `i` is unprotected, and is accessed by multiple thread-pool threads*. Although the pattern of access is non-concurrent, it should be theoretically possible for each thread to increment a locally cached value of `i`, resulting in more than 1,000,000 iterations. I am unable to reproduce this scenario in practice though. The code above always prints OK on my machine. Does this mean that the code is thread-safe? Or should I synchronize access to the `i` variable using a lock?

(* one thread switch occurs every 2 iterations on average, according to my tests)
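The switch frequency mentioned in the footnote can be measured directly by comparing `Environment.CurrentManagedThreadId` before and after each `await` (a small sketch, not part of the original question; the exact count varies from run to run):

```csharp
using System;
using System.Threading.Tasks;

class ObserveSwitches
{
    // Counts how often the continuation after `await Task.Yield()` resumes
    // on a different thread-pool thread than the one that started the await.
    public static async Task<int> CountSwitchesAsync(int iterations)
    {
        int switches = 0;
        int previousId = Environment.CurrentManagedThreadId;
        for (int i = 0; i < iterations; i++)
        {
            await Task.Yield();
            int currentId = Environment.CurrentManagedThreadId;
            if (currentId != previousId) switches++;
            previousId = currentId;
        }
        return switches;
    }

    static async Task Main()
    {
        int switches = await CountSwitchesAsync(1_000);
        Console.WriteLine($"{switches} thread switches in 1,000 iterations");
    }
}
```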

Theodor Zoulias
  • Why do you think `i` is cached in each thread? See [this SharpLab IL](https://sharplab.io/#v2:CYLg1APgAgTAjAWAFBQAwAIpwHQBUAWATgKYCGwAlgHYDmA3MmpnAKwMoDMmM6AwugG9k6EcwBsmAByYJAWVLUAFAEphooUlFb01AC7oAxgHsArlX0BedKnbbRAMyOF0ivTvRWb7gDzo4AfVQgwKC6HTAwVU07QTUY0QBJc2JCABsjAwBrYmBsJIMSAFtic0USe0NTc2VbeNEoAE4ZbABNCmJU4BVauwBfOO0sBsVjM0srAKDUEIwAfnQAIgB5AGkF9BBFgFFCQicFmoH+pF6gA=) to dig deeper. – AndreasHassing Oct 07 '19 at 06:33
  • @AndreasHassing My concerns are raised by statements like this: *The compiler, CLR, or CPU may introduce caching optimizations such that assignments to variables won't be visible to other threads right away.* [Part 4: Advanced Threading](http://www.albahari.com/threading/part4.aspx) – Theodor Zoulias Oct 07 '19 at 06:48

2 Answers


The problem of thread safety is about reading and writing memory concurrently. Even though execution may continue on a different thread after each await, nothing here is executed concurrently.

Jeroen van Langen
  • A thread could theoretically read from and write to a local cache instead of the main RAM, thereby missing an update made by another thread. The variable `i` is neither declared [`volatile`](https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/volatile) nor protected by a lock, so from my understanding the compiler, the jitter and the hardware (CPU) are all permitted to make an optimization like this. – Theodor Zoulias Oct 07 '19 at 09:18
  • @TheodorZoulias Swapping out a thread to resume a continuation is not the same as concurrent access. In the SharpLab IL linked above, you can see that the entire state machine, which encapsulates the locals in private fields, gets passed to the thread that will execute the continuation. Only 1 thread is accessing `i` at any given time. – JohanP Oct 07 '19 at 22:58
  • @JohanP the field `private int <i>5__2` in the state machine is not declared `volatile`. My concerns are not about a thread interrupting another thread which is in the middle of updating `i`. This is impossible to happen in this case. My concerns are about a thread using a stale value of `i`, cached in the local cache of the core of the CPU, left there from a previous loop, instead of fetching a fresh value of `i` from the main RAM. Accessing the local cache is cheaper than accessing the main RAM, so with optimizations ON such things are possible (according to what I've read). – Theodor Zoulias Oct 08 '19 at 00:08
  • @TheodorZoulias do you have the same concern if this loop had no `async` code in there? – JohanP Oct 08 '19 at 00:11
  • @JohanP if I remove the `await Task.Yield()`, the code will run in a single thread, so there is no concern in this case. – Theodor Zoulias Oct 08 '19 at 00:24
  • @TheodorZoulias With `await Task.Yield()` your code is still single threaded. – JohanP Oct 08 '19 at 00:30
  • @JohanP that depends on the definition of "single-threaded". The definition is not important though; what matters is the thread safety of my code. Is it guaranteed that it will print the expected output (OK) when run on any CPU architecture with optimizations ON? From what I've read, x86/x64 systems have strong memory models, while systems based on ARM or Itanium have weaker memory models. I can test it only on my own PC though (x64). – Theodor Zoulias Oct 08 '19 at 00:44
  • @TheodorZoulias You shouldn't be concerned about the thread safety of your code posted above because it is single threaded. The same way you are not worried about the code without `await Task.Yield()`. If the memory model is weak, then whether you have `await` or not, both will have the same outcome. – JohanP Oct 08 '19 at 00:54
  • @JohanP the `await` causes thread switches. Only one thread runs my code at a time, but not the same thread from start to end. To guarantee the thread safety of my code under these conditions, currently I see no other option than protecting the shared variables with locks ([example](https://stackoverflow.com/questions/56518305/how-to-use-c8-iasyncenumerablet-to-async-enumerate-tasks-run-in-parallel/58242855#58242855)). Until I learn better of course (which is the reason I opened this question). – Theodor Zoulias Oct 08 '19 at 01:43
  • @TheodorZoulias Thread A runs, increments `i`. Code hits `await`, Thread A passes all the state to Thread B and goes back into the pool. Thread B increments `i`. Hits `await`. Thread B then passes all of the state to Thread C, it goes back into the pool etc etc. At no point is there concurrent access to `i`, there is no thread safety needed, it does not matter that a thread switch occurred, all of the state needed is passed into the new thread running the continuation. There is no shared state, that's why you don't need synchronization. – JohanP Oct 08 '19 at 02:02
  • @JohanP you could write this as an answer. :-) – Theodor Zoulias Oct 08 '19 at 02:06
  • @JohanP Yes, _".....nothing here is executed concurrent."_ – Jeroen van Langen Oct 08 '19 at 07:05
  • Yeap, *"...the pattern of access is non-concurrent..."* :-) – Theodor Zoulias Oct 08 '19 at 12:15
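The hand-off described in the comments above can be made concrete by writing the loop without async/await, passing the whole state object to each continuation explicitly. This is a simplified, hand-written sketch of what the compiler's state machine does (the class and field names here are illustrative; the real generated code stores the loop variable in a compiler-named field of a struct):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class ManualStateMachine
{
    // The "state machine": the loop variable lives in a field, and each
    // queued continuation receives the whole object. Threads access it
    // strictly one after the other, never concurrently.
    private int _i;
    private readonly TaskCompletionSource<int> _done = new TaskCompletionSource<int>();

    public Task<int> RunAsync()
    {
        MoveNext();
        return _done.Task;
    }

    private void MoveNext()
    {
        if (_i < 1_000)
        {
            _i++; // increment, then schedule the continuation on the pool
            ThreadPool.QueueUserWorkItem(_ => MoveNext());
        }
        else
        {
            _done.SetResult(_i);
        }
    }

    static async Task Main()
    {
        int result = await new ManualStateMachine().RunAsync();
        Console.WriteLine(result); // 1000
    }
}
```

Queuing a work item to the thread pool publishes the writes made before the queueing to the thread that dequeues it, which is why the hand-off is safe without explicit fences.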

I believe this article by Stephen Toub can shed some light on this. In particular, this is a relevant passage about what happens during a context switch:

Whenever code awaits an awaitable whose awaiter says it’s not yet complete (i.e. the awaiter’s IsCompleted returns false), the method needs to suspend, and it’ll resume via a continuation off of the awaiter. This is one of those asynchronous points I referred to earlier, and thus, ExecutionContext needs to flow from the code issuing the await through to the continuation delegate’s execution. That’s handled automatically by the Framework. When the async method is about to suspend, the infrastructure captures an ExecutionContext. The delegate that gets passed to the awaiter has a reference to this ExecutionContext instance and will use it when resuming the method. This is what enables the important “ambient” information represented by ExecutionContext to flow across awaits.

Worth noting that the awaiter of the `YieldAwaitable` returned by `Task.Yield()` always returns false from its `IsCompleted` property, so the continuation after `await Task.Yield()` is always scheduled rather than executed synchronously.
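That `IsCompleted` behavior is easy to verify directly, since the awaiter is obtainable via `GetAwaiter()` (a quick check, not from the original answer):

```csharp
using System;
using System.Threading.Tasks;

class YieldAwaiterCheck
{
    static void Main()
    {
        // YieldAwaitable.YieldAwaiter.IsCompleted is hardcoded to return
        // false, so `await Task.Yield()` always suspends and schedules
        // its continuation instead of continuing synchronously.
        var awaiter = Task.Yield().GetAwaiter();
        Console.WriteLine(awaiter.IsCompleted); // False
    }
}
```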

Daniel Crha
  • Thanks Daniel for the answer. To be honest I would be surprised if the flowing of the [`ExecutionContext`](https://learn.microsoft.com/en-us/dotnet/api/system.threading.executioncontext) from thread to thread served also as a mechanism for invalidating the thread's local caches. But it's not impossible either. – Theodor Zoulias Oct 17 '19 at 23:01
  • Maybe an expert like @RaymondChen could assert if your answer is right or wrong. I believe that very few people in the world can serve as credible sources of information about this issue. – Theodor Zoulias Oct 17 '19 at 23:08
  • "Invalidating the thread's local caches" would imply that when a thread performs a context switch, it somehow also maintains a cache which is specific to this one context. That would mean this cached data has to be stored in something that resembles a context... but why, when the real context is available to the thread which will have to execute it? It would also bring about the problem of determining which two contexts are the "same", but just representing a later point in execution. Of course I don't claim to be an expert, just trying to reason about the problem as a mental exercise. – Daniel Crha Oct 17 '19 at 23:49
  • Also, in case I'm wrong, I might invoke Cunningham's law: "The best way to get the right answer on the Internet is not to ask a question, it's to post the wrong answer." – Daniel Crha Oct 17 '19 at 23:50
  • I am talking about hardware caches. If you want, take a look at these: [Volatile keyword in C# – memory model explained](https://igoro.com/archive/volatile-keyword-in-c-memory-model-explained/) *When you read a non-volatile field in C#, a non-volatile read occurs, and you may see a stale value from the thread’s cache.* [Common Multithreading Mistakes in C#](http://benbowen.blog/post/cmmics_iii/) *When a variable must be modified, it is first loaded from RAM to the L3 cache, then on to the relevant core's L2 and L1 caches before finally being manipulated in the core itself.* – Theodor Zoulias Oct 17 '19 at 23:59
  • But a hardware cache is not thread-specific. In fact even single-threaded code could be forced to yield by preemptive multitasking from the OS side, and it could resume execution on a different processor (and thus a different L1 and L2 cache). This cache invalidation isn't specific to `async` or `await`. Cache invalidation during a context switch would affect single and multi-threaded code the same way. – Daniel Crha Oct 18 '19 at 00:28
  • I really don't know. The information in these articles is extremely complicated and makes me confused. What I really hope is that some expert will come up and answer this question with a simple "Don't worry, it's all safe", or "You are right to worry, protect your variables", so that I can go on with my life without delving any deeper in the details of hardware caches and all that stuff. :-) – Theodor Zoulias Oct 18 '19 at 00:37
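For readers who share the question's worry and want a belt-and-braces version anyway, every access to the counter can be routed through `Volatile.Read`/`Volatile.Write` (a defensive sketch only; as the comment threads above argue, the await infrastructure already provides the necessary ordering, so these fences are redundant here):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class DefensiveLoop
{
    // Every read and write of the counter goes through the Volatile class,
    // which rules out the stale-cached-value scenario the question describes.
    public static async Task<int> RunAsync(int iterations)
    {
        int i = 0;
        while (Volatile.Read(ref i) < iterations)
        {
            Volatile.Write(ref i, Volatile.Read(ref i) + 1);
            await Task.Yield();
        }
        return Volatile.Read(ref i);
    }

    static async Task Main()
    {
        Console.WriteLine(await RunAsync(1_000_000) == 1_000_000 ? "OK" : "Error");
    }
}
```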