
This question is an extension of the answer to another question about memory barriers:

https://stackoverflow.com/a/3556877/13085654

Say you take that code example and tweak it to make the method async:

using System;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static bool stop = false;

    public static void Main(string[] args)
    {
        var t = new Thread(async () =>
        {
            Console.WriteLine("thread begin");
            while (!stop)
            {
                if (false)
                {
                    await default(Task);
                }
            }
            Console.WriteLine("thread end");
        });
        t.Start();
        Thread.Sleep(1000);
        stop = true;
        Console.WriteLine("stop = true");
        Console.WriteLine("waiting...");
        t.Join();
    }
}

Obviously the await will never be hit, but just its presence in the method somehow allows the Release-compiled program to finish: commenting out `await default(Task);` causes the program to once again hang (even if you leave the method marked `async`). This made me think that the compiler-generated state machine is significantly different based on whether or not the method contains at least one await. But comparing the IL for the parts of the method that are actually hit shows nearly identical instructions, both containing the following loop construct:

          // start of loop, entry point: IL_0039

            // [14 13 - 14 26]
            IL_0039: ldsfld       bool Program::stop
            IL_003e: brfalse.s    IL_0039
          // end of loop

At the IL level, I don't see what the C# compiler is doing to inject a memory barrier when an await is present. Does anyone have any insight as to how this is happening?

Wizard Brony
    " shows nearly identical instructions" - could we see them please? – AakashM Sep 28 '20 at 15:32
  • `on whether or not the method contains at least one await` that goes without saying, and the JIT compiler is able to optimize this even farther. A *release build* may not even contain the `if` block. The code you posted doesn't do what you think though. No `Thread` constructor accepts asynchronous methods so the delegate is actually an `async void` method that's fired off and never awaited – Panagiotis Kanavos Sep 28 '20 at 15:42
  • @PanagiotisKanavos Since the `async` method returns synchronously, it does in fact wait for it to finish on the thread join. The assertion that it wouldn't would only be the case if the method actually was asynchronous and returned before completing. Given that they clearly ran the code, they were able to observe it writing to the console after the given time period. – Servy Sep 28 '20 at 15:48
  • No repro. This threading bug doesn't actually have anything to do with a memory barrier. What matters is whether the just-in-time compiler treats the variable as volatile. In other words, whether it generates code that reads the bool value from memory instead of re-using the value stored in a processor register. Notoriously, the x86 jitter does not. But it can very easily be bumped into doing so by a seemingly trivial change to the code. Only sane advice: *don't do it*. – Hans Passant Sep 28 '20 at 16:32
  • _"At the IL level, I don't see what the C# compiler is doing to inject a memory barrier"_ -- why would you expect to? Your program isn't executing IL. It's executing the JITted code generated at runtime from the IL. If you really want to compare and see what the salient difference is, you need to look at the native code. the JIT compiler may or may not make certain optimizations based on any number of factors related to the code, including simply the _length_ of the code (which is obviously different in the with-await case vs. without). – Peter Duniho Sep 28 '20 at 18:32

1 Answer


Does anyone have any insight as to how this is happening?

You've written a program that relies on undefined behavior. Thus, the results are undefined. Either implementation is allowed to produce either result. If you want to rely on either result, then neither implementation is acceptable. The language provides no guarantee that, in the absence of a memory barrier, a variable cannot observe updates from another thread; it merely says that it may or may not. If you want to force the updated value to be read, use a memory barrier, as that's what it's there for. To force the value to not reflect changes from another thread, use a copy of the variable specific to that thread.
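
For concreteness, here is a minimal sketch of the "force the updated value to be read" option applied to the question's example, using Volatile.Read/Volatile.Write (marking the field volatile would give the same visibility). It is illustrative only, not the only correct approach.

using System;
using System.Threading;

class Program
{
    static bool stop = false;

    public static void Main(string[] args)
    {
        var t = new Thread(() =>
        {
            Console.WriteLine("thread begin");
            // Volatile.Read prevents the read from being hoisted out of the
            // loop, so the write made by the main thread becomes visible here.
            while (!Volatile.Read(ref stop))
            {
            }
            Console.WriteLine("thread end");
        });
        t.Start();
        Thread.Sleep(1000);
        Volatile.Write(ref stop, true);
        Console.WriteLine("stop = true");
        Console.WriteLine("waiting...");
        t.Join();
    }
}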

Servy
  • Adding to that, no `Thread` constructor knows about tasks, so the Thread's delegate is an `async void` that contains a dead branch that either the C# or JIT compiler may choose to eliminate. – Panagiotis Kanavos Sep 28 '20 at 15:42
  • @PanagiotisKanavos The method isn't actually asynchronous anyway, so that doesn't really matter. The code is a toy example to demonstrate a point; evaluating how well it's written isn't really an issue in such a case. – Servy Sep 28 '20 at 15:46
  • @Servy That's fair. But other framework utilities perform undocumented (from what I can tell) implicit volatile reads. For example, Jon Skeet says you can trust `lock` to perform an implicit volatile read: "... a call to Monitor.Enter performs an implicit volatile read, and a call to Monitor.Exit performs an implicit volatile write." https://jonskeet.uk/csharp/threads/volatility.html I was wondering if there was possibly something like that for async methods. – Wizard Brony Sep 28 '20 at 15:54
  • @WizardBrony `lock` is implemented using `Monitor`, so they're the same thing. A memory barrier or a volatile read/write is a fairly weak limitation on how operations are allowed to be observed to be ordered; the Monitor methods are *stronger* limitations on how operations are allowed to be observed to be ordered. That's documented behavior; it's not that `Monitor` uses a memory barrier as an implementation detail. `Task` has some limitations, but they revolve around the operations on the task itself; async methods aren't documented to affect how operations in them are ordered between threads. – Servy Sep 28 '20 at 16:06
  • @Servy Okay, so for example: `int a; lock (_lock){ a = A; } var b = B;` So you're saying even though `Monitor` is implemented using a memory barrier, I can't rely on that to make sure `B` is read after `A`? – Wizard Brony Sep 28 '20 at 16:26
  • @WizardBrony No, I'm saying Monitor specifically documents the fact that the operations after the lock cannot be observed to run before it, so how it implements it, and whether or not it uses memory barriers, is irrelevant. It provides *stronger* guarantees than what a memory barrier does with respect to how those operations can be observed to be reordered, not weaker ones. – Servy Sep 28 '20 at 16:34
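
A minimal sketch of the guarantee discussed in the comments above: when both the write and the read of the flag happen inside a lock on the same object, those accesses cannot be observed to move outside the Monitor.Enter/Monitor.Exit pair, so the loop terminates reliably. The field and lock-object names here are illustrative, not taken from the original code.

using System;
using System.Threading;

class LockedFlagSketch
{
    static readonly object gate = new object();
    static bool stop = false;   // only ever touched while holding the lock

    public static void Main()
    {
        var t = new Thread(() =>
        {
            while (true)
            {
                lock (gate)
                {
                    if (stop) break;   // the read happens inside the lock
                }
            }
            Console.WriteLine("thread end");
        });
        t.Start();
        Thread.Sleep(1000);
        lock (gate)
        {
            stop = true;               // the write happens inside the lock
        }
        t.Join();
    }
}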