
I have a simple C++ code snippet as shown below:

int A;
int B;

void foo() {
    A = B + 1;
    // asm volatile("" ::: "memory");
    B = 0;
}

When I compile this code, the generated assembly code is reordered as follows:

foo():
        mov     eax, DWORD PTR B[rip]
        mov     DWORD PTR B[rip], 0
        add     eax, 1
        mov     DWORD PTR A[rip], eax
        ret
B:
        .zero   4
A:
        .zero   4

However, when I add a memory fence (the commented-out line in the C++ code), the instructions are not reordered. My understanding is that adding a `volatile` qualifier to a variable should also prevent instruction reordering. So, I modified the code to add `volatile` to variable `B`:

int A;
volatile int B;

void foo() {
    A = B + 1;
    B = 0;
}

To my surprise, the generated assembly code still shows reordered instructions. Can someone explain why the `volatile` qualifier did not prevent instruction reordering in this case?

Code is available on [godbolt](https://godbolt.org/z/WhhcY1G9Y).

Nate Eldredge
tang
  • 3
    `asm volatile ...` is a compiler-specific extension, which does something completely different from the C++ standard `volatile` [qualifier](https://en.cppreference.com/w/cpp/language/cv). – Some programmer dude Jun 27 '23 at 13:09
  • 1
    Not sure what your question is about, but you only marked `B` volatile and the accesses for that are not reordered (read first, zero second). If you want the write to `A` to be ordered, you need to mark that as volatile as well. – Jester Jun 27 '23 at 13:11
  • 1
    Also, what happens, and what assembly code might be generated, depends very much on compiler, compiler version, and optimization flags. Please include all those details in the question itself. And also please tell us what language you're really writing your code in. You originally tagged C++11 (which I edited to C++) but in the question itself you only mention C. C and C++ are two very different languages. – Some programmer dude Jun 27 '23 at 13:13
  • @Someprogrammerdude You could see the link I posted, it has all the info you need [godbolt](https://godbolt.org/z/WhhcY1G9Y) – tang Jun 27 '23 at 13:27
  • @Jester Ah! I believe I misunderstood the meaning of "reorder". If a variable is volatile, it means that accesses (reads and writes) to it will not be reordered. Initially, I thought it meant that the statements in the code would not be reordered. – tang Jun 27 '23 at 13:30
  • all the info needed should be in the question, not behind links to external sites – 463035818_is_not_an_ai Jun 27 '23 at 13:33
  • Please try to keep your questions *self-contained*. External links can disappear without notice, or their contents change and make them irrelevant. – Some programmer dude Jun 27 '23 at 13:33
  • godbolt link is to C++, so why do you talk about C in the question? Which one is the question about? – 463035818_is_not_an_ai Jun 27 '23 at 13:33
  • @Someprogrammerdude I apologize for any inconvenience caused by sharing an external link. I will make sure to keep my questions and responses self-contained in the future. – tang Jun 27 '23 at 13:46
  • @463035818_is_not_an_ai Apologies for the confusion in the question. I have already corrected the inconsistency, and the question is indeed about C++ – tang Jun 27 '23 at 13:50
  • `volatile` accesses aren't reordered with any other `volatile` accesses, including to other locations, so making both variables volatile would give you the asm you're expecting. But don't, use `atomic` with `memory_order_relaxed` instead if another thread will be accessing these globals. (*[When to use volatile with multi threading?](https://stackoverflow.com/a/58535118)*) – Peter Cordes Jun 27 '23 at 16:30
  • Thanks for your response @PeterCordes. I have made some conclusions regarding the use of `volatile`. You can find them in the comments section of Nate Eldredge's answer. Please feel free to check them out. – tang Jun 27 '23 at 16:38

1 Answer


> My understanding is that adding a volatile qualifier to a variable should also prevent instruction reordering.

That's a major oversimplification. Although the C++ standard doesn't define the semantics of `volatile` very explicitly (saying only that "accesses are evaluated strictly according to the rules of the abstract machine"), the unwritten rule is that `volatile` objects are treated as if some external entity (e.g. I/O hardware) may be reading and writing them asynchronously, and that both reads and writes are side effects that the external entity can observe. As such, each read/write to a `volatile` object (of machine word size or less) should result in the execution of exactly one load/store instruction.

From this it follows that loads and stores to `volatile` objects will not be reordered with each other. But in your program `A` is not `volatile`, so we assume that the external entity does not see it. Therefore it does not matter how the accesses to `A` are ordered with respect to accesses to `B` or anything else, and the compiler is free to reorder them. Instructions like `add eax, 1` that do not access memory at all are also fair game; the external entity can't see the machine registers either.
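As a minimal sketch of the point above (and of Peter Cordes's comment under the question), qualifying both globals as `volatile` makes every access to them an observable side effect, so the compiler should emit the load of `B`, the store to `A`, and the store to `B` in source order. The `check_foo` helper is a hypothetical name added only for illustration. This constrains the compiler only; it is not a way to share data between threads.

```cpp
#include <cassert>

// Both globals are volatile, so every access below is an observable side effect.
volatile int A;
volatile int B;

void foo() {
    A = B + 1;  // volatile load of B, then volatile store to A
    B = 0;      // volatile store to B; may not be hoisted above the store to A
}

// Hypothetical single-threaded check of the values foo() leaves behind.
void check_foo() {
    B = 41;
    foo();
    assert(A == 42);
    assert(B == 0);
}
```

The `add` that computes `B + 1` can still be scheduled freely, because it touches only registers.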

Per your use of the tag, this is one of the many reasons that `volatile` is not the right approach for variables to be shared between threads - because unlike the "external entity", another thread does have access to your non-`volatile` variables. In olden times prior to C++11, people used `volatile` because it was all there was, and you could make it work, with the use of explicit memory barrier functions, if you knew something about the way your compiler did optimizations (which was usually undocumented). Since C++11 we have `std::atomic` and that is the only right way to handle inter-thread sharing, but unfortunately the association with `volatile` lingers on in obsolete docs and the minds of old-timers. See *Why is volatile not considered useful in multithreaded C or C++ programming?* for more.
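If the globals really are shared between threads, a rough sketch of the `std::atomic` version of `foo()` might look like the following. It uses `std::memory_order_relaxed`, as suggested in the comments under the question; the `check_foo` helper is a hypothetical name added for illustration. Relaxed ordering makes each access atomic, so there is no data race, but it places no ordering constraint between the store to `A` and the store to `B`.

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> A{0};
std::atomic<int> B{0};

void foo() {
    // Each access is atomic, but relaxed ordering does not order
    // the store to A relative to the store to B.
    int b = B.load(std::memory_order_relaxed);
    A.store(b + 1, std::memory_order_relaxed);
    B.store(0, std::memory_order_relaxed);
}

// Hypothetical single-threaded check of the values foo() leaves behind.
void check_foo() {
    B.store(41, std::memory_order_relaxed);
    foo();
    assert(A.load(std::memory_order_relaxed) == 42);
    assert(B.load(std::memory_order_relaxed) == 0);
}
```

If another thread must see the new value of `A` whenever it sees `B == 0`, the store to `B` would need `std::memory_order_release`, paired with an acquire load in the reader.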

Also relevant: *Does the C++ volatile keyword introduce a memory fence?* (No, it does not, as you have discovered.)

Nate Eldredge
  • Thank you for your message! Upon researching the `volatile` qualifier, I have concluded that it serves three primary purposes: 1. It guarantees that a `volatile` variable will always be read from memory and not from a register. 2. It ensures that the compiler will not optimize out any code related to the `volatile` variable. 3. It guarantees that the order of instructions involving the `volatile` variable and another `volatile` variable will not be altered. Have I missed anything? – tang Jun 27 '23 at 16:34
  • That's most of it. There's also that a reasonable compiler will not *invent* loads or stores to `volatile` variables beyond what was actually requested in the code. It also will not merge `volatile` accesses with adjacent ones, or split them into multiple smaller accesses. All of these can occur with non-`volatile` variables, see https://stackoverflow.com/questions/71866535/which-types-on-a-64-bit-computer-are-naturally-atomic-in-gnu-c-and-gnu-c-m/71867102#71867102 for some examples. – Nate Eldredge Jun 27 '23 at 22:55
  • 1
    @tang: But some other caveats are worth noting. `volatile` ensures that the relevant load and store instructions are *executed* in program order, i.e. the program counter will pass through the corresponding assembly instructions in the correct order, but that does not necessarily ensure that they become *visible* to other cores or external entities in that order; they can be reordered by the CPU. This is the whole point of memory ordering. [...] – Nate Eldredge Jun 27 '23 at 22:57
  • 1
    For memory-mapped I/O, the machine will often have been configured (in hardware or by the OS) to inhibit reordering for those address ranges, in which case `volatile` alone is enough. But if not, you may need additional barrier instructions, which the compiler will not insert for you; you'll need to use inline asm or appropriate function calls. By contrast, `std::atomic` accesses *do* insert barriers according to the requested memory ordering (acquire, release, sequentially consistent, etc). – Nate Eldredge Jun 27 '23 at 23:00
  • 1
    @tang: The other note on `volatile` is that it does not enjoy an exemption from the data race rules like `std::atomic` does. If you access a non-`std::atomic` variable from two threads without synchronization, unless they are both read-only, it is a data race and your program's behavior is undefined - **and this applies even if the variable is `volatile`**. Only `std::atomic` is safe in this respect. – Nate Eldredge Jun 27 '23 at 23:02
  • Thank you for your comprehensive explanation! – tang Jun 28 '23 at 05:47
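To illustrate the comments above about `std::atomic` inserting barriers according to the requested memory ordering, here is a minimal message-passing sketch. All names (`payload`, `ready`, `producer`, `consumer`, `run_demo`) are hypothetical and not from the original post. The release store and the acquire load pair up so that the plain write to `payload` is visible to the reader once it observes `ready == true`.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                 // plain, non-atomic data (hypothetical name)
std::atomic<bool> ready{false};  // flag that publishes payload (hypothetical name)

void producer() {
    payload = 42;  // ordinary store
    // Release: the store to payload cannot be reordered after this store.
    ready.store(true, std::memory_order_release);
}

int consumer() {
    // Acquire: once true is observed, the write to payload is visible too.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;
}

// Hypothetical driver: runs the consumer on a second thread.
int run_demo() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = 0;
    std::thread t([&seen] { seen = consumer(); });
    producer();
    t.join();  // join synchronizes, so reading seen here is safe
    assert(seen == 42);
    return seen;
}
```

With a `volatile bool` flag in place of the `std::atomic<bool>`, the same program would contain a data race and its behavior would be undefined.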