Why is this optimized away by modern compilers for C++11 and higher

Question

I'm lost. I wanted to play around with Compiler Explorer to experiment with multithreaded C code, and started with a simple piece of code. The code is compiled with -O3.

static int hang = 0;

void set_hang(int x) {
    hang = x;
}

void wait() {
    while (hang != 0) {
        // wait...
    }
}

To my surprise, this was the compiler output:

set_hang(int):
        mov     dword ptr [rip + hang], edi
        ret
wait():
        ret

It took me a while before I noticed that I was compiling the code as C++ instead of C. Switching to C gave me something closer to what I expected:

set_hang:
        mov     DWORD PTR hang[rip], edi
        ret
wait:
        mov     eax, DWORD PTR hang[rip]
        test    eax, eax
        je      .L3
.L5:
        jmp     .L5
.L3:
        ret

Thus, when compiled as C++, wait() always returns, no matter which value was previously passed to set_hang(). I confirmed this by compiling and running the code on my PC. This code exits immediately, while I would expect it to hang forever:

int main(void) {
    set_hang(1);
    wait();
    return 0;
}

And indeed, if I compile this with gcc instead of with g++, it hangs.

I experimented with different compilers (Clang and GCC), and this only happens with Clang 12.0.0 or higher, or GCC 10.1 or higher. If I pass --std=c++98, the code I would expect is emitted as well, so it seems to be something specific to C++11 and higher.

Removing the static keyword from hang doesn't affect the emitted assembly.

What is happening here? It has been a few months since I wrote C++, so I might be missing some knowledge about the latest and greatest exotic C++ black magic, but this is really straightforward code. I'm clueless.

Edit: Even this program is optimized away completely:

// test.cpp
static int hang = 0;

static void set_hang(int x) {
    hang = x;
}

static void wait() {
    while (hang != 0) {
        // wait...
    }
}

int main(void) {
    set_hang(1);
    wait();
    return 0;
}

Compiler output:

main:
        xor     eax, eax
        ret

For GCC version 10.3.0 on Ubuntu:

This command will hang: g++ -O1 -o test test.cpp && ./test

And this command won't: g++ -O2 -o test test.cpp && ./test

Generally speaking, modern compilers are free to optimize your code as if it were single-threaded code. From that perspective, it's clear that your `wait()` code will never enter the loop, so there's no point in even running it. The situation would probably change if you marked your field as `volatile` (at least that would be the keyword in Java -- sorry, I'm not a C++ guy). — devoured elysium, Dec 28 '21 at 12:11
I do know that modern versions of C++ define a memory model (inspired by the JVM's), and it probably was introduced in C++ 11. That's why new compilers feel free to make such assumptions while older ones do not. — devoured elysium, Dec 28 '21 at 12:13
Indeed, when compiled as C code with clang 5.0.0 or higher (with `-O3 --std=c11`), the while loop is optimized out. Even the latest version of GCC (11.2) won't optimize the loop out. So could we say that GCC doesn't completely conform to C11? — Bart, Dec 28 '21 at 13:11
@Bart No. Not optimising something that the compiler is allowed to - but not required to - optimise doesn't imply non-conformance. — eerorika, Dec 28 '21 at 13:14

score 6 · Accepted Answer · answered Dec 28 '21 at 12:48

It's because of the following rule:

[intro.progress]

The implementation may assume that any thread will eventually do one of the following:

terminate,

make a call to a library I/O function,

perform an access through a volatile glvalue, or

perform a synchronization operation or an atomic operation.

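The compiler was able to prove that a thread that enters the loop will never do any of the listed things: the loop body is empty, and its condition only reads a plain int that is neither volatile nor atomic. It is therefore allowed to assume that the loop will never be entered, and it removes the loop entirely.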
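If the intent is for another thread to clear hang, one way to keep the loop is to make the variable a std::atomic<int>. The following is only a sketch under that assumption. Each iteration then performs an atomic operation, which is on the list in [intro.progress], so the compiler can no longer assume the loop away. The releaser thread and the run_demo helper are hypothetical names added for illustration; they are not part of the question's code.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Atomic instead of plain int: every load in the loop is an atomic
// operation, so the forward-progress assumption does not apply to it.
static std::atomic<int> hang{0};

void set_hang(int x) {
    hang.store(x);
}

void wait() {
    while (hang.load() != 0) {
        // Give other threads a chance to run while we spin.
        std::this_thread::yield();
    }
}

// Hypothetical helper (not in the question): sets the flag, lets a
// second thread clear it after a short delay, and returns the final value.
int run_demo() {
    set_hang(1);
    std::thread releaser([] {
        std::this_thread::sleep_for(std::chrono::milliseconds(50));
        set_hang(0);
    });
    wait();           // blocks until the releaser thread stores 0
    releaser.join();
    return hang.load();
}
```

Declaring hang as volatile int would also put a listed operation inside the loop, but volatile does not make concurrent access from two threads well-defined in C++, so std::atomic is the usual choice for this pattern.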
