Gcc optimizes condition

Question

I have the following code:

#include <stdio.h>
#include <unistd.h>
#include <pthread.h>

char condA = 0;
volatile int vA = 0;
volatile int vB = 0;
void* thread_a(void* arg) {
    while (1) {
        if (1 == condA) {
            vA += 1;
            usleep(100);
        } else {
            ;
//          vB += 1;
//          usleep(100);
        }
    }
    return NULL;
}

void* thread_b(void* arg) {
    condA = 1;
    return NULL;
}

int main(void) {

    pthread_t threadA;
    pthread_t threadB;

    if (0 != pthread_create(&threadA, NULL, thread_a, NULL)) {
        return -2;
    }
    sleep(1);

    if (0 != pthread_create(&threadB, NULL, thread_b, NULL)) {
        return -1;
    }
    pthread_join(threadB, NULL);
    sleep(1);
    pthread_cancel(threadA);

    printf("value A is %d\n", vA);
    printf("value B is %d\n", vB);
    return 0;
}

I run my gcc 5.4.0 with either -O0 and -O1.

On O0, the output is: value A is 501

On O1, the output is: value A is 0

I know that setting condA to volatile will fix this, but why is it so? Why does the compiler optimize the condition if it is obviously set in another thread?

Also, if I uncomment the code in the else-statement, it works as expected.

A partial answer I found: https://gcc.gnu.org/onlinedocs/gcc-5.4.0/gcc/Volatiles.html#Volatiles "C has the concept of volatile objects. These are normally accessed by pointers and used for accessing hardware or inter-thread communication.[...]"

But this does not explain the fact, that if I uncomment the code in the else statement the program will work with -O1.

Isn't your code wrong? As far as I see in the docs, `pthread_create` returns an `int` indicating success/failure, _not_ a `pthread_t`. gcc will tell you about it if you compile with `-Wall`, as you should. — sleske, Jan 28 '20 at 10:28
`volatile` is not the right tool for multithreading. Don't use that. Use atomic types/operations which have guaranteed semantics. — Ulrich Eckhardt, Jan 28 '20 at 10:30
Also, the first argument to `pthread_create` must be a `pthread_t*`. I don't think you can pass `NULL` there. At least for me (Ubuntu Linux, gcc 7.4.0, glibc 2.27), the program compiles with warnings, but segfaults on the first call to `pthread_create` (as I would expect). — sleske, Jan 28 '20 at 10:36
Please edit your code so it compiles w/o warnings and actually runs. Voting to close in the meantime. — sleske, Jan 28 '20 at 10:37
The compiler doesn't know anything about threads. Of course it knows that you call library functions named `pthread_create`, `pthread_join` and `pthread_cancel`, but it doesn't know what they will do. `volatile` is the right means to inform the compiler that the value can be modified outside the normal control flow, e.g. by hardware or by a different thread or by an interrupt or signal handler. This will disable any optimizations related to the value marked as `volatile`. You should use `volatile` in addtition to using atomic types or synchronized access. — Bodo, Jan 28 '20 at 10:38
It seems like you're trying to implement your own spinlock. __*DON'T!*__ To see Linus Torvalds chewing out a Google Stadia developer on that very topic, please head over to https://www.realworldtech.com/forum/?threadid=189711&curpostid=189723 and https://www.realworldtech.com/forum/?threadid=189711&curpostid=189752 — datenwolf, Jan 28 '20 at 11:12
@Bodo `volatile` is not appropriate here. [**Why is volatile not considered useful in multithreaded C or C++ programming?**](https://stackoverflow.com/questions/2484980/why-is-volatile-not-considered-useful-in-multithreaded-c-or-c-programming) and [**Volatile: Almost Useless for Multi-Threaded Programming**](https://web.archive.org/web/20120210084400/http://software.intel.com/en-us/blogs/2007/11/30/volatile-almost-useless-for-multi-threaded-programming/) If you don't agree, good luck arguing against Hans Boehm. — Andrew Henle, Jan 28 '20 at 11:20
@UlrichEckhardt It is so _tiresome_ to get that comment every single time this bug pops up. `volatile` is _not_ needed for re-entrancy in this case. It is needed because the compiler cannot deduct that the thread callback function is called and therefore optimizes the code incorrectly. It has nothing to do with threads. — Lundin, Jan 28 '20 at 12:53
[This](https://electronics.stackexchange.com/a/409570/6102) explains why volatile can be needed. The example is for embedded systems interrupt, but it's the very same deal with threads. It also addresses re-entrancy, while volatile does indeed not solve. — Lundin, Jan 28 '20 at 12:56
Thanks for the people pointing out the incorrect use of pthread_create(). I fixed it. — RichardSchorrig, Jan 28 '20 at 12:57
@RichardSchorrig: if all you need is a flag, use an [`atomic_bool`](https://en.cppreference.com/w/c/atomic). I.e. add `#include `, change the declaration to `atomic_bool condA;`, and use `atomic_load(&condA)`/`atomic_store(&condA, true)`. Note that `volatile` also doesn't prevent reordering with other non-volatile accesses. — vgru, Jan 28 '20 at 13:33
@AndrewHenle "_Hans Boehm points out that there are only three portable uses for volatile_" And he could not be more wrong. Neither signals nor MMIO are portable. There are many other portable uses. Good luck backing up his rant with actual arguments. — curiousguy, Jan 30 '20 at 23:10
@Bodo "the compiler _doesn't know what they will do_" And if it did, then what? Which guarantee could possibly exist? IOW, what do you claim the compiler does not know and yet could know that would solve a general problem? — curiousguy, Jan 30 '20 at 23:13
@curiousguy My remark refers to the OP's question: "Why does the compiler optimize the condition if it is obviously set in another thread?" If the compiler knew about threads and which functions are concurrent threads, then it should handle that `condA` can be modified by `thread_B` while `thread_A` is running. As it doesn't care about threads, the only obvious fact for the compiler is that `condA` is not modified during the normal control flow of `thread_A`. To inform the compiler that a variable can be modified from outside the normal control flow, you have to use `volatile`. — Bodo, Jan 31 '20 at 09:27
@curiousguy Yes, that's what I'm claiming for a **C compiler**. Threads are provided by the POSIX threads interface library (in the OP's case) and the OS, they are not part of the **C** language standard. (This does not apply to all languages. IIRC Java includes threads as part of the language standard.) see also https://stackoverflow.com/a/7774795/10622916 — Bodo, Jan 31 '20 at 12:55
@Bodo "_If the compiler knew about threads and which functions are concurrent threads,_" By knowing about thread starting function? How would that possibly work? Would that still allow separate compilation? — curiousguy, Jan 31 '20 at 18:06
@curiousguy The only relevant point is that the C compiler cannot do what the OP assumed in the question. I don't want to discuss how exactly a compiler would have to work to handle this automatically or what implication this would have on the C language or the compilation process. — Bodo, Feb 03 '20 at 11:05

score 0 · Accepted Answer · edited Jun 20 '20 at 09:12

the variable which has to be volatile, is not in your code

char condA = 0;

so the thread never reads it form the memory to check if the condition is met

.L3:
  cmp al, 1
  jne .L3

if you add the volatile keyword the code changes to the correct one

.L8:
        movzx   eax, BYTE PTR condA[rip]
        cmp     al, 1
        jne     .L8

gcc handles volatile correct. But volatile does not guarantee anything else especially no atomicity and coherency.

https://godbolt.org/z/dDTUz-

EDIT

The answer for the second question 2 is quite simple.

The compiler does not see ant way this flag can be changed when the program is executed (as there is not direct call to the thread function). No functions are called in the thread as well (as they can be side effects prone). So the compiler optimizes out the reads from the storage as not needed and keeps the value in the register(s).

Gcc optimizes condition

1 Answers1

EDIT