Is memory barrier needed in this situation or just a volatile

Question

I'm reading this article, and I follow the author's steps but get a different result.

I create two threads. One is reader, and one is writer.

// volatile uint64_t variable1 = 0; <- global
// uint64_t* variable2_p = new uint64_t(0); <- in main function
// const unsigned ITERATIONS = 2000000000; <- global

void *reader(void *variable2) {
    volatile uint64_t *variable2_p = (uint64_t *)variable2;
    // bind this thread to CPU0

    unsigned i, failureCount = 0;
    for (i=0; i < ITERATIONS; i++) {
        uint64_t v2 = *variable2_p;
        uint64_t v1 = variable1;
        if (v2 > v1) {
            failureCount++;
            printf("v1:%" PRIu64 ", v2:%" PRIu64 "\n", v1, v2);
        }
    }
    printf("%u failure(s)", failureCount);
    return NULL;
}

void *writer(void *variable2) {
    volatile uint64_t *variable2_p = (uint64_t *)variable2;
    // bind this thread to CPU1

    for (;;) {
        variable1 = variable1 + 1;
        *variable2_p = (*variable2_p) + 1;
    }
    return NULL;
}

In the article above, the author says that the comparison v2 <= v1 may occasionally fail, because the compiler or the processor may change the execution order.

But I have run this many times and never seen a single failure. Is it correct to use only volatile in this situation, or will it lead to subtle bugs?

If it isn't OK, please give me an example. Thanks a lot.

compile command: g++  -O2 -Wall -g -o foo foo.cc -lpthread
uname -a: Linux Wichmann 3.5.0-48-generic #72~precise1-Ubuntu SMP Tue Mar 11 20:09:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
cpuid: Intel(R) Core(TM) i5-3230M CPU @ 2.60GHz

with multithreading analysis is the only dependable tool. that said, a given platform may offer additional guarantees for e.g. `volatile`. — Cheers and hth. - Alf, Apr 29 '14 at 08:23
Did you read the entire article, including the discussion about how the fact that you're not observing any bugs *now* doesn't mean that there aren't any? — molbdnilo, Apr 29 '14 at 08:29
@JonathanWakely: Can you please explain why you say `volatile is not for multithreading` — arunmoezhi, Apr 29 '14 at 08:40
@arunmoezhi, because it isn't. It is for interfacing with external hardware, not for multithreading. Can you explain why you think it might be for multithreading? Read http://www.drdobbs.com/parallel/volatile-vs-volatile/212701484 for a full explanation. — Jonathan Wakely, Apr 29 '14 at 08:43
@JonathanWakely: No offence/sarcasm intended. I was just curious to know more about what you claimed. — arunmoezhi, Apr 29 '14 at 08:50
@arunmoezhi, http://www.open-std.org/JTC1/sc22/wg21/docs/papers/2006/n2016.html is also informative — Jonathan Wakely, Apr 29 '14 at 08:58

score 3 · Accepted Answer · answered Apr 29 '14 at 08:37

"It may fail" doesn't mean that it will fail on every machine. If you're running on a single core, for example, it probably won't fail. If you're running on a multicore Alpha, it almost certainly will fail some of the time. On other machines, the results will vary, depending on any number of things.

As for volatile, it offers no guarantees for multi-threaded code. It may be necessary if you're also using inline assembler, or other things the compiler can't understand, but otherwise: once you have done whatever else is necessary to ensure thread safety, you don't need volatile. In particular, if you use the C++11 atomic types or threading primitives, volatile is never necessary.
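To illustrate the last point, here is a minimal sketch of the question's reader/writer pair rewritten with C++11 atomics instead of volatile. It is one possible arrangement, not the only correct one. The `stop` flag, the function name `count_failures` and the choice of release/acquire ordering are assumptions added for the example; they are not in the original code.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

std::atomic<uint64_t> variable1(0);
std::atomic<uint64_t> variable2(0);
std::atomic<bool> stop(false);  // hypothetical flag so the writer can exit

void writer() {
    while (!stop.load(std::memory_order_relaxed)) {
        // There is only one writer, so a load followed by a store is enough.
        variable1.store(variable1.load(std::memory_order_relaxed) + 1,
                        std::memory_order_relaxed);
        // Release store: a thread that acquire-loads this value is
        // guaranteed to also see the variable1 store made just above.
        variable2.store(variable2.load(std::memory_order_relaxed) + 1,
                        std::memory_order_release);
    }
}

// Hypothetical helper: runs the reader loop against a writer thread and
// returns how often v2 > v1 was observed.
unsigned count_failures(unsigned iterations) {
    stop.store(false, std::memory_order_relaxed);
    std::thread w(writer);
    unsigned failures = 0;
    for (unsigned i = 0; i < iterations; ++i) {
        // Acquire load pairs with the writer's release store.
        uint64_t v2 = variable2.load(std::memory_order_acquire);
        uint64_t v1 = variable1.load(std::memory_order_relaxed);
        if (v2 > v1) {
            ++failures;
        }
    }
    stop.store(true, std::memory_order_relaxed);
    w.join();
    return failures;
}
```

Calling `count_failures(2000000000)` from `main` corresponds to the loop in the question. With the release/acquire pair on `variable2`, the C++11 memory model rules out `v2 > v1` on every conforming platform, rather than only on hardware that happens not to reorder these accesses. Leaving the default `memory_order_seq_cst` on all operations would also be correct and is simpler to reason about.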

score 2 · Answer 2 · jblume · edited Apr 29 '14 at 09:06

Edit

Actually, in this very specific case the code is correct, because variable1 is volatile and variable2_p is declared as a pointer to volatile. This enforces the ordering of the memory accesses.

Volatile is misused so often that I jumped the gun here, sorry.

Old answer:

Using volatile only guarantees two things:

1. Reads and writes happen to the actual memory address where the variable resides and are not optimized away or kept in registers
2. Sequential accesses to volatile variables are not reordered

You asked for an example, but you have given it yourself in your question:

variable1 = variable1 + 1;
*variable2_p = (*variable2_p) + 1;

This could be reordered, leading to the failure in the other thread. That this doesn't happen in your specific environment is irrelevant. The compiler is allowed to do it so the code is not correct.

edited Apr 29 '14 at 09:06

answered Apr 29 '14 at 08:40

jblume


Thanks for the answer. As you said, "Sequential accesses to volatile variables are not reordered", and `variable1` and `variable2_p` are both `volatile`. Do you mean I should use a `volatile uint64_t* volatile variable2_p` to avoid the potential problem? – Inapt Apr 29 '14 at 08:49
Sorry, I didn't notice that variable1 is volatile, too. In this special case, the code is actually correct. I will correct my answer. – jblume Apr 29 '14 at 09:02
`Using volatile only guarantees two things:` - except that it absolutely does **not** guarantee #2. The guarantee is more `the *compiler* won't reorder sequential accesses to volatile variables, but hey the processor may do whatever it wants` which is clearly not the same! That's also why the code still isn't correct. – Voo Apr 29 '14 at 11:04
@Voo Please give an example of how the processor can reorder sequential accesses to volatile variables. It puzzles me a lot. Thanks. – Inapt Apr 30 '14 at 03:31

@Inapt That's *way* too large a topic for a comment. Suffice it to say that modern processors are far more complicated than "execute one instruction after the other": they can reorder independent instructions (out-of-order execution) and execute speculatively, and there are things like the store buffer and the cache-coherence protocol (in MESI, a write to a shared cache line needs more work than a write to a modified one), and so on. The only thing volatile guarantees is that the compiler generates the loads and stores in order; it completely ignores everything else. – Voo Apr 30 '14 at 11:48
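
As a concrete illustration of the store buffer mentioned above, here is a sketch of the classic "store buffer" litmus test. It is a different access pattern from the one in the question: each thread stores to one variable and then loads the other. With relaxed atomics (or plain volatile accesses), both threads may read 0, even on x86, because each store can still be sitting in its core's store buffer when the other core loads. The helper name `count_both_zero` and the `go` start flag are assumptions made for the example.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: runs the litmus test `trials` times and returns how
// often both threads read 0. With seq_cst == true the result is always 0.
unsigned count_both_zero(unsigned trials, bool seq_cst) {
    const std::memory_order order =
        seq_cst ? std::memory_order_seq_cst : std::memory_order_relaxed;
    unsigned both_zero = 0;
    for (unsigned t = 0; t < trials; ++t) {
        std::atomic<int> x(0), y(0);
        std::atomic<bool> go(false);  // lets both threads start together
        int r1 = -1, r2 = -1;
        std::thread a([&] {
            while (!go.load(std::memory_order_acquire)) {}
            x.store(1, order);   // store to x ...
            r1 = y.load(order);  // ... then load y
        });
        std::thread b([&] {
            while (!go.load(std::memory_order_acquire)) {}
            y.store(1, order);   // store to y ...
            r2 = x.load(order);  // ... then load x
        });
        go.store(true, std::memory_order_release);
        a.join();
        b.join();
        if (r1 == 0 && r2 == 0) {
            ++both_zero;  // neither thread saw the other's store
        }
    }
    return both_zero;
}
```

`count_both_zero(n, false)` may or may not report a nonzero count on a given machine and run; the outcome is allowed, not guaranteed, and the window is small. `count_both_zero(n, true)` must return 0, because sequentially consistent operations are placed in a single total order that every thread agrees on.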
