Re-ordering Atomic Reads

Question

I am working on a multithreaded algorithm which reads two shared atomic variables:

std::atomic<int> a(10);
std::atomic<int> b(20);

void func(int key) {
   int b_local = b;
   int a_local = a;
   /* Some Operations on a & b*/
}

The invariant of the algorithm is that b should be read before reading a.

The question is: can the compiler (say, GCC) re-order the instructions so that a is read before b? Explicit memory fences would enforce the order, but what I want to understand is whether two atomic loads can be re-ordered.

Further, after going through the acquire/release semantics in Herb Sutter's talk (http://herbsutter.com/2013/02/11/atomic-weapons-the-c-memory-model-and-modern-hardware/), I understand that a sequentially consistent system ensures an ordering between an acquire (like a load) and a release (like a store). What about the ordering between two acquires (like two loads)?

Edit: Adding more info about the code: Consider two threads T1 & T2 executing:

T1 : reads value of b, sleeps

T2 : changes value of a, returns

T1 : wakes up and reads the new value of a

Now, consider this scenario with re-ordering:

int a_local =a; int b_local = b;

T1 : reads value of a, sleeps

T2 : changes value of a, returns

T1 : Doesn't know anything about the change in the value of a.

The question is: "Can a compiler like GCC re-order two atomic loads?"
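For reference, the implicit reads in func go through std::atomic's conversion operator, which performs a load() with the default memory order, memory_order_seq_cst. The following is a minimal sketch of that equivalence; the helper names read_implicit and read_explicit are hypothetical and not part of the original code.

```cpp
#include <atomic>
#include <utility>

std::atomic<int> a(10);
std::atomic<int> b(20);

// Same reads as func() in the question, written implicitly.
std::pair<int, int> read_implicit() {
    int b_local = b;  // conversion operator: b.load() with seq_cst
    int a_local = a;  // conversion operator: a.load() with seq_cst
    return {b_local, a_local};
}

// Hypothetical helper: the same two reads with the memory order spelled out.
std::pair<int, int> read_explicit() {
    int b_local = b.load(std::memory_order_seq_cst);
    int a_local = a.load(std::memory_order_seq_cst);
    return {b_local, a_local};
}
```

With no writer running, both helpers return the pair (20, 10); the interesting question is what another thread's stores can make them observe.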

I'm unclear what the question is here - ordering of reads shouldn't matter as it doesn't cause modification of the value. Reordering of reads seems perfectly reasonable in this case. — Krease, Jan 01 '16 at 08:24
Based on that edit, `b` is irrelevant - Without some form of synchronization, T1 can read `a` before or after T2 modifies it. For the more general reordering question, I'm not totally sure on C++ semantics, but if these were `volatile` in Java, they [would not be reordered](http://stackoverflow.com/questions/9187527/volatile-why-prevent-compiler-reorder-code). Hope this helps. — Krease, Jan 01 '16 at 09:06
http://stackoverflow.com/questions/8819095/concurrency-atomic-and-volatile-in-c11-memory-model probably has the answer to your question. — Mat, Jan 01 '16 at 09:10

score 1 · Answer 1 · answered Jan 01 '16 at 09:52

Yes, they can be reordered, since one order is observably no different from the other and you have put no constraints in place to force a particular order. There is only one relationship between these lines of code: int b_local = b; is sequenced before int a_local = a;. But since your code shows only one thread and the 2 lines are independent, it is irrelevant to the 3rd line of code (whatever that line might be) which of them completes first, and hence the compiler may reorder them.

So, if you need to force some particular order you need:

2+ threads of execution
A happens-before relationship established between 2 operations in these threads.
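The two requirements above can be sketched as follows. This is only an illustration under assumptions: the question never shows how a and b are written, so the writer t2_body, the values 11 and 21, and the driver run_once are all hypothetical. T1 waits until it observes T2's store to b, which gives a happens-before edge from T2's earlier store to a.

```cpp
#include <atomic>
#include <thread>

std::atomic<int> a(10);
std::atomic<int> b(20);

// T2 (assumed writer): changes a, then announces it by changing b.
void t2_body() {
    a = 11;  // seq_cst store
    b = 21;  // seq_cst store, sequenced after the store to a
}

// T1: waits until it observes T2's store to b, then reads a.
int t1_body() {
    while (b != 21) {               // seq_cst load; reading 21 synchronizes with T2
        std::this_thread::yield();
    }
    return a;                       // T2's store to a happens before this load
}

// Hypothetical driver: resets the values, runs both threads, returns what T1 read from a.
int run_once() {
    a = 10;
    b = 20;
    int a_seen = 0;
    std::thread t1([&a_seen] { a_seen = t1_body(); });
    std::thread t2(t2_body);
    t1.join();
    t2.join();
    return a_seen;
}
```

Under these assumptions run_once() always returns 11, because T1 only reads a after it has seen the store that T2 made after updating a.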

answered Jan 01 '16 at 09:52

ixSci


By defining `a` and `b` as `std::atomic`, the reads into `b_local` and `a_local` are by default sequentially consistent atomic loads. So, isn't the order forced by default? Also, the method `func` is called by multiple threads, as shown in the example interleavings. – Varun V Jan 01 '16 at 09:57
@VarunV, order is not forced by default; it gets established with the help of relationships. But since the 2 lines of code you have are independent and have no forced relationship between them, whichever order you have won't change the program. My explanation is a bit clumsy, I know. The thing is: there is no rule by which `a` should be read before `b`, so a compiler has full freedom in reordering. – ixSci Jan 01 '16 at 10:01
@VarunV, 2 acquire operations don't form any relationship so in order to establish one you need a write operation. – ixSci Jan 01 '16 at 10:11
Okay. The code `int b_local = b ; int a_local = a;` is equivalent to `int b_local = b.load(std::memory_order_seq_cst) ; int a_local = a.load(std::memory_order_seq_cst)`. If this is the case, then I just found this link (http://stackoverflow.com/questions/6319146/c11-introduced-a-standardized-memory-model-what-does-it-mean-and-how-is-it-g), where there is an example on store ordering. If that is true, then I don't think the compiler can re-order them. – Varun V Jan 01 '16 at 10:13
@VarunV, the code in the question you linked contains store operations and yours doesn't and without them there is no relationship. You should update your post with the relevant store operations to make it possible to reason about it – ixSci Jan 01 '16 at 10:18
Okay. So, it is like a relation is established among an "acquire and release" or rather "load and store". Therefore, it is quite possible that two loads can be re-ordered. Is this what you mean? – Varun V Jan 01 '16 at 10:23
@VarunV, well as far as I understand it — yes. I mean, there is certainly no relationship between these lines which would force it and I see no rule in the standard which would prevent a compiler from doing a reordering in this case – ixSci Jan 01 '16 at 10:37
@VarunV, once again: show how you modify these variables and then we can calculate what sequences are possible – ixSci Jan 01 '16 at 10:38
@ixSci You are definitely correct here. Real world systems do in fact re-order these reads in cases where they can prove that it will cause no data races in code that didn't already have them. In fact, modern x86 *CPU's* do this internally. – David Schwartz Sep 27 '16 at 09:05

score 1 · Accepted Answer · answered Jan 01 '16 at 17:22

Description of memory_order_acquire:

no memory accesses in the current thread can be reordered before this load.

Because the default memory order when loading b is memory_order_seq_cst, which is the strongest one, the read of a cannot be reordered before the read of b.

Even weaker memory orders, as in the code below, provide the same guarantee:

int b_local = b.load(std::memory_order_acquire);
int a_local = a.load(std::memory_order_relaxed);
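To see what that guarantee buys in practice, here is a minimal sketch under assumptions: the writer below (relaxed store to a followed by a release store to b) is not in the question, and the names writer, reader_sees_consistent_values and race_once are hypothetical. The reader uses exactly the two loads above and reports whether it ever saw the new b together with the old a.

```cpp
#include <atomic>
#include <thread>

std::atomic<int> a(10);
std::atomic<int> b(20);

// Assumed writer (not shown in the question): update a first, then b.
void writer() {
    a.store(11, std::memory_order_relaxed);
    b.store(21, std::memory_order_release);  // publishes the earlier store to a
}

// Reader using the memory orders from the answer.
// Returns false only if it saw the new b together with the old a.
bool reader_sees_consistent_values() {
    int b_local = b.load(std::memory_order_acquire);
    int a_local = a.load(std::memory_order_relaxed);
    return !(b_local == 21 && a_local == 10);
}

// Hypothetical driver: race one writer against one reader.
bool race_once() {
    a.store(10);
    b.store(20);
    bool ok = false;
    std::thread t1([&ok] { ok = reader_sees_consistent_values(); });
    std::thread t2(writer);
    t1.join();
    t2.join();
    return ok;
}
```

If the acquire load reads 21, it synchronizes with the release store, so the later load of a must return 11. If it reads 20, nothing is promised about a, and either value is acceptable.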

answered Jan 01 '16 at 17:22

Tsyvarev


What about the "as if" rule? Why can't it reorder the reads under the "as if" rule? (And there are most definitely real world platforms that do this!) – David Schwartz Sep 27 '16 at 09:03
@DavidSchwartz: I am not sure that modern hardware can `prove that it will cause no data races in code` in the multicore case. Possibly it can read `a` into a *shadow register*, then read `b`, and then check that the (inter-CPU) cache line holding `a` is still valid. If the check is successful, it may store the shadow register with `a` into a real register/memory. In any case, "as if" means that the **program has never perceived the opposite behavior**, so I see no need to mention it. – Tsyvarev Sep 27 '16 at 09:35
Except that's what the question asked. – David Schwartz Sep 27 '16 at 10:09
`The question is "Can a compiler like GCC re-order two atomic loads"` - the compiler cannot reorder these loads (in a way observable by the program). – Tsyvarev Sep 27 '16 at 10:21
And the correct answer is yes, it can, under the as if rule. And, in fact, on real common hardware it does, because it's more efficient not to force the ordering (and fix things up if that creates a problem) rather than to force the ordering. However, it must appear to the program as if they were not reordered. On x86, you can force the ordering, but GCC does not do so because it's not required, thanks to the as if rule. (Look at the generated assembly. There will be no fences or other forces, even though x86 has them. That permits the CPU to prefetch out of order which it in fact does do.) – David Schwartz Sep 27 '16 at 10:23
Why should the program (and the programmer) care about the internals of the hardware? Programming with atomics is very difficult by itself; let the programmer not bother about things which cannot affect his product. Sidenote: the question is about the **compiler**, not the *hardware*. – Tsyvarev Sep 27 '16 at 10:33
Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/124331/discussion-between-david-schwartz-and-tsyvarev). – David Schwartz Sep 27 '16 at 16:56

score -1 · Answer 3 · answered Jan 01 '16 at 09:14

Here's what __atomic_base does when you read the atomic through its implicit conversion (as in int b_local = b;):

  operator __pointer_type() const noexcept
  { return load(); }

  _GLIBCXX_ALWAYS_INLINE __pointer_type
  load(memory_order __m = memory_order_seq_cst) const noexcept
  {
    memory_order __b = __m & __memory_order_mask;
    __glibcxx_assert(__b != memory_order_release);
    __glibcxx_assert(__b != memory_order_acq_rel);

    return __atomic_load_n(&_M_p, __m);
  }

As per the GCC docs on builtins like __atomic_load_n:

https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html

"An atomic operation can both constrain code motion and be mapped to hardware instructions for synchronization between threads (e.g., a fence). To which extent this happens is controlled by the memory orders, which are listed here in approximately ascending order of strength. The description of each memory order is only meant to roughly illustrate the effects and is not a specification; see the C++11 memory model for precise semantics.


__ATOMIC_RELAXED
    Implies no inter-thread ordering constraints.
__ATOMIC_CONSUME
    This is currently implemented using the stronger __ATOMIC_ACQUIRE memory order because of a deficiency in C++11's semantics for memory_order_consume.
__ATOMIC_ACQUIRE
    Creates an inter-thread happens-before constraint from the release (or stronger) semantic store to this acquire load. Can prevent hoisting of code to before the operation.
__ATOMIC_RELEASE
    Creates an inter-thread happens-before constraint to acquire (or stronger) semantic loads that read from this release store. Can prevent sinking of code to after the operation.
__ATOMIC_ACQ_REL
    Combines the effects of both __ATOMIC_ACQUIRE and __ATOMIC_RELEASE.
__ATOMIC_SEQ_CST
    Enforces total ordering with all other __ATOMIC_SEQ_CST operations. "

So, if I'm reading this right, it does "constrain code motion", which I read to mean prevent reordering. But I could be misinterpreting the docs.
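As a rough illustration of what the library code above boils down to, the same two reads can be written directly against the builtin. This is a sketch that assumes GCC or Clang (the __atomic builtins are compiler extensions, not standard C++); the variables a_raw and b_raw and the helper load_b_then_a are hypothetical.

```cpp
#include <utility>

// Plain ints that are only ever accessed through the __atomic builtins.
int a_raw = 10;
int b_raw = 20;

// Hypothetical helper mirroring func(): load b, then a, both with __ATOMIC_SEQ_CST.
std::pair<int, int> load_b_then_a() {
    int b_local = __atomic_load_n(&b_raw, __ATOMIC_SEQ_CST);
    int a_local = __atomic_load_n(&a_raw, __ATOMIC_SEQ_CST);
    return {b_local, a_local};
}
```

Whether the two loads may be reordered is still decided by the C++ memory model, as the quoted documentation itself points out; the builtin is only the mechanism through which the library requests the chosen memory order.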

He's not asking what the code happens to do on some particular platform. He's asking what the language says this code can or cannot do. — David Schwartz, Sep 26 '16 at 10:38

score -1 · Answer 4 · answered Jan 01 '16 at 09:22

Yes, I think the compiler can reorder them, in addition to applying several other optimizations. Please check the following resource: Atomic vs. Non-Atomic Operations

If you are still concerned about this issue, try using mutexes, which will certainly prevent the memory reordering.
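A minimal sketch of the mutex approach, assuming that every reader and writer takes the same lock; the names a_plain, b_plain, update and snapshot are hypothetical. With the lock held, the order of the two reads inside the critical section no longer matters, because no update can land between them.

```cpp
#include <mutex>
#include <utility>

// Plain (non-atomic) values, protected by the mutex below.
int a_plain = 10;
int b_plain = 20;
std::mutex values_mutex;

// Hypothetical writer: changes both values as one unit.
void update(int new_a, int new_b) {
    std::lock_guard<std::mutex> lock(values_mutex);
    a_plain = new_a;
    b_plain = new_b;
}

// Hypothetical reader: returns (b, a) as they were at a single point in time.
std::pair<int, int> snapshot() {
    std::lock_guard<std::mutex> lock(values_mutex);
    return {b_plain, a_plain};
}
```

The trade-off is that readers now block each other and any writer, which the atomic version in the question avoids.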
