is `memory_order_relaxed` necessary to prevent partial reads of atomic stores

Question

Suppose thread 1 is doing atomic stores on a variable v using memory_order_release (or any other order) and thread 2 is doing atomic reads on v using memory_order_relaxed.

It should be impossible to have partial reads in this case. An example of partial reads would be reading the first half of v from the latest value and the second half of v from the old value.

1. If thread 2 now reads v without using atomic operations, can we have partial reads in theory?
2. Can we have partial reads in practice? (Asking because I think this shouldn't matter on most processors, but I'm not sure.)

Peter Cordes · Accepted Answer · 2019-10-20T18:42:31.370

2

For 1. how do you propose doing that?

atomic<T> is a class template whose implicit conversion to T (operator T()) behaves like .load(mo_seq_cst), so a plain-looking read of v is still an atomic load. That makes tearing impossible. A seq_cst atomic access is like a relaxed one plus some ordering guarantees.

The template also overloads operators like ++ to do an atomic .fetch_add(1, mo_seq_cst). (Or for pre-increment, 1+fetch_add to produce the already-incremented value).
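As a minimal sketch of the two paragraphs above, the following pairs each overloaded operator with the explicit member call it corresponds to. The function name `operators_vs_explicit_calls` is made up for illustration and is not part of the original answer.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper (not from the answer): each operator form is followed
// by the explicit member call it is equivalent to.
int operators_vs_explicit_calls() {
    std::atomic<int> v{0};

    int a = v;                                   // implicit conversion: seq_cst load
    int b = v.load(std::memory_order_seq_cst);   // the same load, spelled out
    assert(a == 0 && b == 0);
    (void)a; (void)b;                            // silence warnings if NDEBUG is set

    v++;                                         // atomic RMW, v becomes 1
    v.fetch_add(1, std::memory_order_seq_cst);   // the same RMW, v becomes 2

    int pre = ++v;                               // yields the new value, 3
    int pre_explicit =
        v.fetch_add(1, std::memory_order_seq_cst) + 1;  // yields the new value, 4
    assert(pre == 3 && pre_explicit == 4);
    (void)pre; (void)pre_explicit;

    // A weaker order has to be requested explicitly.
    return v.load(std::memory_order_relaxed);
}
```

None of these forms gives a way to read v non-atomically, which is why the question of "reading without atomic operations" needs some other mechanism.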

Of course, if you look at the bytes of the object-representation of atomic<T> by reading it with non-atomic char* (e.g. with memcpy(&tmp, &v, sizeof(int))), that's UB if another thread is modifying it. And yes, you can get tearing in practice, depending on how you do it.

Tearing is more likely for objects too large to be lock-free, but it is also possible on some implementations for smaller ones, e.g. for 8-byte objects on a 32-bit system, which can implement 8-byte atomicity with special instructions but will normally just use two 32-bit loads for a plain access.
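One way to see which category a type falls into on a given target is to ask the implementation, roughly as sketched below. This assumes C++17 for `is_always_lock_free`; the `Big` struct and the constant names are hypothetical. The result only describes how the atomic accesses are implemented. It says nothing about whether a plain load of the same bytes happens to be tear-free.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical 64-byte payload, used only for illustration.
struct Big { char bytes[64]; };

// True where the target has a lock-free 8-byte atomic (possibly via special
// instructions on a 32-bit machine); target-dependent.
constexpr bool u64_always_lock_free =
    std::atomic<std::uint64_t>::is_always_lock_free;

// Expected to be false on mainstream targets: no single instruction
// loads or stores 64 bytes atomically, so the library falls back to a lock.
constexpr bool big_always_lock_free =
    std::atomic<Big>::is_always_lock_free;
```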

e.g. 32-bit x86 where an atomic 8-byte load can be done with SSE and then bouncing that back to integer regs. Or lock cmpxchg8b. Compilers don't do that when they just want two integer registers.

But many 32-bit RISCs that provide atomic 8-byte loads have a double-register load that produces 2 output registers from one instruction. e.g. ARM ldrd or MIPS ld. Compilers do use these to optimize aligned 8-byte loads even when atomicity isn't the goal, so you'd probably "get lucky" and not see tearing anyway.

Small objects would typically happen to be atomic anyway; see Why is integer assignment on a naturally aligned variable atomic on x86?

Of course the non-atomic access wouldn't assume that the value could change asynchronously, so a loop could use a stale value indefinitely. Unlike a relaxed atomic, which on current compilers is like volatile in that it always re-accesses memory. (Via coherent hardware cache of course, just not keeping the value in a register.)
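A small sketch of the difference, with a made-up helper name `spin_until_set`: the relaxed load is performed again on every iteration, so a store from another thread is eventually seen. The comment describes what a compiler is allowed to do with a plain bool in the same loop.

```cpp
#include <atomic>

// Hypothetical helper: polls until some other thread stores true,
// and returns how many loads it performed.
unsigned long spin_until_set(const std::atomic<bool>& flag) {
    unsigned long loads = 1;  // the loop condition is evaluated at least once
    while (!flag.load(std::memory_order_relaxed)) {
        ++loads;              // each failed check is followed by another real load
    }
    return loads;
}

// With a plain `const bool& flag` and `while (!flag) {}`, a concurrent write
// would be a data race (UB), and the compiler may hoist the load out of the
// loop, effectively turning it into:
//     if (!flag) for (;;) {}
// so the waiting thread could spin on a stale value forever.
```

A writer thread would pair this with something like `flag.store(true, std::memory_order_relaxed)`; relaxed is enough here only because nothing else is being published through the flag.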

edited Oct 20 '19 at 18:42

answered Oct 20 '19 at 18:15

Peter Cordes


Sorry I have missed that part. OTOH I think this is possible to do in C, which also has `stdatomic`. Would the situation be similar to "reading with non-atomic `char*`" then? – user869887 Oct 20 '19 at 18:28
And it sounds like for small objects (e.g. words), tearing would be unlikely in practice? – user869887 Oct 20 '19 at 18:29
@user869887: no, `_Atomic int` in C11 effectively overloads read/write and RMW operators the same way, but inside the compiler because the language itself doesn't support that kind of thing so it can't be a library. Similar to how `volatile` makes accesses special. – Peter Cordes Oct 20 '19 at 18:39
@user869887: right, small objects would tend to be atomic unless the `memcpy` actually compiled to separate byte loads. Added a link in the answer – Peter Cordes Oct 20 '19 at 18:43
1

@user869887 -- when you're writing multi-threaded code, "unlikely in practice" means "will happen, probably when you're giving a demo to your most valuable customer". – Pete Becker Oct 20 '19 at 19:27
@PeterCordes -- "won't tear" isn't the same as atomic. It's one of the **three** things that C++ atomics guarantee. See [this answer](https://stackoverflow.com/questions/14624776/can-a-bool-read-write-operation-be-not-atomic-on-x86/14625122#14625122). – Pete Becker Oct 20 '19 at 19:29
@PeteBecker: lol, yeah. But in this case it's a compile-time decision. OTOH it could depend on surrounding code, so a unit test might find no tearing but the asm for the real use-case might be unsafe and harder to test. Fortunately the OP isn't suggesting you'd ever do this, just wondering what *could* happen. – Peter Cordes Oct 20 '19 at 19:30
@PeteBecker: See last paragraph of this answer. Note "won't tear" is basically the definition of atomicity. But yes, `std::atomic` and C11 `_Atomic` give you *more* than just atomicity. `mo_relaxed` gives you no ordering. But it does always give you the third thing you mentioned: being sort of like `volatile` and assuming that other threads may have changed the value, so you can't CSE multiple reads. (The standard doesn't actually rule that out, but compilers don't for QOI reasons: [Why don't compilers merge redundant std::atomic writes?](//stackoverflow.com/q/45960387)). – Peter Cordes Oct 20 '19 at 19:35
@PeteBecker I agree. Not going to cut corners here. Just checking my understanding : ) – user869887 Oct 20 '19 at 19:36
@PeterCordes Just to double check, if I do some operation on an atomic variable without specifying the ordering (e.g. `v++`), then `memory_order_seq_cst` is used in both C/C++? What if `v` is a pointer and an assignment is done (`v = ptr`)? I imagine this is the case as non-explicit functions use `memory_order_seq_cst` by default, and the first part of your answer seems to suggest the same. – user869887 Oct 20 '19 at 23:19
@user869887: yes, seq_cst is the default for everything; you have to explicitly specify a weaker memory order if you want less ordering. `atomic<T*>` doesn't make anything different; it's still an atomic seq-cst assignment of the pointer *value* to the atomic pointer object. Or do you mean `atomic<T> *v`? In that case it's just a normal assignment of a pointer *to* an atomic object, and `v` is not itself atomic. – Peter Cordes Oct 20 '19 at 23:33
@PeterCordes I meant `atomic<T*>`. Sorry for the confusion and thanks again! – user869887 Oct 20 '19 at 23:37
1

In C++20, you could do what the OP suggests (access a value both atomically and not) with `std::atomic_ref`. – BeeOnRope Oct 21 '19 at 17:24
1

@BeeOnRope: Yup. I was prepared for their answer to "how do you propose doing that" to be "with `std::atomic_ref`", but since it wasn't, bringing it up might have been more confusing. Side note, I think `std::atomic_ref` is going to introduce a lot of confusion on ABIs where `alignof(T) < alignof(atomic<T>)` for types like `uint64_t`; people will just expect that they can atomic_ref an arbitrary `uint64_t`, but if the implementation isn't careful that can introduce tearing in loads/stores. Related: GCC still hasn't fixed their C11 `_Atomic uint64_t` to enforce 8-byte alignment inside structs. – Peter Cordes Oct 21 '19 at 18:13
