In C++11 or later, you don't need to (explicitly) protect your variable with a mutex or memory barriers. You can use `std::atomic` to create an atomic variable. Writes to that variable are guaranteed to become visible to other threads.
```cpp
std::atomic<int> a{0}; // initialized explicitly; before C++20 a default-constructed atomic is not
// thread 1:
a = 1;
// thread 2 (later):
std::cout << a; // shows `a` has the value 1.
```
Of course, there's a little more to it than that. For example, there's no guarantee that `std::cout` works atomically, so output written from more than one thread can interleave. You will probably have to protect it (with a mutex, say) if you write from more than one thread.
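As a rough sketch of the two points above, the following uses an atomic counter with no lock around it, plus a mutex that only serializes the output. The names `hits`, `cout_mutex`, `worker`, and `run_workers` are made up for illustration; they are not from any library.

```cpp
#include <atomic>
#include <cassert>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

std::atomic<int> hits{0};  // shared counter; needs no mutex of its own
std::mutex cout_mutex;     // hypothetical lock that keeps each output line intact

// Each worker bumps the counter atomically, then prints under the mutex.
void worker(int id) {
    int seen = ++hits;  // atomic read-modify-write; yields the new value
    std::lock_guard<std::mutex> lock(cout_mutex);
    std::cout << "worker " << id << " saw " << seen << '\n';
}

// Starts n workers, waits for them, and returns the final count.
int run_workers(int n) {
    hits = 0;
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) threads.emplace_back(worker, i);
    for (auto& t : threads) t.join();
    int total = hits.load();
    assert(total == n);  // no increment is lost, even without a lock on hits
    return total;
}
```

Calling `run_workers(4)` returns 4. The order of the printed lines varies from run to run, but each line comes out whole because the chained `<<` calls happen while the mutex is held.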
It's then up to the compiler/standard library to figure out the best way to handle the atomicity requirements. On a typical architecture that ensures cache coherence, it may mean nothing more than "don't allocate this variable in a register". It could impose memory barriers, but is only likely to do so on a system that really requires them.
On real-world C++ implementations where `volatile` worked as a pre-C++11 way to roll your own atomics (i.e. all of them), no barriers are needed for inter-thread visibility, only for ordering with respect to operations on other variables. Most ISAs do need special instructions or barriers for the default `memory_order_seq_cst`.
On the other hand, explicitly specifying memory ordering (especially `acquire` and `release`) for an atomic variable may allow you to optimize the code a bit. By default, an atomic uses sequentially consistent ordering, which basically acts like there are barriers before and after each access. In a lot of cases, though, you only really need one or the other, not both. In those cases, explicitly specifying the memory ordering lets you relax the ordering to the minimum you actually need, giving the compiler more room to optimize.
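For instance, the usual "publish some data, then set a flag" pattern only needs a release on the writing side and an acquire on the reading side. This is a minimal sketch; `payload`, `ready`, `producer`, `consumer`, and `publish_and_read` are illustrative names, not part of any API.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                 // plain, non-atomic data
std::atomic<bool> ready{false};  // flag that publishes payload

void producer() {
    payload = 42;                                  // ordinary write
    ready.store(true, std::memory_order_release);  // earlier writes can't move below this
}

int consumer() {
    // Later reads can't move above the acquire load that observes `true`.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;  // safe: happens-after the write in producer()
}

// Runs one producer and one consumer and returns what the consumer read.
int publish_and_read() {
    payload = 0;
    ready.store(false);
    int result = 0;
    std::thread c([&result] { result = consumer(); });
    std::thread p(producer);
    p.join();
    c.join();
    assert(result == 42);  // guaranteed by the release/acquire pair
    return result;
}
```

The store needs only the "before" half of the ordering and the load only the "after" half, which is exactly the case where the default `seq_cst` asks for more than the code requires.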
(Not all ISAs actually need separate barrier instructions even for `seq_cst`; notably AArch64 just has a special interaction between `stlr` and `ldar` to stop seq_cst stores from reordering with later seq_cst loads, on top of acquire and release ordering. So it's as weak as the C++ memory model allows, while still complying with it. But weaker orders, like `memory_order_acquire` or `relaxed`, can avoid even that blocking of reordering when it's not needed.)
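When nothing else depends on the order of the accesses, such as a plain event counter, `memory_order_relaxed` is enough: the operation is still atomic, but it asks for no ordering against other variables. A small sketch, with `events`, `record_events`, and `count_in_parallel` as made-up names:

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

std::atomic<long> events{0};  // statistics counter; only atomicity matters

// Adds `count` events without requesting any ordering guarantees.
void record_events(int count) {
    for (int i = 0; i < count; ++i) {
        events.fetch_add(1, std::memory_order_relaxed);
    }
}

// Runs num_threads recorders in parallel and returns the total.
long count_in_parallel(int num_threads, int per_thread) {
    events.store(0, std::memory_order_relaxed);
    std::vector<std::thread> threads;
    for (int i = 0; i < num_threads; ++i) {
        threads.emplace_back(record_events, per_thread);
    }
    for (auto& t : threads) t.join();  // join() synchronizes with each thread's end
    long total = events.load(std::memory_order_relaxed);
    assert(total == static_cast<long>(num_threads) * per_thread);
    return total;
}
```

The final relaxed load still sees every increment because `join()` provides the synchronization; the relaxed operations only have to be indivisible.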