12

I was reading the answer to this question regarding the volatile keyword:

https://stackoverflow.com/a/2485177/997112

The person says:

The solution to preventing reordering is to use a memory barrier, which indicates both to the compiler and the CPU that no memory access may be reordered across this point. Placing such barriers around our volatile variable access ensures that even non-volatile accesses won't be reordered across the volatile one, allowing us to write thread-safe code.

However, memory barriers also ensure that all pending reads/writes are executed when the barrier is reached, so it effectively gives us everything we need by itself, making volatile unnecessary. We can just remove the volatile qualifier entirely.

How is this "memory barrier" implemented in C++?

EDIT:

Could someone give a simple code example please?

ks1322
user997112
  • @HansPassant there is no simple example of a C++ memory barrier in the question you linked to – user997112 Jul 27 '13 at 13:39
  • Who promised it was going to be simple? This is C++, it is supposed to be hard. If it wasn't then anybody could be a C++ programmer :) At least the word "Memory Barrier" in the question title ought to be a hint that it is the exact same question. – Hans Passant Jul 27 '13 at 13:54

3 Answers

12

Memory barriers are trivial to use in C++11:

std::atomic<int> i;

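All accesses to i use std::memory_order_seq_cst by default, so the compiler emits whatever barrier instructions the target needs to enforce that ordering.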
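Since the question asks for a simple code example, here is a minimal sketch of publishing plain data through a std::atomic flag with the default ordering. The names `payload`, `ready`, `producer`, `consumer` and `publish_and_read` are hypothetical and chosen only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                 // plain, non-atomic data
std::atomic<bool> ready{false};  // atomic flag that publishes the data

void producer() {
    payload = 42;       // 1. write the data
    ready.store(true);  // 2. publish it (default order: seq_cst)
}

int consumer() {
    // Spin until the flag is seen; the load also defaults to seq_cst.
    while (!ready.load()) {
        std::this_thread::yield();
    }
    return payload;  // the write to payload happens-before this read
}

// Hypothetical driver: runs one producer and one consumer thread.
int publish_and_read() {
    payload = 0;
    ready.store(false);
    int seen = 0;
    std::thread reader([&seen] { seen = consumer(); });
    std::thread writer(producer);
    writer.join();
    reader.join();
    assert(seen == 42);
    return seen;
}
```

No volatile qualifier and no hand-written barrier appear here; the ordering comes entirely from the atomic store and load. On GCC or Clang this typically needs to be built with -pthread.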

Pete Becker
  • Indeed, still recommend this video imo https://youtu.be/ZQFzMfHIxng ;) – jaques-sam Oct 07 '21 at 09:56
  • ... unless you specifically don't care about an access being ordered wrt. other operations on other objects, in which case you can `i.load(std::memory_order_relaxed)`. (The default is `seq_cst`, but usually `acquire` and `release` are all that's needed, and can be significantly cheaper, especially for stores on x86) – Peter Cordes Nov 17 '21 at 06:15
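The `std::memory_order_relaxed` case mentioned in the comment above can be illustrated with a shared counter, where only the atomicity of each increment matters and not its ordering relative to other objects. This is a sketch under assumptions: `hits`, `record_hits` and `count_hits` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

std::atomic<int> hits{0};

// Each increment is atomic, but imposes no ordering on other memory accesses.
void record_hits(int n) {
    for (int k = 0; k < n; ++k) {
        hits.fetch_add(1, std::memory_order_relaxed);
    }
}

// Hypothetical driver: several threads bump the counter concurrently.
int count_hits(int threads, int per_thread) {
    hits.store(0, std::memory_order_relaxed);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back(record_hits, per_thread);
    }
    for (auto& th : pool) {
        th.join();  // join() synchronizes, so the final load sees every increment
    }
    int total = hits.load(std::memory_order_relaxed);
    assert(total == threads * per_thread);
    return total;
}
```

Relaxed is only appropriate when nothing else is being published through the variable; a flag that guards other data would need at least release on the store and acquire on the load.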
8

This is very hardware-dependent. From the Linux kernel's fairly long documentation on memory barriers:

The Linux kernel has eight basic CPU memory barriers:

TYPE                MANDATORY               SMP CONDITIONAL
===============     ======================= ===========================
GENERAL             mb()                    smp_mb()    
WRITE               wmb()                   smp_wmb()
READ                rmb()                   smp_rmb()   
DATA DEPENDENCY     read_barrier_depends()  smp_read_barrier_depends()

Let's take one of them in particular: smp_mb(). If you open arch/x86/um/asm/barrier.h, you will find that when CONFIG_SMP is defined,

#define smp_mb()    mb()

And if you scroll up, you can see that depending on the platform, mb has different implementations:

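// on x86-32: use mfence if the CPU has SSE2 (X86_FEATURE_XMM2),
// otherwise fall back to a locked add on the stack
#define mb()        alternative("lock; addl $0,0(%%esp)", "mfence", X86_FEATURE_XMM2)
// on x86-64: mfence is always available; the "memory" clobber also
// stops the compiler from reordering accesses across the barrier
#define mb()        asm volatile("mfence" : : : "memory")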

The differences between these two implementations are discussed in more detail in this thread. I hope this helps.
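Outside the kernel, a standalone barrier can be written portably with C++11 fences instead of hand-written assembly. The following is a sketch under assumptions, not the kernel's code: `data`, `flag`, `writer`, `reader` and `run_fence_demo` are hypothetical names, and `std::atomic_thread_fence` takes the place of macros such as mb(), leaving the choice of instruction (if any is needed) to the compiler.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int data = 0;                   // plain data protected by the fences
std::atomic<bool> flag{false};  // accessed with relaxed ordering only

void writer() {
    data = 1;
    // Release fence: the write to data cannot be reordered after the
    // relaxed store to flag that follows.
    std::atomic_thread_fence(std::memory_order_release);
    flag.store(true, std::memory_order_relaxed);
}

int reader() {
    while (!flag.load(std::memory_order_relaxed)) {
        std::this_thread::yield();
    }
    // Acquire fence: the read of data cannot be reordered before the
    // relaxed load of flag above.
    std::atomic_thread_fence(std::memory_order_acquire);
    return data;
}

// Hypothetical driver: one writer thread, one reader thread.
int run_fence_demo() {
    data = 0;
    flag.store(false);
    int seen = 0;
    std::thread r([&seen] { seen = reader(); });
    std::thread w(writer);
    w.join();
    r.join();
    assert(seen == 1);
    return seen;
}
```

The fences pair up through the relaxed accesses to flag: once the reader observes true, everything written before the release fence is visible after the acquire fence.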

Community
qdii
  • Are these system calls? So the programmer cannot create the memory barrier themselves? – user997112 Jul 27 '13 at 13:39
  • 5
    They are not system calls (btw, you can find all system calls by running `man 2 syscalls`). The C++ compiler will replace your `mb()` call with the corresponding assembly instruction in the compiled code. You could create a memory barrier yourself, which would consist of executing some assembly instruction, just like the Linux source code does. – qdii Jul 27 '13 at 13:44
2

Typically, there are "intrinsic functions": special functions whose behaviour the compiler knows about (in particular, that they are memory barriers). The names vary from compiler to compiler (and sometimes between architectures for the same compiler).

For example, MSVC uses _ReadBarrier, _WriteBarrier and _ReadWriteBarrier.

On x86 it would produce an lfence, sfence or mfence instruction, which act as "load", "store" and "all memory operations" barriers respectively. In other words, an lfence is a barrier for memory read operations, an sfence is a barrier for memory write operations, and an mfence is a barrier for both read and write operations.

ZXX
Mats Petersson
  • That's not actually what `lfence` does; its only memory-ordering effect is to block reordering of SSE4.1 movntdqa loads from weakly-ordered WC memory (e.g. video RAM) from reordering with later loads/stores. It also blocks OoO exec of later instructions from starting until all earlier instructions have completed *executing* (but for stores, not necessarily committed from the store buffer). x86 already disallows memory reordering for normal loads/stores other than StoreLoad, so compilers can get LoadLoad and LoadStore ordering just by not doing compile-time reordering. – Peter Cordes Nov 17 '21 at 06:26
  • The MSDN links are dead, but those intrinsics *don't* emit those instructions; it seems they only block compile-time reordering. (And some of them make the compiler not do constant-propagation, like GNU C `asm("" ::: "memory")`, so it actually reloads.) https://godbolt.org/z/Gneborsxf shows no asm instructions, not even for `_ReadWriteBarrier` in MSVC 19.28 -O2 (x86 or x64). See also [When should I use \_mm\_sfence \_mm\_lfence and \_mm\_mfence](https://stackoverflow.com/a/50780314) for actual intrinsics for those instructions, and why you normally only want compiler barriers. – Peter Cordes Nov 17 '21 at 06:30
  • 1
    @PeterCordes is correct. The _Read[Write]Barrier intrinsics are compiler barriers, not memory barriers. The Win32 intrinsic for a memory barrier is MemoryBarrier. The docs (https://learn.microsoft.com/en-us/windows/win32/api/winnt/nf-winnt-memorybarrier) offer a clarification on the former functions: - `The _ReadBarrier, _WriteBarrier, and _ReadWriteBarrier compiler intrinsics prevent compiler re-ordering only.` – Mark Ingram Mar 05 '22 at 12:03
  • @MarkIngram - what is intrinsic/system memory barrier on Linux x64? – Boppity Bop Apr 03 '22 at 17:37
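The distinction drawn in the comments above, between a compiler barrier and a hardware memory barrier, can be sketched in portable C++11, which also works on Linux x86-64 without compiler-specific intrinsics. The names `a`, `b`, `compiler_barrier_only`, `hardware_barrier` and `run_both` are hypothetical. Which instructions are emitted depends on the compiler and target, so the comments describe typical behaviour only.

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> a{0};
std::atomic<int> b{0};

// Compiler barrier: stops the compiler from reordering the two stores,
// but emits no CPU fence instruction.
void compiler_barrier_only() {
    a.store(1, std::memory_order_relaxed);
    std::atomic_signal_fence(std::memory_order_seq_cst);
    b.store(1, std::memory_order_relaxed);
}

// Full memory barrier: constrains both the compiler and the CPU.
// On x86 this typically compiles to mfence or a locked instruction.
void hardware_barrier() {
    a.store(2, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);
    b.store(2, std::memory_order_relaxed);
}

// Hypothetical driver: single-threaded, only shows that both forms compile and run.
int run_both() {
    compiler_barrier_only();
    hardware_barrier();
    assert(a.load() == 2 && b.load() == 2);
    return a.load() + b.load();
}
```

Comparing the generated assembly for the two functions (for example on godbolt.org) is the easiest way to see the difference on a given compiler.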