
I understand that the compiler may choose to hold a value in cache, and that I can ensure that it reads the latest value from memory every time by using volatile, but are there other ways I can ensure that the latest value is being read without adding a type qualifier?

  • 4
    If you are concerned about making sure you have the correct value in a multithreaded program where the variable can be changed in another thread, you should be working with atomic variables. – Christian Gibbons Jun 17 '22 at 16:46
  • Not without using implementation-specific operations. – Barmar Jun 17 '22 at 16:46
  • 3
    @ChristianGibbons I don't think atomic will prevent it from using cache. – Barmar Jun 17 '22 at 16:46
  • In C++, if you are doing multithreading, `volatile` is **not** valid for this. You need a mutex, an atomic variable, or some other thread synchronization technique around the variable to avoid a data race. – NathanOliver Jun 17 '22 at 16:46
  • @NathanOliver That still won't force it to bypass cache. It solves a different problem. – Barmar Jun 17 '22 at 16:47
  • 2
    @Barmar I'm not certain that using cache is the actual issue at hand, but rather correctness of the value. The question was tagged `multithreading`, so I'm working under the assumption that syncing across threads is the real issue. – Christian Gibbons Jun 17 '22 at 16:49
  • 1
    @ChristianGibbons There are generally 2 issues with shared variables across threads: atomic access to the memory, and forcing cache or register updates from the memory. Mutexes solve the first problem, volatile the second. – Barmar Jun 17 '22 at 16:51
  • The compiler does not choose to hold anything in cache; that is done at the MMU level. So there is no way to prevent it at the compiler level. – Eugene Sh. Jun 17 '22 at 16:52
  • 1
    @EugeneSh. But compilers generate code that copies memory into registers, and may not reload the register. – Barmar Jun 17 '22 at 16:53
  • @Barmar Sure, if it is what the OP is asking about. – Eugene Sh. Jun 17 '22 at 16:53
  • What's wrong with using `volatile`? This is exactly what it's for if I understand the question. – Barmar Jun 17 '22 at 16:54
  • 1
    @Barmar And atomic variables, I believe, solve both problems. Sometimes they're implemented with a mutex, and sometimes they're implemented with atomic CPU instructions. – Christian Gibbons Jun 17 '22 at 16:54
  • In addition to atomic access (avoiding data races between threads), you must ensure that for two threads A and B on different cpus (e.g. 1 and 2), when A writes a value on cpu 1, then, when B fetches the value, the _cache_ has been updated on cpu 2 to reflect A's action. `stdatomic.h` primitives will do whatever cache flush/sync is required by the architecture (e.g. `arm` needs the `dmb` instruction to sync cache). (e.g.) we have `int comm;` A should do: `atomic_store(&comm, 23);` and B should do: `int local = atomic_load(&comm);` – Craig Estey Jun 17 '22 at 16:54
  • @Barmar I just hear around that it's bad practice, and that I shouldn't be considering `volatile` outside of very specific situations like embedded programming. I thought wanting to make sure that the value I'm querying is correct considering other threads have modified it is a basic thing for concurrency. – John Friendson Jun 17 '22 at 16:57
  • 3
    @JohnFriendson `volatile` is for when something external to your program can change the value of a variable, like some sensor in an embedded system. If you are sharing a variable inside your program (multithreading) then you need something else. – NathanOliver Jun 17 '22 at 17:02
  • 4
    `volatile` is useful in embedded systems to read the value of registers that can be changed outside of your control loop, such as registers tied to GPIO or other peripherals. It forces the compiler to read the value currently stored there, rather than assuming it hasn't changed because there's nothing in your code to change it. What it does not do, however, is provide atomic access, which is necessary for syncing across threads. – Christian Gibbons Jun 17 '22 at 17:02
  • I believe this is an issue distinct from preventing data races and managing synchronization. `volatile` addresses the issue that values held in cache may not reflect values held in memory when values held in memory are being changed by different threads or in an unpredictable manner. May I just cast to `volatile` before accessing to ensure that it's the latest? – John Friendson Jun 17 '22 at 17:08
  • 2
    What type of variables are you worried about, @JohnFriendson? If you're dealing with integers or other types that can be handled atomically, then using atomic types may be an answer. If you're dealing with character strings or structures, then you are probably forced to use mutexes or something similar. Memory barriers ([`pthread_barrier_init()`](https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_barrier_init.html) et al) may help — I've not seen them mentioned yet. Or they might be 'red herrings'. – Jonathan Leffler Jun 17 '22 at 17:13
  • What does "latest value" mean, precisely? – user253751 Jun 17 '22 at 17:20
  • 4
    `volatile` has nothing to do with caches. The value will be stored in caches on almost all mainstream platforms anyway, even if it is volatile (otherwise it can be stored in registers). `volatile` prevents the compiler from storing the value in a register: it needs to read/write the value every time, as it may have changed. Working on data modified by other threads causes undefined behaviour. The previous comments are pretty clear: for a multithreaded program, use atomic variables/operations, and for an embedded program communicating with devices (or any external actor like a debugger) use `volatile`. – Jérôme Richard Jun 17 '22 at 17:22
  • 1
    What problem are you trying to solve? – HolyBlackCat Jun 17 '22 at 17:27
  • 1
    `volatile` prevents re-using a value that the compiler optimizer has cached in a CPU register. It was not meant for dealing with values held in a CPU memory cache. 2 different applications of the word cache. It was designed & intended for use in accessing memory mapped peripherals in low level/embedded software. It was not intended for dealing with multi-threading issues. It was (ab)used for multi-threading before alternatives were provided. It has no place today in dealing with what you are trying to deal with. We now have stuff designed for that purpose. – Avi Berger Jun 17 '22 at 17:57
  • 2
    Does this answer your question? [When to use volatile with multi threading?](https://stackoverflow.com/questions/4557979/when-to-use-volatile-with-multi-threading) – Avi Berger Jun 17 '22 at 18:09
  • 3
    Yet another phrasing: `volatile` indicates that the act of reading or writing to the variable may have an effect in itself, and must not be optimized away, or done more times than the source code indicates. – hyde Jun 17 '22 at 18:11

2 Answers

3

You really need to change your concept of what `volatile` means. The easiest way to think of `volatile` is as "don't optimize this".

Every action taken on a `volatile` must be observable, which means it will always have to be read from "memory" and written back to "memory". But this was defined long before there was such a thing as caches, and `volatile` has no effect on what the hardware does; it only governs what the compiler does. It forces the compiler to keep every read and every write of the variable in the output, in the right order relative to other observable actions. It doesn't change how the variable is accessed, only that it is accessed. In fact, on modern hardware you have to combine access to a `volatile` either with specialized page table entries that largely bypass the caches, or with memory barriers and cache flushes, to make it work right.

What you actually need for multithreading is `std::atomic`. This includes all the necessary logic to deal with the different memory models on different architectures.

Goswin von Brederlow
3

Outside of using volatile, how can I assure that I'm querying the latest value from memory?

You can't be assured that you're actually accessing memory, at least not in a portable way. Even if you use `std::atomic` in C++ or atomic variables (e.g. `atomic_int` in C), there is no guarantee that the value will come from memory and not cache.

There are 4 cases:

  • it's not atomic, so there's no guarantee at all

  • it is atomic and the target platform isn't cache coherent (e.g. some ARM CPUs), and the compiler probably has to ensure the data comes from memory, as that's the only way to ensure atomicity.

  • it is atomic and the target platform is cache coherent (e.g. 80x86 CPUs) and therefore you probably have no reason to care if the data came from cache or memory in the first place

  • you actually do care if the data came from cache or memory (e.g. you're writing a tool to benchmark RAM bandwidth, or test for faulty RAM). In this case you're going to have to resort to non-portable tricks - exploiting target specific cache eviction policy, using inline assembly, asking the OS to make the memory "uncached", etc.

Brendan