
This is a question about one writer and multiple simultaneous readers.

I expect this to ruffle some feathers and I'll probably get downvoted just for daring to ask this question, but I want to understand how it works. I know what mutexes and atomics are, no need to educate me on that.

Let's say I have a memory location accessible to multiple threads (a global variable, or a pointer I shared around). It's the same size as the architecture's word; let's say it's a single unsigned integer of 8 bytes on a 64-bit system. It's set to 0.

Let's assume I have a thousand threads reading it in a loop. If it's 0 they do some important thing, and if it's 1 they do another important thing, and if it's neither 0 nor 1 they launch a nuclear missile.

Then one thread (exactly one, not multiple) overwrites this memory location with the value 1. What happens?

See... my theory is that nothing bad happens and that this is okay. There is no data corruption. There is nothing halfway between a 0 and a 1. One cycle it says 0 and the next it says 1. No need for a mutex or an atomic. Am I right? And if not, why?

EDIT: The system asked me to explain why this is not the same as another question. The answer is because it's not the same as that question. If you're not sure how it's different, please read it again, particularly the parts that end with a question mark.

Alasdair
  • C or C++? These are two different languages with entirely different approaches to this. – tadman Mar 22 '21 at 13:14
  • "Let's assume I have a thousand threads reading it in a loop." What if we didn't because that's completely outside the realm of possibility. – tadman Mar 22 '21 at 13:15
  • @tadman I'm not asking you how to solve a problem, I'm asking whether I'm right. Technically it's not a C or a C++ question, it's an architecture question. – Alasdair Mar 22 '21 at 13:16
  • The behaviour here depends entirely on your ISA. It is not a function of C or C++ as far as I know, neither standard dictates what happens during a write vs. how other CPU cores perceive the data. On some huge NUMA systems it takes quite a while for a write to be reflected on all other cores, while on smaller systems there's cache coherency logic that kicks in and behaviour is much more predictable. There is no one answer here. – tadman Mar 22 '21 at 13:17
  • _"Technically it's not a C or a C++ question, it's an architecture question."_ @Alasdair if this is an architecture problem, _what architecture_? – Drew Dormann Mar 22 '21 at 13:17
  • The answer to this is **use atomic writes** or there's no guarantee of anything consistent happening at all. – tadman Mar 22 '21 at 13:17
  • If the answer is "nothing bad happens but the different threads may not get the updated value immediately due to caching or this or that" that's a good answer. The answer is not to use atomic writes because I didn't pose a problem to you. – Alasdair Mar 22 '21 at 13:18
  • It's not a matter of "what's there at the time of reading" but more about "what the `read` returns when the memory is busy", which, depending on how your memory works, could be very different from `0` or `1` – Adalcar Mar 22 '21 at 13:19
  • @Adalcar, yes this is my question: it can be different from 0 or 1..? How so? – Alasdair Mar 22 '21 at 13:20
  • In C++ the behavior of that program is undefined. Full stop. Note that "undefined behavior" means that the language definition does not tell you what the program does. So you can guess all you like, or you can analyze the underlying hardware and perhaps sort out what you think will happen. But the compiler is not required to generate sensible code when the behavior is undefined, and it is not required to generate the same code every time you compile it, especially if you change compiler settings or compiler versions. There is, simply, no guarantee that the behavior be consistent. – Pete Becker Mar 22 '21 at 13:25
  • As @DrewDormann mentioned, it is an architecture question: it would depend on how well your memory controller handles race conditions, how your kernel handles read errors and how your compiler treats those. This is why it's undefined behaviour: you're not supposed to do this because the return value could be a specific error from some obscure memory controller, and the C standard does not define those cases. – Adalcar Mar 22 '21 at 13:28
  • Does this answer your question? [C++11 introduced a standardized memory model. What does it mean? And how is it going to affect C++ programming?](https://stackoverflow.com/questions/6319146/c11-introduced-a-standardized-memory-model-what-does-it-mean-and-how-is-it-g) – Jeffrey Mar 22 '21 at 13:28
  • Of course in most cases any sensible compiler/kernel/memory controller will handle it properly; though the data will probably not carry to the other threads, as said in the answer, it will not be something other than 0 or 1. **However, there is no guarantee**, because it is not defined in the standard, and supposing you're working with very specific hardware/systems that didn't bother to implement that, you don't know what can happen. Your situation can never happen because any piece of code that gets close enough to secure stuff (like atomic bombs) is allowed exactly zero undefined behaviors. – Adalcar Mar 22 '21 at 13:36
  • This question needs focus. It is asking about specific behavior of _some computer-like architecture_, real or imagined, current or future. Related to C and C++ in some unspecified way. – Drew Dormann Mar 22 '21 at 13:51

2 Answers


> Then one thread (exactly one, not multiple) overwrites this memory location with the value 1. What happens?
>
> See... my theory is that nothing bad happens and that this is okay

You are mistaken.

If one thread writes to a memory location and you have no synchronization mechanisms whatsoever and didn't use atomics, you have no guarantees that the other threads will see the change, ever.

There is a whole stack of technology running under your `variable = 1;`, and pretty much every layer could swallow that update.

At the assembly level and at the CPU level, there is nothing forcing a write from the level 1 cache back to main memory.

If you have just one variable and just one change, it's not too bad. Some threads will see the updated value, some won't. It might take an arbitrarily long time for the update to propagate. You can still call that eventual consistency.

But as soon as you have two variables, there will be inconsistencies between the written and read values.

Put "memory model consistency C++" in your favorite search engine for more details.

Jeffrey
  • That's only half of his question, though. The more important part is: "could it go through any other value than 0 and 1?" – Adalcar Mar 22 '21 at 13:29
  • Cache protocols (eg. MESI) specifically do propagate the values, presuming you are on a cache coherent machine. If you are using threads, you are on a cache coherent machine. So, the value will (eventually) propagate, and will only be a zero or one. – mevets Mar 22 '21 at 14:10

Threads usually don't read/write directly from/to memory, but to their cache. The cache is a smaller but much faster memory.

For one of your reader threads to read a 1 instead of a 0, two things need to happen first:

  1. The writer thread must have written its cache back to memory.
  2. The reader thread must have reloaded its cache from memory.

When these events happen depends on the architecture, the OS, other processes, and your code.

Each of your reader threads will read 0 until it reads 1. The switch from 0 to 1 might not happen at all, and it can happen at different times for different threads.

This is what atomics are there for: they can enforce synchronization across the caches seen by different threads.

egladil86
  • So you are saying that answer is "nothing bad happens"? There is no corruption or crash or something? To be fair this must happen all the time by mistake. – Alasdair Mar 22 '21 at 13:41
  • And using an atomic means that all the threads will be notified of the change? That is to say atomic variables mean that a variable is not kept in the cache? – Alasdair Mar 22 '21 at 13:42
  • @Alasdair That, again, depends on your compiler and target architecture: if your code can run on a single CPU, the compiler may decide to keep it in the L3 cache where all your threads running on that CPU can read it. Either way, atomic variables only mean "you are certain everyone will know this has changed". Because different architectures work in different ways, there is not one single implementation for `atomic`. – Adalcar Mar 22 '21 at 13:58
  • "Threads usually don't read/write directly from/to memory, but to their cache" Are you saying there are some thread caches? – qrdl Mar 22 '21 at 15:52