There are two applications running on different cores. One application writes to a 4-byte shared memory location, and the other application, on another core, reads it. These two applications run simultaneously. This model works for me without using mutexes or semaphores. Are there any downsides to this? Even in this case, is a mutex or a semaphore required?
-
My question would be: how does the reader know the writer has finished writing? And how does the writer know the reader is done reading? If you can't answer that, you have a race condition. – erik258 Jan 17 '18 at 00:18
-
@DanFarrell The reader basically polls for the shared integer to change; if it changes, the reader knows it's an updated value. The writer need not know that the reader is done reading. – sanjeev rao Jan 17 '18 at 00:22
-
If there are consecutive writes, there is a chance that the reader might miss one. – Isuru H Jan 17 '18 at 08:36
2 Answers
TL:DR: It's safe in C/C++ if you do it with std::atomic<uint32_t> loads/stores using std::memory_order_relaxed. In asm it's safe on all modern architectures.
It's "safe" without a mutex on all "normal" CPU architectures, but it doesn't provide very much synchronization. You can only use it for simple stuff because of these limitations:
- The consumer can miss updates if two stores from the producer happen between reads (from the reader's perspective). Even if the consumer is normally faster than the producer, the reader could sleep because of a page fault or context switch. Unless you're running under a hard real-time OS, in which case you do have guarantees about maximum latency.
- The consumer can read the same value twice. If your data is naturally immune to the ABA problem (the producer never stores the same value twice in a row, e.g. a counter that can't possibly wrap around between reads), then this isn't a problem either (see the counter sketch after this list).
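For the second point, publishing a monotonically increasing sequence counter instead of a raw value makes re-reads harmless and lets the consumer notice skipped updates. A minimal sketch, assuming C++11 and glossing over how the variable is actually placed in memory shared by the two processes (shared_seq, producer_tick and consumer_poll are made-up names):

    #include <atomic>
    #include <cstdint>

    // Hypothetical 4-byte shared counter; the producer bumps it, the consumer polls it.
    std::atomic<uint32_t> shared_seq{0};

    void producer_tick() {
        // Relaxed is enough when the counter itself is the only data being communicated.
        shared_seq.fetch_add(1, std::memory_order_relaxed);
    }

    void consumer_poll(uint32_t &last_seen) {
        uint32_t now = shared_seq.load(std::memory_order_relaxed);
        if (now != last_seen) {
            uint32_t missed = now - last_seen - 1;  // updates we never observed
            (void)missed;                           // a real consumer might log or compensate
            last_seen = now;
        }
        // If now == last_seen there's nothing new; reading the same value twice is harmless.
    }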
But if the consumer is just waiting for new values, you should probably use a lockless queue (in a fixed-size circular buffer). That lets the reader sleep if it empties the queue, or the producer block if it fills the queue. It also lets the consumer process a batch of items without the producer having to wake up to hand them over one at a time. It can be implemented very efficiently for the single-producer single-consumer case, and quite efficiently even for the multi-writer multi-reader case (where writers have to avoid racing with each other so stores end up in some kind of total order).
You should just look for a library implementation of a lockless / lock-free queue, unless you're really sure you want to have a reader spinning on a value, waiting for it to change.
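If you do decide to roll your own for the single-producer single-consumer case, the classic shape is a fixed-size ring buffer with a head index owned by the producer and a tail index owned by the consumer. A rough sketch, not a drop-in implementation (capacity must be a power of two; sleeping/blocking when full or empty is left out):

    #include <atomic>
    #include <cstddef>

    template <typename T, size_t N>
    class SpscQueue {                   // single-producer, single-consumer only
        static_assert((N & (N - 1)) == 0, "N must be a power of two");
        T buf_[N];
        std::atomic<size_t> head_{0};   // written only by the producer
        std::atomic<size_t> tail_{0};   // written only by the consumer
    public:
        bool push(const T &item) {      // call from the producer side only
            size_t head = head_.load(std::memory_order_relaxed);
            size_t tail = tail_.load(std::memory_order_acquire);
            if (head - tail == N) return false;                // full
            buf_[head & (N - 1)] = item;
            head_.store(head + 1, std::memory_order_release);  // publish the element
            return true;
        }
        bool pop(T &out) {              // call from the consumer side only
            size_t tail = tail_.load(std::memory_order_relaxed);
            size_t head = head_.load(std::memory_order_acquire);
            if (head == tail) return false;                    // empty
            out = buf_[tail & (N - 1)];
            tail_.store(tail + 1, std::memory_order_release);  // free the slot
            return true;
        }
    };

A tested library implementation such as boost::lockfree::spsc_queue does the same job with far more scrutiny behind it.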
Normally this write-only / read-only pattern would only be used for something like a timestamp or "current value" of something. The readers aren't trying to see every update, they just read it when they need that value.
On mainstream modern CPU architectures, aligned word load / store instructions are atomic, so you won't see "tearing" (a mix of bytes from two different stores). For the x86 details, see Why is integer assignment on a naturally aligned variable atomic on x86?.
If you're writing in asm, obviously you have to know the details of how the machine works.
In C or C++, you of course need to use an _Atomic or std::atomic variable, otherwise your program has a data race, which is Undefined Behaviour. It might happen to work, or the load of the shared variable might get hoisted out of the loop. Using myvar.store(newval, std::memory_order_relaxed) makes your loads/stores pretty much exactly as efficient as regular integer assignment. I.e. if you had code which happened to work with just int, then using std::atomic<int> with memory_order_relaxed shouldn't slow it down at all, and may not even change the compiler's asm output. (But it guarantees correctness with different surrounding code or optimization options!)
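As a concrete sketch of that, with the 4-byte location modelled as a std::atomic<uint32_t> global (how it gets mapped into memory shared by both processes is omitted, and the names are illustrative):

    #include <atomic>
    #include <cstdint>

    std::atomic<uint32_t> shared_value{0};   // the 4-byte shared location

    void writer(uint32_t newval) {
        // On mainstream CPUs this compiles to an ordinary 32-bit store.
        shared_value.store(newval, std::memory_order_relaxed);
    }

    uint32_t reader() {
        // An ordinary 32-bit load; the atomic type just stops the compiler
        // from hoisting it out of a polling loop or otherwise breaking it.
        return shared_value.load(std::memory_order_relaxed);
    }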
Following the ISO C++11 rules correctly will make this producer-consumer pattern work on any conforming C++ implementation. On some obscure platform where data races are a problem for the hardware (not just the C++ optimizer), your atomic variable will use a mutex instead of being lock-free.
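If you want to confirm you're not on such a platform, the atomic type can tell you; this sketch assumes C++17 for is_always_lock_free (the run-time is_lock_free() member exists since C++11):

    #include <atomic>
    #include <cstdint>

    static_assert(std::atomic<uint32_t>::is_always_lock_free,
                  "4-byte atomics are not lock-free on this target");

    bool runtime_check() {
        std::atomic<uint32_t> probe{0};
        return probe.is_lock_free();   // true when no hidden lock is used
    }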

It depends on the platform. Without knowing the platform, there's no way we can even imagine how it could fail. Whether or not a mutex or a semaphore is required depends on whether the platform's documentation says they're required. You cannot tell by experimentation -- walking across a street without looking both ways and not getting hit by a car doesn't show that it's safe. The next CPU, next compiler, next system library, or even the next system upgrade might cause it to fail, or it might fail only very rarely.
