
I am trying to get the concept of "atomic vs. non-atomic" settled in my mind. My first problem is that I could not find a real-life analogy for it, like a customer/restaurant relationship for atomic operations or something similar.

I would also like to learn how atomic operations fit into thread-safe programming.

In this blog post, http://preshing.com/20130618/atomic-vs-non-atomic-operations/, it is described as:

An operation acting on shared memory is atomic if it completes in a single step relative to other threads. When an atomic store is performed on a shared variable, no other thread can observe the modification half-complete. When an atomic load is performed on a shared variable, it reads the entire value as it appeared at a single moment in time. Non-atomic loads and stores do not make those guarantees.

What is the meaning of "no other thread can observe the modification half-complete"?

Does that mean a thread will wait until the atomic operation is done? How does that thread know the operation is atomic? For example, in .NET I can understand that if you lock an object, you set a flag to block other threads. But what about atomics? How do other threads know the difference between atomic and non-atomic operations?

Also, if the above statement is true, are all atomic operations thread-safe?

Teoman shipahi

5 Answers


Let's clarify a bit what is atomic and what blocking is. Atomicity means that an operation either executes fully, with all its side effects visible, or does not execute at all; so all other threads can see either the state before the operation or the state after it. A block of code guarded by a mutex is atomic too, we just don't call it an operation. Atomic operations proper are special CPU instructions which are conceptually similar to an ordinary operation guarded by a mutex (you know what a mutex is, so I'll use it, despite the fact that mutexes themselves are implemented using atomic operations). A CPU has only a limited set of operations it can execute atomically, but thanks to hardware support they are very fast.

When we discuss thread blocking we usually bring mutexes into the conversation, because code guarded by them can take quite some time to execute; so we say that a thread waits on a mutex. For atomic operations the situation is the same, but they are fast and we usually don't care about the delays, so it is unlikely you will hear the words "block" and "atomic operation" together.

Does that mean a thread will wait until the atomic operation is done?

Yes, it will wait. The CPU will restrict access to the block of memory where the variable is located, and other CPU cores will wait. Note that for performance reasons such a lock is held only for the duration of the atomic operation itself; CPU cores are still allowed to cache variables for reading.

How that thread know about that operation is atomic?

Special CPU instructions are used. It is simply written in your program that a particular operation should be performed atomically.

Additional information:

There are trickier aspects to atomic operations. For example, on modern CPUs reads and writes of aligned primitive types are usually atomic anyway. But the CPU and the compiler are allowed to reorder them. So it is possible that you change some struct, then set a flag telling other threads that it has changed, but the CPU reorders the writes and sets the flag before the struct is actually committed to memory. When you use atomic operations, additional effort is usually made to prevent such undesired reordering. If you want to know more, read about memory barriers.

Simple atomic loads and stores are not that useful on their own. To get the most out of atomic operations you need something more complex. The most common primitive is CAS, compare-and-swap: you compare the variable with an expected value and change it only if the comparison succeeds.

Alexey Guseynov

On typical modern CPUs, atomic operations are made atomic this way:

When an instruction is issued that accesses memory, the core's logic attempts to put the core's cache in the correct state to access that memory. Typically, this state will be achieved before the memory access has to happen, so there is no delay.

While another core is performing an atomic operation on a chunk of memory, it locks that memory in its own cache. This prevents any other core from acquiring the right to access that memory until the atomic operation completes.

Unless two cores happen to be performing accesses to many of the same areas of memory and many of those accesses are writes, this typically won't involve any delays at all. That's because the atomic operation is very fast and typically the core knows in advance what memory it will need access to.

So, say a chunk of memory was last accessed on core 1 and now core 2 wants to do an atomic increment. When the core's prefetch logic sees the modification to that memory in the instruction stream, it will direct the cache to acquire that memory. The cache will use the intercore bus to take ownership of that region of memory from core 1's cache and it will lock that region in its own cache.

At this point, if another core tries to read or modify that region of memory, it will be unable to acquire that region in its cache until the lock is released. This communication takes place on the bus that connects the caches and precisely where it takes place depends on which cache(s) the memory was in. (If not in cache at all, then it has to go to main memory.)

A cache lock is not normally described as blocking a thread both because it is so fast and because the core is usually able to do other things while it's trying to acquire the memory region that is locked in the other cache. From the point of view of the higher-level code, the implementation of atomics is typically considered an implementation detail.

All atomic operations provide the guarantee that an intermediate result will not be seen. That's what makes them atomic.

David Schwartz
  • Hi @David Schwartz, excellent response. Now, considering context switches: the other thread (the one that is going to start) could modify the same variable as the previous thread (the one that was paused), and there the operation would not be atomic, so it would need the mutex to make the operations atomic even if they are still interrupted by context switches, right? Thanks –  Nov 04 '22 at 15:53
  • @Daniel A context switch cannot occur during an atomic operation because atomic operations are atomic. On some CPUs, a context switch can occur just before the atomic part of an atomic operation starts, and in that case it's possible that the (pre)fetch logic might have to fetch and lock the data in the cache a second time. – David Schwartz Nov 04 '22 at 16:25
  • Hi @David Schwartz, thanks, +1. "On some CPUs, a context switch can occur just before the atomic part of an atomic operation starts" – but on those CPUs that don't work that way, you still need the mutex to get a critical section and thus an atomic operation regardless of context switching (context switching is inevitable as far as I know), right? –  Nov 04 '22 at 17:46
  • @Daniel No. Context switching is only permitted at defined points, and the CPU has to ensure that a context switch cannot cause an atomic operation not to be atomic. If, for example, a CPU allows a context switch after a read but before a write in an atomic operation, it will discard the result of the read and repeat the read when the context switches back. – David Schwartz Nov 05 '22 at 01:22
  • Thanks @David Schwartz, I was misunderstanding the concept of atomicity in this regard. Last question: so a critical section is NOT an atomic operation, right? +1 and marked as a useful answer. –  Nov 05 '22 at 01:35
  • @Daniel The operation of acquiring a critical section generally has to be atomic to prevent two threads from thinking they've both acquired it. And you can describe the code inside the critical section as atomic with respect to other threads that need to acquire the critical section. But people use the word "atomic" in a number of different ways, and it's not really atomic in quite the same way we talk about atomic instructions. – David Schwartz Nov 05 '22 at 04:04
  • Hi @David Schwartz, I understand, thank you. +1 –  Nov 05 '22 at 17:53

The atomic operations you describe are instructions within the processor and the hardware will make sure that a read cannot happen on a memory location until the atomic write is complete. This guarantees that a thread either reads the value before write or the value after the write operation, but nothing in-between - there's no chance of reading half of the bytes of the value from before the write and the other half from after the write.

Code running on the processor is not even aware of this blocking, but it's really no different from using a lock statement to make sure that a more complex operation (made up of many low-level instructions) is atomic.

A single atomic operation is always thread-safe - the hardware guarantees that the effect of the operation is atomic - it'll never get interrupted in the middle.

A set of atomic operations is not atomic in the vast majority of cases (I'm not an expert, so I don't want to make a definitive statement, but I can't think of a case where this would be different). This is why locking is needed for complex operations: the entire operation may be made up of multiple atomic instructions, yet it can still be interrupted between any two of those instructions, creating the possibility of another thread seeing half-baked results. Locking ensures that other code operating on the shared data cannot access it until the whole operation completes (possibly across several thread switches).

Some examples are shown in this question / answer, but you can find many more by searching.

xxbbcc

Being "atomic" is an attribute that applies to an operation which is enforced by the implementation (either the hardware or the compiler, generally speaking). For a real-life analogy, look to systems requiring transactions, such as bank accounts. A transfer from one account to another involves a withdrawal from one account and a deposit to another, but generally these should be performed atomically - there is no time when the money has been withdrawn but not yet deposited, or vice versa.

So, continuing the analogy for your question:

What is the meaning of "no other thread can observe the modification half-complete"?

This means that no thread could observe the two accounts in a state where the withdrawal had been made from one account but it had not been deposited in another.

In machine terms, it means that an atomic read of a value in one thread will not see a value with some bits from before an atomic write by another thread and some bits from after the same write. Operations more complex than a single read or write can also be atomic: for instance, "compare and swap" is a commonly implemented atomic operation that checks the value of a variable against an expected value and replaces it with a new value only if they were equal, all atomically. So if the comparison succeeds, it is not possible for another thread to write a different value in between the compare and the swap parts of the operation; any write by another thread will be performed either wholly before or wholly after the atomic compare-and-swap.

The title to your question is:

Will atomic operations block other threads?

In the usual meaning of "block", the answer is no; an atomic operation in one thread won't by itself cause execution to stop in another thread, although it may cause a livelock situation or otherwise prevent progress.

Does that mean a thread will wait until the atomic operation is done?

Conceptually, it means that they will never need to wait. The operation is either done, or not done; it is never halfway done. In practice, atomic operations can be implemented using mutexes, at a significant performance cost. Many (if not most) modern processors support various atomic primitives at the hardware level.

Also, if the above statement is true, are all atomic operations thread-safe?

If you compose atomic operations, the composition is no longer atomic. That is, I can do one atomic compare-and-swap operation followed by another, and the two compare-and-swaps will individually be atomic, but the sequence is divisible. Thus you can still have concurrency errors.

davmac
  • Thanks. I need some clarification regarding your last statement. If it does not block, then halfway through a transaction, how will the second thread know to come back and complete it? I assume there is no spin-wait under an atomic operation unless we specify one. If the first thread is depositing atomically and the second one is trying to read the balance while the atomic operation is halfway done, will the second one return the previous value or wait until the operation completes? If the second case is true, how is it notified to come back and read the updated value? – Teoman shipahi Sep 30 '16 at 16:35
  • @Teomanshipahi You need to understand that atomic operations are really very small: CAS, increment, store, load. The thread context is NEVER switched while one is executing. When we speak about their thread safety, we are actually speaking about threads being executed on different CPU cores. – Alexey Guseynov Sep 30 '16 at 17:06
  • And there is no atomic operation "block me until somebody else notifies me". You just modify memory atomically, and that's all the interaction with other threads that is available. – Alexey Guseynov Sep 30 '16 at 17:20
  • @Teomanshipahi "Regarding your last statement, if it does not block, then halfway through a transaction, how will the second thread know to come back and complete it?" – I'm a bit confused as to what you mean by this. The point is that with an atomic operation, the "half-way point" can never be observed by another thread, at all. This can be enforced in various ways, but typically it's done at the processor level. – davmac Sep 30 '16 at 17:33
  • I was looking up that phrase online, and yes, it makes more sense now. – Teoman shipahi Sep 30 '16 at 17:34

An atomic operation means the system performs the operation in its entirety or not at all. Reading or writing an int64 is atomic (on a 64-bit system with the 64-bit CLR) because the system reads/writes the 8 bytes in one single operation; readers do not see half of the new value and half of the old value being stored. But be careful:

long n = 0;   // writing 'n' is atomic (64-bit OS & 64-bit CLR)
long m = n;   // reading 'n' is atomic
// ... some code
long o = n++; // NOT atomic: n++ is a read followed by a write, two separate operations

To make n++ atomic you can use the Interlocked API:

long o = Interlocked.Increment(ref n); // atomic read-modify-write; other threads' accesses to 'n' wait while it runs
Antonio