12

Can I perform arithmetic operations on an atomic variable directly?

The C standard library provides utility functions such as atomic_fetch_add for adding a non-atomic value to an atomic variable. But since the variable is already atomic, can I apply arithmetic operators to it directly, as in the code shown below?

#include <threads.h>
#include <stdio.h>
#include <stdatomic.h>
atomic_int i = 0;
int run(void* v) {
  i += 100;  // <- is this operation thread-safe?
  // atomic_fetch_add(&i, 100);
  printf("%d\n", i);
  return thrd_success;
}
int main(void) {
  thrd_t thread;  
  thrd_create(&thread, run, NULL);
  thrd_join(thread, NULL);
  return 0; 
} 
Peter Cordes
Jason Yu

2 Answers

11

Compound assignments to variables with atomic types are explicitly allowed in C2011. Quoting N1570 §6.5.16.2p3:

A compound assignment of the form E1 op= E2 is equivalent to the simple assignment expression E1 = E1 op (E2), except that the lvalue E1 is evaluated only once, and with respect to an indeterminately-sequenced function call, the operation of a compound assignment is a single evaluation. If E1 has an atomic type, compound assignment is a read-modify-write operation with memory_order_seq_cst memory order semantics. [footnote 113]

Emphasis mine. Footnote 113 goes on to give a specific example of how E1 op= E2 can be translated into <stdatomic.h> primitive operations when E1 is an atomic type.
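To give a feel for the kind of translation that footnote describes, here is a minimal sketch of a compound multiplication written as a compare-exchange loop. The helper name `atomic_mul_sketch` is hypothetical (`<stdatomic.h>` has no fetch-and-multiply function), and this is only an illustration of the idea, not a claim about what any particular compiler emits for `x *= val`.

```c
#include <stdatomic.h>

/* Hypothetical helper: behaves like `*addr *= val` on an atomic int and
   returns the new value, as the compound assignment expression would. */
int atomic_mul_sketch(atomic_int *addr, int val) {
    int old = atomic_load(addr);
    int desired;
    do {
        desired = old * val;
        /* If *addr no longer equals old, the call fails, refreshes old
           with the current value, and the loop recomputes desired. */
    } while (!atomic_compare_exchange_strong(addr, &old, desired));
    return desired;
}
```

The loop only publishes `desired` if no other thread changed the object between the read and the write, which is what makes the whole update a single read-modify-write.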

The other operators that I'm confident one may apply to atomic types are postfix ++ and --, which are also guaranteed to perform an atomic read-modify-write (§6.5.2.4), and simple assignment, which is guaranteed to perform an atomic load or store as appropriate (§6.2.6.1).
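A small sketch of what those operators correspond to in terms of `<stdatomic.h>` calls, assuming an `atomic_int`. The helper names are made up for illustration.

```c
#include <stdatomic.h>

/* Hypothetical helper: postfix ++ is one atomic read-modify-write and
   yields the value from before the increment. */
int post_increment(atomic_int *p) {
    return (*p)++;
}

/* Hypothetical helper: the same operation spelled with the library call,
   which also returns the previous value. */
int post_increment_explicit(atomic_int *p) {
    return atomic_fetch_add(p, 1);
}

/* Hypothetical helper: simple assignment is an atomic store, and reading
   the object back is a separate atomic load. */
int store_then_load(atomic_int *p, int v) {
    *p = v;      /* atomic store, memory_order_seq_cst */
    return *p;   /* atomic load, memory_order_seq_cst */
}
```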

Caution: Prefix ++ and -- are not guaranteed to perform an atomic read-modify-write (compare 6.5.2.4 with 6.5.3.1).

I read this sentence of 6.2.6.1

Loads and stores of objects with atomic types are done with memory_order_seq_cst semantics.

as implying that it is valid to use an atomic lvalue as an operand to most other operators, and that an atomic load is performed, after which the value is not treated specially. Don't quote me on this part.

C++ may have different rules. I will leave exegesis of the C++ standard to someone else.

zwol
  • 1
    C++ doesn't have an `_Atomic` keyword; it only has a `std::atomic` class which provides operator overloads. So its compound-assignment operators are just a library implementation detail, not a special case of a first-class language feature for the compiler itself. (C++ prefix `++` and `--` [are atomic operations](https://en.cppreference.com/w/cpp/atomic/atomic/operator_arith), defined as `x.fetch_add(1)+1` or `x.fetch_sub(1)-1`; I'm very surprised they're not guaranteed to be like that in C!) – Peter Cordes Jan 03 '22 at 19:46
  • 1
    GCC and Clang implement prefix ++ atomically as I expected: https://godbolt.org/z/j9M1fGTPz with no warnings. So even if ISO C chooses not to guarantee things about pre-inc/dec, those implementations do seem to. – Peter Cordes Jan 03 '22 at 19:53
  • 3
    [6.5.3.1](http://port70.net/~nsz/c/c11/n1570.html#6.5.3.1) says *The expression `++E` is equivalent to `(E+=1)`*, with no caveat about atomics. So **prefix `++`/`--` actually are guaranteed to be atomic RMWs**, because compound-assignment operations like `E += 1` are guaranteed per §6.5.16.2p3 which you quoted at the top of your answer. So `++` isn't a booby-trap in ISO C after all. – Peter Cordes Jan 03 '22 at 20:17
  • @PeterCordes The drafters of the C standard _could have_ put in an explicit "this is an atomic RMW" guarantee for prefix ++, as they did for postfix ++. They chose not to. That means I do not think it is safe to rely on the equivalence you mention to supply an implicit guarantee. – zwol Jan 04 '22 at 05:24
  • IMO that would just have been clutter. The way they define it shows that it's just source-level syntactic sugar that's exactly equivalent to existing syntax. Just like how they don't have to repeat all the pointer-math rules for `[]`, they can just say that `x[y]` is `*(x+y)`. (I didn't check the `[]` section to compare how that's worded, I'm confident enough in my reading of the `++` part, especially when it's supported by sanity, and working in practice.) The reason post-fix `++` is different is I assume because it gives you something you can't get as directly with other syntax. – Peter Cordes Jan 04 '22 at 07:45
6

The i += 100; statement maintains the atomicity of i, because i is used there only as an lvalue expression. From cppreference (bolding mine):

Built-in increment and decrement operators and compound assignment are read-modify-write atomic operations with total sequentially consistent ordering (as if using memory_order_seq_cst). If less strict synchronization semantics are desired, the standard library functions may be used instead.

The example given on the linked page uses the built-in (pre-)increment operation on the acnt variable, but it could just as well have used a compound assignment, as your code does.
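As for the "less strict synchronization" remark in that quote, here is a minimal sketch of the difference, using a hypothetical counter named `hits`. Both functions perform an atomic read-modify-write; only the ordering guarantees differ.

```c
#include <stdatomic.h>

/* Hypothetical event counter, for illustration only. */
static atomic_int hits = 0;

void count_hit_seq_cst(void) {
    hits += 1;   /* built-in operator: always memory_order_seq_cst */
}

void count_hit_relaxed(void) {
    /* Still an atomic read-modify-write, but it imposes no ordering on
       the surrounding memory operations. */
    atomic_fetch_add_explicit(&hits, 1, memory_order_relaxed);
}

int read_hits(void) {
    return atomic_load_explicit(&hits, memory_order_relaxed);
}
```

The built-in operators give no way to pick a memory order, so a weaker order has to go through the `_explicit` library functions.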

However, more complex arithmetic operations may cause the i variable to lose its atomicity, if it is not used strictly as an lvalue expression. From the same page:

Atomic properties are only meaningful for lvalue expressions. Lvalue-to-rvalue conversion (which models a memory read from an atomic location to a CPU register) strips atomicity along with other qualifiers.
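To make the distinction concrete, here is a sketch of what the two spellings amount to when written out with `<stdatomic.h>` calls. The function names are hypothetical; the point is only that one form is two atomic operations and the other is one.

```c
#include <stdatomic.h>

/* Roughly what `i = i + 100` amounts to: two separate atomic operations. */
void add_100_split(atomic_int *i) {
    int tmp = atomic_load(i);     /* lvalue-to-rvalue conversion: atomic load */
    /* Another thread may modify *i here; the store below would then
       overwrite that update. */
    atomic_store(i, tmp + 100);   /* simple assignment: atomic store */
}

/* Roughly what `i += 100` amounts to: one indivisible read-modify-write. */
void add_100_rmw(atomic_int *i) {
    atomic_fetch_add(i, 100);
}
```

In a single thread both give the same result. With concurrent writers, only the second is guaranteed not to lose updates.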

Adrian Mole
  • Presumably, using `i = i + 100;` will break the atomicity, because `i` is being used in both **lvalue** and **rvalue** expressions (although many/most compilers may optimize that to the equivalent of your code). – Adrian Mole Jan 03 '22 at 05:05
  • I actually tried several common compilers and *none* of them compile `i = i + 100` into an atomic RMW. – Nate Eldredge Jan 03 '22 at 05:40
  • @Nate - Interesting. I wonder why not? Does that 'optimization' break some form of the "as if" rule? – Adrian Mole Jan 03 '22 at 05:43
  • 1
    I'm not sure the RMW is actually an optimization. On x86 it is, only because the compiler already emits `xchg` for the store to get the sequential consistency barrier. But on, say, ARM64, I think it would be a deoptimization both for size and speed; `i = i + 100` needs an acquire load and release store, and to get atomic RMW you have to make them exclusive and add a retry loop. – Nate Eldredge Jan 03 '22 at 05:59
  • Amplifying (and nitpicking, sorry) @NateEldredge's comment: (1) x86 has LOCK XADD, etc., so (2) I am surprised that an x86 compiler would use XCHG in an atomic add to memory, since that requires a retry loop, as do most RISCs with their LL/SC type code sequences, which may be atomic if they succeed, but which are not guaranteed to succeed in many cases. I.e. I would be very worried about you using those in a nuclear reactor. – Krazy Glew Jan 03 '22 at 17:36
  • @KrazyGlew: Compilers *don't* use `xchg` as part of an atomic add to memory; they use `lock xadd` if the old value is used (https://godbolt.org/z/j9M1fGTPz), otherwise `lock add` or `lock inc`. But `i = i + 100` isn't an atomic add to memory, and [compilers don't optimize atomics](https://stackoverflow.com/questions/45960387/why-dont-compilers-merge-redundant-stdatomic-writes). – Peter Cordes Jan 03 '22 at 20:02
  • Instead, they keep it as written: an atomic pure-load with seq_cst semantics, and a separate atomic pure-store with seq_cst semantics. `xchg` is more efficient than `mov + mfence` to implement that. [Why does a std::atomic store with sequential consistency use XCHG?](https://stackoverflow.com/q/49107683). @Nate is right: *if* compilers were optimizing across atomic ops, they might as well turn `i = i + 100` into `i += 100` when compiling for x86 because of the seq_cst default, but yeah, ARM64 would do better as written, with just LDAR and STLR, not needing an LL/SC retry loop. – Peter Cordes Jan 03 '22 at 20:03
  • @PeterCordes: Right, I didn't mean that a compiler would use `xchg` in a loop for atomic add. The compilers I checked do `mov eax, [i] ; add eax, 100 ; xchg [i], eax` where the `xchg` is just used as a store, replacing `mov [i], eax ; mfence`. So since `xchg` and `lock add` are both atomic RMW, I would expect `lock add [i], 100` to be no worse. – Nate Eldredge Jan 04 '22 at 15:25
  • @NateEldredge: perhaps you intended that reply for @KrazyGlew; I know how either version will compile :P. But yeah, I'd also expect `lock add` to be no worse than load / `add` reg / `xchg`. It might keep the cache line locked/pinned for an extra cycle or something, but similar number of uops, so even for the hot-in-cache case (where cache-miss overhead doesn't dominate everything) it's probably about the same. – Peter Cordes Jan 04 '22 at 21:55
  • My bad, I was thinking compare exchange spinloop rather than exchange. BTW IMHO intel should have allowed the LOCK prefix on the ordinary stores, to make them semantically strongly ordered. But that’s water under the bridge. – Krazy Glew Jan 05 '22 at 01:58