Most efficient memory order to increment an integer

Question

I have N threads that all operate on a single variable of std::atomic type, like this:

#include <atomic>

std::atomic<int> Num{0};  // brace-init also compiles before C++17

void thr_func()
{
    Num.fetch_add(1);
}

Here, the default memory order is memory_order_seq_cst, which is not the most efficient. Which memory order should I use, and why, to get the most efficient code while still keeping Num in a consistent state? I googled it and found many answers recommending memory_order_relaxed, but I don't understand why something like the following cannot happen with it, because the documentation for relaxed ordering says: "there are no synchronization or ordering constraints imposed on other reads or writes".

thr1: load Num
thr2: load Num
thr1: add 1 
thr2: add 1
thr1: store Num
thr2: store Num

You use the memory order that has the properties you need. This is dictated by how you use it. As such, there is no "best"; there is only that which fits your usage patterns. Patterns that you have not explained, so we can't really help you. — Nicol Bolas, Jul 19 '22 at 14:11
fetch_add is an atomic operation, your output cannot happen regardless of memory_order. Memory order affects only how fetch_add is treated with other operations. You may want to read this: https://stackoverflow.com/questions/54639439/will-fetch-add-with-relaxed-memory-order-return-unique-values — freakish, Jul 19 '22 at 14:12
Not _exactly_ a duplicate, but there are some comprehensive examples here: https://stackoverflow.com/questions/6319146/c11-introduced-a-standardized-memory-model-what-does-it-mean-and-how-is-it-g/6319356#6319356 — Chad, Jul 19 '22 at 14:13
The code you show here only accesses one variable. The memory order has no bearing on the behavior of this code. Memory order restricts how _other_ memory is accessed, relative to your `fetch_add`. — Drew Dormann, Jul 19 '22 at 14:20

score 1 · Accepted Answer · answered Jul 19 '22 at 14:47

std::atomic types are not only about keeping their own state consistent, but also about keeping the state of the surrounding code consistent. Say, for example, that you use an atomic integer to store the number of items in an array. You will probably end up writing something like the following:

std::atomic<int> len{0};  // start with zero published items
...

array[len] = some_new_object;
len++;

In another thread you would wait for len to change and then access the newly added object. For this to work, it is crucial that the len++; statement takes effect strictly after the assignment before it. Normally both the compiler and the processor are allowed to reorder instructions, as long as the observable effect within the thread is the same (the as-if rule). For inter-thread synchronization you want to restrict this reordering, and that is exactly what the std::atomic types let you do.

With memory_order_seq_cst, for example, the expression len++, being a read-modify-write operation, does not allow the memory accesses around it to be reordered across it. If you used memory_order_relaxed, which places no restriction on the surrounding accesses, another thread could observe the incremented len before the write in array[len] = some_new_object; has become visible. That is obviously not what you want in the example above.
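The array example can be sketched as a small, compilable function. This is only an illustration under stated assumptions: the name publish_and_consume, the fixed-size items array, and the int payload are hypothetical and not part of the original snippet. It uses the default memory_order_seq_cst on both the increment and the load.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: one producer publishes a value, one consumer reads it.
int publish_and_consume(int value)
{
    int items[8] = {};        // plain, non-atomic storage (assumed capacity)
    std::atomic<int> len{0};  // number of published items
    int seen = 0;

    std::thread producer([&] {
        const int slot = len.load();
        items[slot] = value;  // 1. write the payload
        len++;                // 2. publish it (seq_cst read-modify-write)
    });

    std::thread consumer([&] {
        // Default seq_cst load; spin until the producer has published.
        while (len.load() == 0) {
            std::this_thread::yield();
        }
        // The increment synchronizes with the load above, so the write
        // to items[0] is visible here.
        seen = items[0];
    });

    producer.join();
    consumer.join();
    return seen;
}
```

Calling publish_and_consume(42) returns 42. If both atomic operations were changed to memory_order_relaxed, the read of items[0] would become a data race, because nothing would order it after the producer's write.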

So to conclude, in the example that you provided in your question, you might as well use memory_order_relaxed (the atomicity of the operation is still guaranteed and the output you depicted will not happen). But as soon as you use the std::atomic variable to actually signal some state between the threads, you should use memory_order_seq_cst (which is the default one for good reason).
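For the counter from the question, a minimal sketch of the relaxed variant might look like the following. The helper name count_relaxed and its parameters are assumptions made for illustration. Each fetch_add is still a single indivisible read-modify-write, so no increment is lost even though no ordering is requested.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: starts `threads` threads that each perform
// `iters` relaxed increments, then returns the final counter value.
int count_relaxed(int threads, int iters)
{
    std::atomic<int> num{0};
    std::vector<std::thread> pool;
    pool.reserve(threads);

    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&num, iters] {
            for (int i = 0; i < iters; ++i) {
                // Atomic increment; relaxed only drops ordering with
                // respect to other memory, not the atomicity itself.
                num.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }

    // join() synchronizes with each thread's completion, so the final
    // load below sees every increment.
    for (auto& th : pool) {
        th.join();
    }
    return num.load();
}
```

With 4 threads and 10000 iterations each, count_relaxed(4, 10000) returns 40000, which is the interleaving-free result the question asks about.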
