3

Please see the following code brought from http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4507.pdf, page 12:

using namespace std::experimental::parallel;
std::atomic<int> x = 0;
int a[] = {1,2};
for_each(par, std::begin(a), std::end(a), [&](int n) {
  x.fetch_add(1, std::memory_order_relaxed);
  // spin wait for another iteration to change the value of x
  while (x.load(std::memory_order_relaxed) == 1) { }
});

There is a comment attached in this example:

The above example depends on the order of execution of the iterations, and is therefore undefined (may deadlock).

But I can't see why it may lead to a deadlock. As I understood, although the memory order is specified as std::memory_order_relaxed, this is only about the timing a thread sees some changes made by another thread, so eventually (after a possibly unbounded amount of time) that changes should be anyway noticed by the reading thread.

Can someone explain? Thank you!

Junekey Jeon
  • 1,496
  • 1
  • 11
  • 18
  • I believe possibly because that *another thread* might be waiting for *this* thread (assuming even *this* thread is using `std::memory_order_relaxed`) for some changes. Afterall, that is how a deadlock happens. – abhishek_naik Mar 13 '17 at 00:57
  • 2
    if any two thread fetches and adds before the spinwait, it is 2 now so all stuck at spinwait for 1 – huseyin tugrul buyukisik Mar 13 '17 at 00:59
  • @huseyintugrulbuyukisik If it is 2 now then all should not stuck, since they wait while it is 1, not 2. – Junekey Jeon Mar 13 '17 at 01:03
  • @abhishek_naik Could you explain in more detail? Maybe in an answer form, not a comment if long? The threads are reading new value while they are waiting, so at some point they should see the changes they made, as I understood. – Junekey Jeon Mar 13 '17 at 01:07
  • @JunekeyJeon it just updates atomically but may not read atomically(just some cached 1 value) – huseyin tugrul buyukisik Mar 13 '17 at 01:09

3 Answers3

3

The reason the example code can deadlock has nothing to do with the use of memory_order_relaxed.

A change to an atomic variable will become visible to another core and if it does not (according to the standard it should become visible), it is not related to memory orderering which is only used to specify how other memory operations are ordered with respect to the atomic operation.

The example given in the document the link refers to can deadlock because apparently it is not guaranteed that the execution is truly concurrent. In a later draft (N4640), the text has been revised:

... depends on the order of execution of the iterations, and will not terminate if both iterations are executed sequentially on the same thread of execution.

And that is what this is all about; If both iterations are executed sequentially, the first keeps spinning on a value that will never change.

LWimsey
  • 6,189
  • 2
  • 25
  • 53
2

The threads are reading new value while they are waiting, so at some point they should see the changes they made, as I understood.

No, it doesn't.

Relaxed memory order does not mean "things from other threads may eventually be ordered before this read." It means "things from other threads are not ordered before this read." That is, any other writes may never become visible.

Hence UB.

Now on practical systems, it is entirely possible that the relaxed memory order (particularly for lock-free atomics) may still make other threads' modifications visible. But as far as the standard is concerned, this is UB. And an overzealous compiler could indeed compile your code into a hard while(true);, never bothering to read from the variable because it knows that it cannot see changes from other threads.

Nicol Bolas
  • 449,505
  • 63
  • 781
  • 982
  • Very clear, thank you! And just another curiosity... What if I qualify the variable as ````volatile````? Can you explain what happens then? It may prevent the compiler to make the code into ````while(true)```` right? But theoretically anyway deadlock is possible, right? Thank you! – Junekey Jeon Mar 13 '17 at 01:27
  • 2
    @JunekeyJeon: No. `volatile` doesn't help. [`volatile` *never helps*. Not when it comes to threading.](http://stackoverflow.com/a/12878500/734069) It would be just as much UB with `volatile` as without. – Nicol Bolas Mar 13 '17 at 01:40
  • Umm, what I expect from adding ````volatile```` is just to prevent an "overzealous compiler" you mentioned make the code into ````while (true)````. Even that is not true? I know ````volatile```` has nothing to do with threads. – Junekey Jeon Mar 13 '17 at 01:55
  • Anyway, reading your link helped me a lot. It was very informative! – Junekey Jeon Mar 13 '17 at 01:56
  • @NicolBolas Where in the standard does it say that dealing with relaxed atomics this way is UB ? – LWimsey Mar 13 '17 at 19:55
1
x.fetch_add

makes the increment atomic but

std::memory_order_relaxed

isn't a sync point for threads so any two threads can have 1 at the same time. Because read-modify-write starts without synchronization, getting not a desired x value.

huseyin tugrul buyukisik
  • 11,469
  • 4
  • 45
  • 97