4

I'm trying to better understand lock-free programming:

Suppose we have two threads in a data race:

// Thread 1
x = 1

// Thread 2
x = 2

Is there a lock-free way a third thread can know the result of the race without being able to read x?

Suppose thread 3 consumes a lock-free queue, and the code is:

// Thread 1
x = 1
queue.push(1)

// Thread 2
x = 2
queue.push(2)

Then the operations could be ordered as:

x = 1
x = 2
queue.push(1)
queue.push(2)

or

x = 1
x = 2
queue.push(2)
queue.push(1)

So having a lock-free queue alone would not suffice for thread 3 to know the value of x after the race.

Taylor
  • 5,871
  • 2
  • 30
  • 64
  • You'll have to identify which language and threading model you're working with. In some, the result would be "2, then much later, 1". Or thread 3 might see a 1, and thread 4 might see a 2. – Sneftel Jul 23 '18 at 15:24
  • @Sneftel Assume x86 or ARM assembly. – Taylor Jul 23 '18 at 15:29
  • 1
    Those two differ pretty significantly in their concurrency guarantees. In any case, though, you need to realize that there's no clearly defined "winner" of a data race, let alone one that can be determined without checking who won. – Sneftel Jul 23 '18 at 15:52
  • @Sneftel Since the architectures differ significantly, you can answer for both architectures ;-). When both threads finish, x will either have the value 1 or 2. Is that wrong? – Taylor Jul 23 '18 at 15:57
  • Yes, that's (potentially) wrong. I can suggest [this](https://homes.cs.washington.edu/~bornholt/post/memory-models.html) as a good introduction to memory concurrency models. – Sneftel Jul 23 '18 at 16:06
  • @Sneftel: There is a long-term winner, though. Observers spinning on `x` will see its value change at most twice (to `1` and then `2` or the reverse), not to 1, then 2, then back to 1. ARM and x86 both disallow hardware inventing writes like that. On x86 and ARM, I think a lock-free queue will have to use at least release/acquire to work correctly, so if thread 3 does acquire loads of *both* queue elements, then rel/acq sync means it will see the final value of `x`. i.e. as a side-effect / implementation detail, the queue makes all previous stores in thread 1 / 2 visible to thread 3. – Peter Cordes Jul 23 '18 at 16:30
  • Related: [Preventing of Out of Thin Air values with a memory barrier in C++](https://stackoverflow.com/q/51232730). (I have a half-finished edit for that which I should really finish, with more detail and stuff.) But anyway, that discusses the problem of hardware inventing writes. It's not a real thing that can happen except under mis-speculation, i.e. it doesn't matter. No real hardware does value-prediction anyway, although some ISAs are weak enough to allow it. – Peter Cordes Jul 23 '18 at 16:36
  • @Sneftel By the article you linked (it's nice, thanks!): "The single main memory guarantees that there will always be a “winner”: a single last write to each variable. Without this guarantee, after both (1) and (2) have happened, (3) could see either 1 or 2, which is confusing." So, my assertion above was correct. – Taylor Jul 23 '18 at 17:05

1 Answers1

5

If you know the value of x before the race began, the following code using atomic Read-Modify-Write operations should do the job.

// Notes:
// x == 0
// x and winner are both atomic
// atomic_swap swaps the content of two variables atomically, 
// meaning, that no other thread can interfere with this operation

//thread-1:
t = 1;
atomic_swap(x, t);
if (t != 0) {
    //x was non zero, when thread-1 called the swap operation
    //--> thread-2 was faster
    winner = 1;
}    

//thread-2
t = 2;
atomic_swap(x, t);
if (t != 0) {
    //x was non zero, when thread-2 called the swap operation
    //--> thread-1 was faster
    winner = 2;
}  

//thread-3
while (winner == 0) {} 
print("Winner is " + winner);
Domso
  • 970
  • 1
  • 10
  • 22
  • What if I don't know the initial value of x? – Taylor Jul 23 '18 at 17:02
  • @Taylor: Then this clever trick won't work, unless thread 1 and thread 2 communicate with each other to set `x` to some known value before either of them swap. Or if they simply communicate and use that synchronization to establish an order. Do you have some practical reason in mind, because just reading `x` after release/acquire sync with both writers sounds like a cheaper way to find out the final value. – Peter Cordes Jul 23 '18 at 17:08
  • @Taylor x has probably 64Bit, but maybe you just need 32; So you can use the remaining 32Bit to store additional information like the current thread-id into the variable. But as PeterCordes wrote, without a pratical problem you cant evaluate any solution. – Domso Jul 23 '18 at 17:20
  • @PeterCordes I'm afraid I can't discuss the application. I suppose I could ask a new question about determining the value without any shared memory, which I probably should have specified. – Taylor Jul 23 '18 at 17:50
  • @Taylor: Without some kind of extra channel like sending signals or messages over sockets (which is obviously much slower than shared memory, but is *not* shared memory), I don't think there's any solution other than constraining the order of doing the stores, i.e. not having a race at all. e.g. maybe thread 2 waits until it can pop the queue data that thread 1 pushed before doing its store. Or maybe the threads could xchg and then encode the result into what they push. But atomic RMW is much more expensive than atomic store. On x86, `xchg` is always a full barrier, implicit `lock` prefix. – Peter Cordes Jul 23 '18 at 18:34