
I am reading The Art of Multiprocessor Programming, 2nd edition, and I noticed that the following pattern is used to read several Atomic* fields:

while (true) {
    var v1 = atomic1.get();
    var v2 = atomic2.get();
    ...
    var vN = atomicN.get();
    if (v1 == atomic1.get()) {
        // do work
    }
}

What is the purpose of this construction?


The only explanation I found in the book is:

... checks that the values read are consistent ...

I don't understand this explanation.


Here is LockFreeQueue, which uses this pattern, from the book:

public class LockFreeQueue<T> {

  AtomicReference<Node<T>> head, tail;

  public LockFreeQueue() {
    Node<T> node = new Node<>(null);
    head = new AtomicReference<>(node);
    tail = new AtomicReference<>(node);
  }

  public void enq(T value) {
    Node<T> node = new Node<>(value);
    while (true) {
      Node<T> last = tail.get();
      Node<T> next = last.next.get();
      if (last == tail.get()) {   // <=== WHY: reading tail.get() again
        if (next == null) {
          if (last.next.compareAndSet(next, node)) {
            tail.compareAndSet(last, node);
            return;
          }
        } else {
          tail.compareAndSet(last, next);
        }
      }
    }
  }

  public T deq() throws EmptyException {
    while (true) {
      Node<T> first = head.get();
      Node<T> last = tail.get();
      Node<T> next = first.next.get();
      if (first == head.get()) {  // <=== WHY: reading head.get() again
        if (first == last) {
          if (next == null) {
            throw new EmptyException();
          }
          tail.compareAndSet(last, next);
        } else {
          T value = next.value;
          if (head.compareAndSet(first, next))
            return value;
        }
      }
    }
  }
}

public class Node<T> {

  public T value;
  public AtomicReference<Node<T>> next;

  public Node(T value) {
    this.value = value;
    next = new AtomicReference<Node<T>>(null);
  }
}

I saw another similar question on SO: Lock-free queue algorithm, repeated reads for consistency.
But:

  1. the accepted answer has a negative score and claims everything would work without the repeated reads, but offers no proof
  2. it discusses a different algorithm: that algorithm frees nodes explicitly, while the book is mostly about algorithms in Java (where nodes are freed implicitly by the GC)

UPD: the book says that LockFreeQueue is a slightly simplified version of a queue algorithm by Maged Michael and Michael Scott.
This is the same algorithm as the one discussed in the similar SO question mentioned above.

  • From looking at just the first code block, and the "consistent" comment, this looks like the same idea as a seqlock: [Implementing 64 bit atomic counter with 32 bit atomics](https://stackoverflow.com/q/54611003) / [Which of these implementations of seqlock are correct?](https://stackoverflow.com/q/56419723) / https://en.wikipedia.org/wiki/Seqlock - detect whether you've gotten a consistent snapshot of the whole thing without any intervening writes. – Peter Cordes Jun 02 '21 at 03:18

2 Answers


I think that the general idea is that writers will update the fields in a given order, and that the value of the first field will always be changed for each "update". So if a reader sees that the first field didn't change on the second read, then it knows that it has read a consistent set of values (snapshot) for all of the fields.
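This idea can be sketched with a hypothetical two-field holder (my illustration, not code from the book or this answer). It assumes, as the answer does, that every update installs a fresh value into the first field; a reader that sees the first field unchanged across its reads knows no update slipped in between, so the pair it read is a consistent snapshot:

```java
import java.util.concurrent.atomic.AtomicLong;

public class Main {
    static final AtomicLong first = new AtomicLong();
    static final AtomicLong second = new AtomicLong();

    // Writer: updates the fields in a fixed order, and (by assumption)
    // always writes a fresh value into `first`.
    static void write(long a, long b) {
        first.set(a);
        second.set(b);
    }

    // Reader: retries until `first` is unchanged across the whole read,
    // i.e. no update touched `first` between the two reads of it.
    static long[] read() {
        while (true) {
            long v1 = first.get();
            long v2 = second.get();
            if (v1 == first.get()) {
                return new long[]{v1, v2};  // consistent snapshot (v1, v2)
            }
        }
    }

    public static void main(String[] args) {
        write(1, 2);
        long[] snap = read();
        System.out.println(snap[0] + "," + snap[1]);  // prints 1,2
    }
}
```

A full seqlock additionally keeps an explicit version counter, which removes the "first field always changes" assumption.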

Stephen C

Thank you Peter Cordes and Stephen C for your answers.
I think I understand now, and below is my attempt to explain this in detail:

Turns out LockFreeQueue is a simplified version of the queue algorithm by Maged Michael and Michael Scott.

In the original algorithm the repeated reads are indeed used to read a consistent set of values (snapshot) for all of the fields:

To obtain consistent values of various pointers we rely on sequences of reads that re-check earlier values to be sure they haven't changed. These sequences of reads are similar to, but simpler than, the snapshots of Prakash et al. (we need to check only one shared variable rather than two).

The simplified LockFreeQueue actually works correctly without the repeated reads: as far as I can tell, all the safety properties mentioned in the paper still hold even if the repeated reads are removed.
The repeated reads may, however, provide better performance.
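To illustrate this claim, here is my variant of the simplified queue with both re-checks removed (a sketch, not the book's code; deq() returns null instead of throwing EmptyException to keep it self-contained). The CAS operations still validate `last` and `first`, so a stale read only costs a failed CAS and another loop iteration; the re-check mainly avoids attempting a CAS that is doomed to fail:

```java
import java.util.concurrent.atomic.AtomicReference;

public class Main {
    static class Node<T> {
        final T value;
        final AtomicReference<Node<T>> next = new AtomicReference<>(null);
        Node(T value) { this.value = value; }
    }

    static class NoRecheckQueue<T> {
        final AtomicReference<Node<T>> head, tail;

        NoRecheckQueue() {
            Node<T> sentinel = new Node<>(null);
            head = new AtomicReference<>(sentinel);
            tail = new AtomicReference<>(sentinel);
        }

        void enq(T value) {
            Node<T> node = new Node<>(value);
            while (true) {
                Node<T> last = tail.get();
                Node<T> next = last.next.get();
                // no `last == tail.get()` re-check here
                if (next == null) {
                    if (last.next.compareAndSet(null, node)) {
                        tail.compareAndSet(last, node);
                        return;
                    }
                } else {
                    tail.compareAndSet(last, next);  // help advance a lagging tail
                }
            }
        }

        T deq() {
            while (true) {
                Node<T> first = head.get();
                Node<T> last = tail.get();
                Node<T> next = first.next.get();
                // no `first == head.get()` re-check here
                if (first == last) {
                    if (next == null) return null;   // empty (the book throws here)
                    tail.compareAndSet(last, next);
                } else {
                    T value = next.value;
                    if (head.compareAndSet(first, next))
                        return value;
                }
            }
        }
    }

    public static void main(String[] args) {
        NoRecheckQueue<Integer> q = new NoRecheckQueue<>();
        q.enq(1); q.enq(2); q.enq(3);
        System.out.println("" + q.deq() + q.deq() + q.deq());  // FIFO: prints 123
    }
}
```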

The original algorithm, however, also needs the repeated reads for correctness (i.e. safety).
This is mainly because the algorithm reuses Node objects removed from the queue.
A Java version of the full algorithm, LockFreeQueueRecycle<T>, is given later in the book (it uses AtomicStampedReference instead of AtomicReference):

/* 1  */ public T deq() throws EmptyException {
/* 2  */     int[] lastStamp = new int[1];
/* 3  */     int[] firstStamp = new int[1];
/* 4  */     int[] nextStamp = new int[1];
/* 5  */     while (true) {
/* 6  */         Node first = head.get(firstStamp);
/* 7  */         Node last = tail.get(lastStamp);
/* 8  */         Node next = first.next.get(nextStamp);
/* 9  */         if (head.getStamp() == firstStamp[0]) {
/* 10 */             if (first == last) {
/* 11 */                 if (next == null) {
/* 12 */                     throw new EmptyException();
/* 13 */                 }
/* 14 */                 tail.compareAndSet(last, next,
/* 15 */                        lastStamp[0], lastStamp[0]+1);
/* 16 */             } else {
/* 17 */                 T value = next.value;
/* 18 */                 if (head.compareAndSet(first, next, firstStamp[0],
                                    firstStamp[0]+1)) {
/* 19 */                     free(first);
/* 20 */                     return value;
/* 21 */                 }
/* 22 */             }
/* 23 */         }
/* 24 */     }
/* 25 */ }

Here free(first) on line 19 makes the Node object available for reuse.

The repeated read head.getStamp() == firstStamp[0] on line 9 lets us read consistent values of head, tail and head.next.
The fact that head.getStamp() hasn't changed means that head hasn't changed ⇒ no nodes were removed from the queue ⇒ tail and head.next still point to correct (not yet recycled) nodes.
Without the check on line 9, incorrect behavior is possible. Imagine that right after line 7:

  1. we have first == last and first.next != null
  2. the current thread is paused by the OS
  3. another thread executes deq() several times until the first node is recycled; during recycling first.next is set to null
  4. the current thread is resumed by the OS
  5. line 8: next is read as null, a stale value from the reused first node
  6. line 9: the check is skipped
  7. line 10: first == last is true
  8. line 11: next == null is true
  9. line 12: EmptyException is thrown incorrectly (even though the queue was never empty during this execution of deq())
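The key step of this interleaving can be simulated single-threadedly (my sketch: the free() helper and the free list here are hypothetical, the paper's actual recycling scheme differs in its details). A "paused" thread already holds a reference to `first`; the node is then recycled, which clears its `next` field, so the stale read of `first.next` returns null even though the node had a successor when it was still in the queue:

```java
import java.util.ArrayDeque;
import java.util.concurrent.atomic.AtomicReference;

public class Main {
    static class Node {
        final AtomicReference<Node> next = new AtomicReference<>(null);
    }

    // Hypothetical free(): clear the node's fields and put it on a free list
    // so a later enq() can reuse it.
    static final ArrayDeque<Node> freeList = new ArrayDeque<>();
    static void free(Node node) {
        node.next.set(null);   // recycling clears next
        freeList.push(node);
    }

    public static void main(String[] args) {
        Node first = new Node();
        Node successor = new Node();
        first.next.set(successor);   // first has a successor: queue is non-empty

        // The "paused" thread already read `first`; meanwhile another thread
        // dequeues the node and recycles it:
        free(first);

        // Resumed thread, line 8 of deq(): next == null although the queue
        // was never empty. Without the stamp re-check on line 9 it would
        // incorrectly throw EmptyException.
        System.out.println(first.next.get() == null);  // prints true
    }
}
```

With the stamp re-check, the recycling bumps head's stamp, the check on line 9 fails, and the loop retries with fresh reads instead of trusting the stale null.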

Another example of an incorrect behavior is shown in this answer.