The folly implementation of hazard pointers can be simplified as follows (assuming the asymmetric memory barrier on Linux is used):

Atomic<T*> source_ptr;

1. writer/updater: (retire operation)
    old_ptr = source_ptr.load();
    source_ptr.exchange(....);
    compiler barrier;====================
    move old_ptr to retirement list

2. consumer/reader: (try_protect operation)
    ptr = source_ptr.load();
    hazard_ptr = ptr;    (publish ptr in this thread's hazard pointer)
    compiler barrier;====================
    if (source_ptr.load() == ptr) {
         success; start to use ptr;
    } else {
         fail; clear hazard_ptr and retry to get source_ptr;
    }

3. reclaim:
   read the whole retirement list;
   heavy memory barrier --> membarrier() call; ===============
   read all hazard pointers;
   reclaim (delete) each ptr that is in the retirement list but not among the hazard pointers.
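
The three operations above can be sketched in portable C++ roughly as follows. This is a minimal illustration, not folly's code: `Node`, `hazard_slot`, `retired` and all function names are hypothetical, there is only one hazard pointer slot, the retirement list is assumed to be touched by a single thread, and a full `std::atomic_thread_fence` stands in for both the light and the heavy side of the asymmetric barrier.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

struct Node { int value; };

std::atomic<Node*> source_ptr{nullptr};   // the shared pointer being protected
std::atomic<Node*> hazard_slot{nullptr};  // hypothetical: one hazard pointer for one reader
std::vector<Node*> retired;               // retirement list (single updater/reclaimer assumed)

// ASSUMPTION: full fences replace both sides of the asymmetric barrier,
// which keeps the sketch portable but loses the cheap reader path.
inline void light_barrier() { std::atomic_thread_fence(std::memory_order_seq_cst); }
inline void heavy_barrier() { std::atomic_thread_fence(std::memory_order_seq_cst); }

// 1. writer/updater: swap in a new object and retire the old one.
void replace_and_retire(Node* fresh) {
    assert(fresh != nullptr);
    Node* old_ptr = source_ptr.exchange(fresh, std::memory_order_acq_rel);
    light_barrier();
    if (old_ptr != nullptr) retired.push_back(old_ptr);
}

// 2. consumer/reader: publish the candidate, then re-check the source.
// On failure, ptr is updated to the value that was just observed.
bool try_protect(Node*& ptr) {
    hazard_slot.store(ptr, std::memory_order_release);
    light_barrier();
    Node* current = source_ptr.load(std::memory_order_acquire);
    if (current == ptr) return true;
    hazard_slot.store(nullptr, std::memory_order_release);
    ptr = current;
    return false;
}

// Retry loop around try_protect.
Node* protect() {
    Node* ptr = source_ptr.load(std::memory_order_relaxed);
    while (!try_protect(ptr)) {}
    return ptr;
}

void clear_protection() { hazard_slot.store(nullptr, std::memory_order_release); }

// 3. reclaim: free retired objects that no hazard pointer refers to.
// Returns the number of objects deleted.
std::size_t reclaim() {
    std::vector<Node*> candidates;
    candidates.swap(retired);             // read the whole retirement list
    heavy_barrier();
    Node* in_use = hazard_slot.load(std::memory_order_acquire);
    std::size_t freed = 0;
    for (Node* p : candidates) {
        if (p == in_use) {
            retired.push_back(p);         // still protected, keep for later
        } else {
            delete p;
            ++freed;
        }
    }
    return freed;
}
```

In a single-threaded walk-through, an object that is protected before `replace_and_retire` survives a call to `reclaim()`, and is freed by the next `reclaim()` after `clear_protection()`.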

membarrier() in 3 could synchronize with 1 and 2 above, but there's no synchronization between 1 and 2 themselves.
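
For reference, the two sides of such an asymmetric barrier on Linux might look roughly like the sketch below. It is an assumption-laden illustration rather than folly's actual code: the names `asymmetric_light`, `asymmetric_heavy` and `membarrier_call` are made up here, and it assumes kernel headers that define `MEMBARRIER_CMD_PRIVATE_EXPEDITED` (Linux 4.14 or later).

```cpp
#include <atomic>
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

// Light side: only stops the compiler from reordering; emits no CPU fence.
inline void asymmetric_light() {
    std::atomic_signal_fence(std::memory_order_seq_cst);
}

// Thin wrapper: glibc has historically not exposed membarrier() directly.
inline long membarrier_call(int cmd) {
    return syscall(__NR_membarrier, cmd, 0, 0);
}

// Heavy side: asks the kernel to run a full memory barrier on every CPU
// that is currently running a thread of this process.
// Returns true if membarrier() was actually used.
inline bool asymmetric_heavy() {
    // The private expedited command must be registered once per process.
    static const bool registered =
        membarrier_call(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED) == 0;
    if (registered &&
        membarrier_call(MEMBARRIER_CMD_PRIVATE_EXPEDITED) == 0) {
        return true;
    }
    // NOTE: this fallback is NOT a valid partner for a compiler-only light
    // barrier; if it is taken, the light side needs a real fence as well.
    std::atomic_thread_fence(std::memory_order_seq_cst);
    return false;
}
```

The design trades a very cheap barrier on the frequent paths (1 and 2) for an expensive system call on the rare path (3).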

So I am wondering if the following could happen:

source ptr: the pointer that can be protected by a hazard pointer
thread 1: the updater thread that replaces the source ptr and retires the old value
thread 2: the consumer thread that tries to read the source ptr
thread 3: the thread that does the memory reclaim (frees memory from the retirement list)

At start:
source ptr value = PTR_A

At time 0:
thread 1 (updater): 
changes the source ptr value from PTR_A to PTR_B and puts PTR_A in the retirement list (retire operation).
Let's say these results are not yet visible to the consumer thread (thread 2) but are visible to thread 3 (the reclaim thread),
because there's only the light barrier.

At time 1:
thread 3 (reclaim thread): 
reads the retirement list and finds PTR_A in it.
It then issues the heavy barrier, i.e. the 'membarrier()' call.
Say it first sends the IPI to thread 2's CPU, and that CPU finishes handling it.

At time 2:
thread 2 (consumer thread):
reads the old value (PTR_A) from the source ptr (the new value is not visible yet)
and calls 'try_protect' to protect the pointer.
But the hazard pointer is not visible to other CPUs yet, again because there's only a compiler barrier. Thread 2 starts to use PTR_A.

At time 3:
the heavy barrier issued by thread 3 at time 1 now reaches thread 1's CPU and completes.
Thread 3 then starts to collect the hazard pointer list,
but the hazard pointer set by thread 2 at time 2 is not visible to thread 3 yet, so thread 3 doesn't see it.
Thread 3 therefore sees PTR_A in the retirement list but not in the hazard pointer list, which means it would go ahead and delete PTR_A while thread 2 is still using it.

Is this a bug, or am I missing something?

Adrian Mole
sheeper