
I recently came across the following code while learning about Reentrant Locks in Lock-Free Concurrency:

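#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>

// PAUSE() is not defined in this snippet; it is presumably a platform
// spin-wait hint (for example _mm_pause() on x86).

class ReentrantLock32
{
  std::atomic<std::size_t> m_atomic; // hash of the owning thread's id, 0 when unlocked
  std::int32_t m_refCount;           // recursion depth, only touched by the owner

public:
  ReentrantLock32() : m_atomic(0), m_refCount(0) {}

  void Acquire()
  {
    std::hash<std::thread::id> hasher;
    std::size_t tid = hasher(std::this_thread::get_id());

    // If this thread does not already hold the lock, spin until it does.
    if (m_atomic.load(std::memory_order_relaxed) != tid)
    {
      std::size_t unlockValue = 0;
      while (!m_atomic.compare_exchange_weak(
        unlockValue,
        tid,
        std::memory_order_relaxed,
        std::memory_order_relaxed))
      {
        unlockValue = 0; // a failed CAS overwrites the expected value
        PAUSE();
      }
    }
    ++m_refCount;
    // Acquire fence: later reads and writes cannot move above the loads before it.
    std::atomic_thread_fence(std::memory_order_acquire);
  }

  void Release()
  {
    // Release fence: earlier reads and writes cannot move below the stores after it.
    std::atomic_thread_fence(std::memory_order_release);
    std::hash<std::thread::id> hasher;
    std::size_t tid = hasher(std::this_thread::get_id());
    std::size_t actual = m_atomic.load(std::memory_order_relaxed);
    assert(actual == tid);

    --m_refCount;
    if (m_refCount == 0)
    {
      m_atomic.store(0, std::memory_order_relaxed);
    }
  }
//...
};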

However, I'm unsure whether the release fence prevents later memory operations in the thread from being reordered before it, and whether the acquire fence prevents earlier memory operations from being reordered after it. If they don't, wouldn't it technically be possible for an optimisation to cause the line

   if (m_refCount == 0)

to be followed by a complete and successful call to Acquire() on the same thread before the call to

     m_atomic.store(0,std::memory_order_relaxed);

in which case the valid incrementation in the reordered Acquire() call would be overwritten by the delayed store() call?

When analyzing this code, it also occurred to me that there might be stale-data issues that lead to duplicate locks, which is asked about here.

There is also another related question to clarify the potential order of memory operations for release fence calls here.

Josh Hardman
1 Answer


That can't happen.

The situation you mention takes place within a thread. A variable is always sequentially consistent with itself within a thread. (Otherwise it would be impossible to program.)

If, for example, m_atomic.store(0,std::memory_order_relaxed); is stuck in the store buffer, the CPU knows to look there for the load in this line: if (m_atomic.load(std::memory_order_relaxed) != tid)

Regardless of how relaxed a variable is, within a thread, optimizations aren't allowed to change the semantics of the source code. Atomics only exist to provide visibility ordering guarantees to other threads.
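To make the same-thread argument concrete, here is a minimal sketch rather than the lock itself. The names g_owner and release_then_check are hypothetical stand-ins: g_owner plays the role of m_atomic, and the function mimics the end of Release() followed immediately by the check at the top of Acquire() on one thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the lock's owner word (m_atomic in the question).
std::atomic<std::size_t> g_owner{0};

// Mimics the tail of Release() followed by the head of Acquire() on the
// same thread. Returns true if Acquire() would see that this thread no
// longer owns the lock and would therefore take the CAS path.
bool release_then_check(std::size_t tid)
{
    g_owner.store(tid, std::memory_order_relaxed); // this thread holds the lock
    g_owner.store(0, std::memory_order_relaxed);   // Release(): ref count reached zero

    // The store of 0 is sequenced before this load in the same thread, so the
    // load cannot return the older value tid. It sees 0, or a value written
    // later by another thread, even though everything is relaxed.
    return g_owner.load(std::memory_order_relaxed) != tid;
}
```

Calling release_then_check with any non-zero id from a single thread returns true: the later Acquire() cannot run "ahead of" the pending store and then have its result overwritten by it.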

BTW, it's not accurate to say this:

the release fence doesn't preclude the possibility of a later operation in the thread preceding it

According to Jeff Preshing:

A release fence prevents the memory reordering of any read or write which precedes it in program order with any write which follows it in program order. https://preshing.com/20130922/acquire-and-release-fences/

This means that, in theory, C++ acquire/release atomic thread fences are somewhat stricter than the common notion of acquire/release memory barriers, which allow reordering in one direction.
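As an illustration of what the two fences buy you across threads, here is a small sketch of a hand-off that uses only relaxed atomic operations plus fences, the same combination the lock in the question uses. The names g_payload, g_ready, producer, consumer and run_handoff are made up for this example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int g_payload = 0;                 // plain, non-atomic data
std::atomic<bool> g_ready{false};  // flag, only ever accessed with relaxed operations

void producer()
{
    g_payload = 42;
    // Release fence: the write to g_payload cannot be reordered past the
    // relaxed store to g_ready that follows.
    std::atomic_thread_fence(std::memory_order_release);
    g_ready.store(true, std::memory_order_relaxed);
}

int consumer()
{
    while (!g_ready.load(std::memory_order_relaxed)) {
        std::this_thread::yield();
    }
    // Acquire fence: the read of g_payload below cannot be reordered before
    // the relaxed load of g_ready that observed true.
    std::atomic_thread_fence(std::memory_order_acquire);
    return g_payload;
}

// Runs one hand-off and returns the value the consumer observed.
int run_handoff()
{
    g_payload = 0;
    g_ready.store(false, std::memory_order_relaxed);
    int seen = -1;
    std::thread reader([&seen] { seen = consumer(); });
    std::thread writer(producer);
    writer.join();
    reader.join();
    return seen;
}
```

Because the release fence is sequenced before the store that the consumer's load reads, and that load is sequenced before the acquire fence, the two fences synchronize and the consumer is guaranteed to read 42. Dropping either fence would turn the access to g_payload into a data race.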

Humphrey Winnebago
  • Thank you for the clarification. I'm still confused about when a release fence statement in C++ becomes a guarantee and whether reordering can occur in the interim, so I've posted a separate question at the following link: https://stackoverflow.com/questions/75077371/when-exactly-does-a-release-fence-call-in-c-call-go-from-being-a-statement-to?noredirect=1#comment132489259_75077371 – Josh Hardman Jan 11 '23 at 02:01
  • The C++-level version of what you say about store-forwarding is that within a single thread, earlier statements are *sequenced before* later ones, and that's sufficient for *happens before*, i.e. for visibility of their side-effects. – Peter Cordes Jan 11 '23 at 06:21
  • Acquire / release *operations* are 1-way, but acquire/release *fences* have to be 2-way for loads or stores respectively. (LoadLoad or StoreStore, and they have LoadStore in common). So a release fence doesn't stop a later *load* preceding it, only a later store. See also https://preshing.com/20131125/acquire-and-release-fences-dont-work-the-way-youd-expect/ which goes deeper into the point about how they're formally defined in ISO C++ as 2-way barriers. – Peter Cordes Jan 11 '23 at 06:26