I recently came across the following code while learning about Reentrant Locks in Lock-Free Concurrency:
class ReentrantLock32
{
    std::atomic<std::size_t> m_atomic; // hash of the owning thread's id, or 0 when unlocked
    std::int32_t m_refCount;           // recursion depth of the owning thread
public:
    ReentrantLock32() : m_atomic(0), m_refCount(0) {}
    void Acquire()
    {
        std::hash<std::thread::id> hasher;
        std::size_t tid = hasher(std::this_thread::get_id());
        // If this thread does not already hold the lock, spin until it does.
        if (m_atomic.load(std::memory_order_relaxed) != tid)
        {
            std::size_t unlockValue = 0;
            while (!m_atomic.compare_exchange_weak(
                unlockValue,
                tid,
                std::memory_order_relaxed,
                std::memory_order_relaxed))
            {
                unlockValue = 0;
                PAUSE();
            }
        }
        ++m_refCount;
        std::atomic_thread_fence(std::memory_order_acquire);
    }
    void Release()
    {
        std::atomic_thread_fence(std::memory_order_release);
        std::hash<std::thread::id> hasher;
        std::size_t tid = hasher(std::this_thread::get_id());
        std::size_t actual = m_atomic.load(std::memory_order_relaxed);
        assert(actual == tid);
        --m_refCount;
        if (m_refCount == 0)
        {
            // Last nested release: mark the lock as free.
            m_atomic.store(0, std::memory_order_relaxed);
        }
    }
    //...
};
However, I'm unsure whether the release fence prevents later memory operations in the same thread from being reordered before it, and whether the acquire fence prevents earlier memory operations from being reordered after it. If they don't, wouldn't it technically be possible for an optimisation to cause the line
    if (m_refCount == 0)
to be followed by a complete and successful call to Acquire() on the same thread before the call to
    m_atomic.store(0, std::memory_order_relaxed);
takes effect, in which case the effect of the valid, reordered Acquire() call would be overwritten by the delayed store() call?
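To make the sequence I am worried about concrete, here is a minimal single-threaded sketch. PAUSE() is not defined in the snippet I found, so a yield stands in for it, and Owner(), RefCount() and SameThreadReacquireIsConsistent() are hypothetical helpers I added for inspection only. As far as I understand, a thread always observes its own operations in program order, so I would expect this check to pass; it only pins down the scenario and cannot show what other threads might observe.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>

// Assumption: PAUSE() is undefined in the original snippet; a yield stands in for it.
#define PAUSE() std::this_thread::yield()

class ReentrantLock32
{
    std::atomic<std::size_t> m_atomic;
    std::int32_t m_refCount;
public:
    ReentrantLock32() : m_atomic(0), m_refCount(0) {}
    void Acquire()
    {
        std::size_t tid = std::hash<std::thread::id>{}(std::this_thread::get_id());
        if (m_atomic.load(std::memory_order_relaxed) != tid)
        {
            std::size_t unlockValue = 0;
            while (!m_atomic.compare_exchange_weak(
                unlockValue, tid,
                std::memory_order_relaxed, std::memory_order_relaxed))
            {
                unlockValue = 0;
                PAUSE();
            }
        }
        ++m_refCount;
        std::atomic_thread_fence(std::memory_order_acquire);
    }
    void Release()
    {
        std::atomic_thread_fence(std::memory_order_release);
        std::size_t tid = std::hash<std::thread::id>{}(std::this_thread::get_id());
        std::size_t actual = m_atomic.load(std::memory_order_relaxed);
        assert(actual == tid);
        (void)actual; // silence unused-variable warnings when NDEBUG is defined
        --m_refCount;
        if (m_refCount == 0)
        {
            m_atomic.store(0, std::memory_order_relaxed);
        }
    }
    // Hypothetical inspection helpers, not part of the original class.
    std::size_t Owner() const { return m_atomic.load(std::memory_order_relaxed); }
    std::int32_t RefCount() const { return m_refCount; }
};

// Runs Release() immediately followed by Acquire() on one thread and
// reports whether the lock state matches program order at each step.
bool SameThreadReacquireIsConsistent()
{
    const std::size_t tid = std::hash<std::thread::id>{}(std::this_thread::get_id());
    ReentrantLock32 lock;
    lock.Acquire();
    lock.Acquire();   // nested acquisition, refCount == 2
    lock.Release();
    lock.Release();   // refCount reaches 0, so store(0) runs
    if (lock.Owner() != 0 || lock.RefCount() != 0) return false;
    lock.Acquire();   // the re-acquisition in question
    const bool held = (lock.Owner() == tid) && (lock.RefCount() == 1);
    lock.Release();
    return held && lock.Owner() == 0 && lock.RefCount() == 0;
}
```

Calling SameThreadReacquireIsConsistent() from main() and asserting on its result is enough to try it out; the interesting part of my question is whether anything weaker than this holds from another thread's point of view.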
While analyzing this code, it also occurred to me that stale data might let two threads hold the lock at the same time; that is asked about in a separate question here.
There is also a related question here that aims to clarify how memory operations may be ordered around a release fence.
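For context, the one fence guarantee I am reasonably confident about is the usual message-passing pattern: writes sequenced before a release fence become visible to a thread that reads the subsequent relaxed store and then issues an acquire fence. The sketch below is my own illustration and not part of the lock; MessagePassingWithFences, payload, ready and seen are made-up names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Publishes a plain int through a relaxed flag, using a release fence in
// the producer and an acquire fence in the consumer. Returns what the
// consumer read from the payload.
int MessagePassingWithFences()
{
    int payload = 0;             // plain, non-atomic data
    std::atomic<int> ready(0);   // flag written and read with relaxed ordering
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                                         // sequenced before the fence
        std::atomic_thread_fence(std::memory_order_release);
        ready.store(1, std::memory_order_relaxed);
    });

    std::thread consumer([&] {
        while (ready.load(std::memory_order_relaxed) == 0)
        {
            std::this_thread::yield();
        }
        std::atomic_thread_fence(std::memory_order_acquire);
        seen = payload;                                       // sequenced after the fence
    });

    producer.join();
    consumer.join();
    return seen;
}
```

Here the payload write sits before the release fence and the payload read sits after the acquire fence. What I am unsure about in the lock above is the operations that fall on the other side of each fence, such as the --m_refCount and the store in Release().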