My understanding is that a spinlock can be implemented using C++11 atomics with an acquire test-and-set (a CAS-like read-modify-write) on lock and a release store on unlock, something like this:

    #include <atomic>

    class SpinLock {
    public:
        void Lock() {
            // Spin until test_and_set reports the flag was previously clear.
            while (l_.test_and_set(std::memory_order_acquire)) {}
        }
        void Unlock() {
            l_.clear(std::memory_order_release);
        }
    private:
        std::atomic_flag l_ = ATOMIC_FLAG_INIT;
    };

Consider its use in a function that acquires a lock and then does a blind write to some shared location:

    int g_some_int_;

    void BlindWrite(int val) {
        static SpinLock lock_;
        lock_.Lock();
        g_some_int_ = val;  // plain (non-atomic) store inside the critical section
        lock_.Unlock();
    }

I'm interested in how the compiler is restricted in translating this to generated assembly code.
I understand why the write to g_some_int_ can't migrate past the end of the critical section in the compiler's output: if it did, the write would no longer be guaranteed to be seen by the next thread that acquires the lock, and that visibility is exactly what the release/acquire ordering guarantees.
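As a sanity check on that guarantee, here is a minimal sketch (not part of the original code) in which several threads increment a plain int under the same lock. The names g_counter, g_counter_lock, and CountWithThreads are hypothetical. If a write made inside the critical section were not visible to the next thread that acquires the lock, increments would be lost and the total would come up short.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Same lock as in the question, repeated so the snippet compiles on its own.
class SpinLock {
 public:
  void Lock() {
    while (l_.test_and_set(std::memory_order_acquire)) {
    }
  }
  void Unlock() { l_.clear(std::memory_order_release); }

 private:
  std::atomic_flag l_ = ATOMIC_FLAG_INIT;
};

int g_counter = 0;        // hypothetical shared, non-atomic state
SpinLock g_counter_lock;  // hypothetical lock guarding g_counter

// Hypothetical helper: every thread performs `iterations` increments of
// g_counter, each one inside the critical section. Returns the final count.
int CountWithThreads(int num_threads, int iterations) {
  assert(num_threads > 0 && iterations >= 0);
  g_counter = 0;
  std::vector<std::thread> threads;
  for (int t = 0; t < num_threads; ++t) {
    threads.emplace_back([iterations] {
      for (int i = 0; i < iterations; ++i) {
        g_counter_lock.Lock();
        ++g_counter;  // read-modify-write of a plain int under the lock
        g_counter_lock.Unlock();
      }
    });
  }
  for (auto& th : threads) th.join();
  return g_counter;
}
```

With a working lock, CountWithThreads(4, 10000) should return 40000.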
But what prevents the compiler from moving it to before the acquire test-and-set of the lock flag? I thought that compilers were allowed to re-order writes to different memory locations when generating assembly code. Is there some special rule that prevents a write from being re-ordered to before an atomic store that precedes it in program order?
I'm looking for a language lawyer answer here, preferably covering std::atomic as well as std::atomic_flag.
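For concreteness, a std::atomic<bool> version of the same lock might look like the sketch below. The names AtomicBoolSpinLock and TryLock are made up for this illustration, and the memory orders simply mirror the std::atomic_flag version; this is an assumption about what an equivalent lock would look like, not code from the original program.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical std::atomic<bool> counterpart of SpinLock.
class AtomicBoolSpinLock {
 public:
  void Lock() {
    // exchange() is an atomic read-modify-write: it stores true and returns
    // the previous value, so the loop exits once the old value was false.
    while (locked_.exchange(true, std::memory_order_acquire)) {
    }
  }

  // Single attempt; returns true if this call acquired the lock.
  bool TryLock() { return !locked_.exchange(true, std::memory_order_acquire); }

  void Unlock() {
    // Debug-only check that the lock is currently held.
    assert(locked_.load(std::memory_order_relaxed));
    locked_.store(false, std::memory_order_release);
  }

 private:
  std::atomic<bool> locked_{false};
};
```

The shape of the question is the same for this version: the acquire exchange in Lock() plays the role of test_and_set, and the release store in Unlock() plays the role of clear.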
Edit to include something from the comments which maybe asks the question more clearly. The heart of the question is: which part of the standard says that the abstract machine must observe l_ being false before it writes to g_some_int_?
I suspect the answer is either "writes can't be lifted above potentially infinite loops" or "writes can't be lifted above atomic writes". Perhaps it's even "you're wrong that writes can ever be re-ordered at all". But I'm looking for a specific reference in the standard.