There's lots of ways to implement a mutex lock, but it typically starts with the basic premise that the cpu architecture offers some concept of atomic add and atomic subtract. That is, an addition operation can be done to an integer variable in memory (and return the result) without being corrupted by another thread attempting to access same memory location. Or at the very least, "atomic increment" and "atomic decrement".
On modern Intel chips, for example, there's an instruction called XADD
. When combined with the LOCK
prefix it executes atomically and invalidates cached values across other cores. gcc implements a wrapper for this instruction called __sync_add_and_fetch
. Win32 implements a similar function called InterlockedIncrement
. Both are just calling LOCK XADD
under the hood. Other CPU architectures should offer something similar.
So the most basic mutex lock could be implemented something like this. This is often called a "spin" lock. And this cheap version offers no ability to recursively enter the lock.
// A correct, but poorly performant mutex implementation
void EnterLock(int* lock)
{
while (true)
{
int result = LOCK_XADD(lock,1); // increment the value in lock and return the result atomically
if (result == 1)
{
// if the value in lock was successfully incremented
// from 0 to 1 by this thread. It means this thread "acquired" the lock
return;
}
LOCK XADD(lock,-1); // we didn't get the lock - decrement it atmoically back to what it was
sleep(0); // give the thread quantum back before trying again
}
}
void LeaveLock(int* lock)
{
LOCK XADD(lock,-1); // release the lock. Assumes we successfully acquired it correctly with EnterLock above
}
The above suffers from poor performance of "spinning" and doesn't guarantee any fairness. A higher priority thread could continue to win the EnterLock battle over a lower priority thread. And the programmer could make a mistake and call LeaveLock with with a thread that did not previously call EnterLock. You could expand the above to operate on a data structure that not only includes the lock integer, but also has record keeping for the owner thread id and a recursion count.
The second concept for implementing a mutex is that the operating system can offer a wait and notify service such that a thread doesn't have to spin until the owner thread has released it. The thread or process waiting on lock can register itself with the OS to be put to sleep until the owner thread has released it. In OS terms, this is called a semaphore. Additionally, the OS level semaphore can also be used to implement locks across different processes and for the cases where the CPU doesn't offer an atomic add. And can be used to guaranteed fairness between multiple threads trying to acquire the lock.
Most implementations will try spinning for multiple attempts before falling back to making a system call.