InterviewQ: How do you code a mutex?

Question

I'm unfortunately out of a job and have been interviewing around lately. I faced this same question twice now, and was lost both times I was asked this question.

"How do you code a mutex"?

Conceptually I understand a mutex locks a certain part of code so multiple threads can not enter the critical section at the same time, eliminating data races. The first time I was asked to conceptually describe how I would code it, the second time I was asked to code it. I've been googling and haven't found any answers... can anyone help?

Thanks.

"You don't code a mutex yourself. You use the one in the standard library, because there are details involved that only the standard library can handle correctly." If that's not an acceptable answer to this question, and the job description isn't implementing the standard library, you probably don't want to work there. — zwol, Feb 12 '17 at 19:39
Do you mean how would you write something like std::mutex, or how would you use it? — , Feb 12 '17 at 19:40
https://stackoverflow.com/questions/5095781/how-pthread-mutex-lock-is-implemented — Josh Lee, Feb 12 '17 at 19:40
In general you can't code an efficient mutex by yourself. Typically you would need an operating system support to do that. The primitives that are provided by programming libraries are a wrappers for low level system specific APIs. — ciechowoj, Feb 12 '17 at 19:42
Anyone have an idea what sort of response they were looking for then? I feel like the fact that two separate companies asked me the same question means they expect a better answer than "nah, just use stl". — ZtoYi, Feb 12 '17 at 19:49
The charitable explanation is that they were looking for "this is something you _need_ to let the runtime do for you" - they want to be assured that you understand that there are such things and this is one of them. The less charitable explanation is that they are bad at interviewing. Unfortunately, many companies are very bad indeed at interviewing. — zwol, Feb 12 '17 at 19:51
Maybe they are looking for something like this: https://en.wikipedia.org/wiki/Test-and-set#Mutual_Exclusion_using_test-and-set compare_and_swap can also be used to implement a mutex: https://en.wikipedia.org/wiki/Compare-and-swap http://en.cppreference.com/w/cpp/atomic/atomic/compare_exchange — xandoid, Feb 12 '17 at 20:24

score 0 · Answer 1 · answered Feb 12 '17 at 20:49

There's lots of ways to implement a mutex lock, but it typically starts with the basic premise that the cpu architecture offers some concept of atomic add and atomic subtract. That is, an addition operation can be done to an integer variable in memory (and return the result) without being corrupted by another thread attempting to access same memory location. Or at the very least, "atomic increment" and "atomic decrement".

On modern Intel chips, for example, there's an instruction called XADD. When combined with the LOCK prefix it executes atomically and invalidates cached values across other cores. gcc implements a wrapper for this instruction called __sync_add_and_fetch. Win32 implements a similar function called InterlockedIncrement. Both are just calling LOCK XADD under the hood. Other CPU architectures should offer something similar.

So the most basic mutex lock could be implemented something like this. This is often called a "spin" lock. And this cheap version offers no ability to recursively enter the lock.

// A correct, but poorly performant mutex implementation

void EnterLock(int* lock)
{

    while (true)
    {
        int result = LOCK_XADD(lock,1); // increment the value in lock and return the result atomically

        if (result == 1)
        {
           // if the value in lock was successfully incremented 
           // from 0 to 1 by this thread. It means this thread "acquired" the lock
           return;
        }

        LOCK XADD(lock,-1); // we didn't get the lock - decrement it atmoically back to what it was
        sleep(0); // give the thread quantum back before trying again
    }

}

void LeaveLock(int* lock)
{
    LOCK XADD(lock,-1); // release the lock. Assumes we successfully acquired it correctly with EnterLock above
}

The above suffers from poor performance of "spinning" and doesn't guarantee any fairness. A higher priority thread could continue to win the EnterLock battle over a lower priority thread. And the programmer could make a mistake and call LeaveLock with with a thread that did not previously call EnterLock. You could expand the above to operate on a data structure that not only includes the lock integer, but also has record keeping for the owner thread id and a recursion count.

The second concept for implementing a mutex is that the operating system can offer a wait and notify service such that a thread doesn't have to spin until the owner thread has released it. The thread or process waiting on lock can register itself with the OS to be put to sleep until the owner thread has released it. In OS terms, this is called a semaphore. Additionally, the OS level semaphore can also be used to implement locks across different processes and for the cases where the CPU doesn't offer an atomic add. And can be used to guaranteed fairness between multiple threads trying to acquire the lock.

Most implementations will try spinning for multiple attempts before falling back to making a system call.

score 0 · Answer 2 · answered Feb 12 '17 at 21:09

I wouldn't say that this is a stupid question. On any level of abstraction for the position. On the high level you just say, you use standard library, or any threading library. If you apply for a position as the compiler developer you need to understand how it acutally works and what is needed for the implementation.

To implement a mutex, you need a locking mechanism, that is you need to have a resource that can be marked as taken across all threads. This is not trivial. You need to remember that two cores share memory, but they have caches. This piece of information must be guaranteed to be actual. So you do need support for hardware to ensure atomicity.

If you take at implementation of clang, they offload (at least in once case) implementation to pthreads, typedefs in threading support:

#if defined(_LIBCPP_HAS_THREAD_API_PTHREAD)
# include <pthread.h>
# include <sched.h>
#elif defined(_LIBCPP_HAS_THREAD_API_WIN32)
#include <Windows.h>
#include <process.h>
#include <fibersapi.h>
#endif

And if you dig through pthreads repo, you can find asm implementations of the interlocking operations. They rely on the lock asm keyword which make the operations atomic, i.e. no other thread can execute them at the same time. This eliminates racing conditions, and guarantees coherency.

Based on this, you can build a lock, which you can use for a mutex implementation.

InterviewQ: How do you code a mutex?

2 Answers2