12

This article: http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf (page 12) seems to draw a distinction between a lock and a memory barrier.

I would like to know: what is the difference between a lock, a memory barrier, and a semaphore?

(While other questions might mention the difference between a lock and a synchronisation object, I found none about the difference between a lock and a memory barrier)

xcrypt
  • 3,276
  • 5
  • 39
  • 63

3 Answers

15

A memory barrier (also known as a fence) is a hardware operation which ensures the ordering of different reads and writes to the globally visible store. On a typical modern processor, memory accesses are pipelined and may occur out of order. A memory barrier ensures that such reordering does not cross the barrier. A full memory barrier ensures that all loads and stores which precede it occur before any load or store which follows it. (Many processors support partial barriers; e.g. on a Sparc, a membar #StoreStore ensures that all stores which occur before it will be visible to other processors before any store which occurs after it.)
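
As a rough illustration (not part of the original answer), here is how a release/acquire barrier pair is typically written in portable C++11; the release fence plays much the same role as the #StoreStore example above, and all the names are invented for the example:

#include <atomic>

int payload = 0;
std::atomic<int> flag{0};

void producer()
{
    payload = 42;                                          // ordinary store
    std::atomic_thread_fence(std::memory_order_release);   // stores before the fence may not be
                                                           // reordered after the store below
    flag.store(1, std::memory_order_relaxed);              // publish
}

void consumer()
{
    while (flag.load(std::memory_order_relaxed) != 1) { }  // wait until published
    std::atomic_thread_fence(std::memory_order_acquire);   // pairs with the release fence
    // payload is now guaranteed to be 42
}

Neither thread is blocked in the OS sense; the fences only constrain the order in which the memory accesses become visible.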

That's all a memory barrier does. It doesn't block the thread, or anything.

Mutexes and semaphores are higher level primitives, implemented in the operating system. A thread which requests a mutex lock will block, and have its execution suspended by the OS, until that mutex is free. The kernel code which implements a mutex will contain memory barrier instructions, but it does much more. A memory barrier instruction only stalls hardware execution (all threads on the processor) until the necessary ordering conditions have been met, which takes a microsecond or so at most; the entire processor stops for that time. When you try to lock a mutex and another thread already holds it, the OS will suspend your thread (and only your thread; the processor continues to execute other threads) until whoever holds the mutex frees it, which could be seconds, minutes or even days. (Of course, if it's more than a few hundred milliseconds, it's probably a bug.)
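
A minimal C++ sketch of that blocking behaviour (the worker function and counter are invented for the example):

#include <mutex>

std::mutex m;
int sharedCounter = 0;

void worker()
{
    // If another thread holds m, the OS suspends this thread here
    // until the owner releases the mutex.
    std::lock_guard<std::mutex> guard(m);
    ++sharedCounter;    // only one thread at a time executes this
}                       // guard's destructor unlocks the mutex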

Finally, there's not really much difference between semaphores and mutexes; a mutex can be considered a semaphore with a count of one.

James Kanze
  • 150,581
  • 18
  • 184
  • 329
  • 1
    http://stackoverflow.com/questions/10536309/release-mode-error-but-not-in-debug-mode/10537061#comment13649416_10537061 There is also someone who told me that I have to use a memory barrier so that this variable would not be cached. But I don't see how a memory barrier (which ensures correct order of execution) would help in this case. Only the use of volatile, to tell the compiler its value can change at any time, so don't optimise it away. And also semaphores/mutexes so `m_NumRunningThreads==0` won't trigger more than once. Am I correct? – xcrypt May 11 '12 at 14:31
  • 1
    @xcrypt Caching is more or less a red herring, unless you use an extended meaning of caching. Processors synchronize caches; they don't synchronize data in the read and write pipelines unless you request it. In practice, `volatile` doesn't guarantee anything, since we're concerned about reordering by the hardware, which occurs at execution time. And of course, any technique you use to introduce the memory barriers will (or should) prevent the compiler from moving the accesses around it. – James Kanze May 11 '12 at 14:41
  • 1
    While volatile is no replacement for memory barriers or locks, it can be used to keep the compiler from optimising certain parts away, according to various posts and articles (it indicates that the value of the variable can change at any time, among other things). – xcrypt May 11 '12 at 14:45
  • 1
    @xcrypt Except that the fact that the compiler doesn't optimize accesses to that one particular variable doesn't stop the hardware from reordering the loads and stores, so `volatile` is not sufficient. And anything you might do which would stop the hardware reordering will also stop the compiler from reordering, so `volatile` isn't necessary. – James Kanze May 11 '12 at 14:49
  • 1
    Except when I don't want to stop the hardware from reordering? I don't see why (in this case) it is necessary. In DCLP, for example, it is obviously necessary, but not in my problem. Also, it's not the reordering that concerns me; it's the fact that m_bIsReady won't be updated because it might get optimised away, due to the compiler thinking that its value is non-volatile. – xcrypt May 11 '12 at 14:55
  • 1
    @xcrypt: if you put a lock around m_bIsReady, then it is guaranteed to run as expected. If this affects performance, then you should first see whether the whole m_IsReady checking is needed. It might be the case that running things in a different thread without checking and waiting on a condition variable turns out to be faster. – stefaanv May 11 '12 at 15:11
  • 1
    @stefaanv it is needed because while doing those loadings in separate threads, I must render the loading screen in the main thread. When this process is completed (m_bIsReady), the program can continue to its next task. I still don't understand why I would need a lock though, and that's what's really bothering me. There's only one thread that will get there at any time, and even if multiple threads would get there, it doesn't seem to be a problem, because they will all set it to true, thus enabling the program to go to its next state. I would understand why I would need volatile, but a lock? – xcrypt May 11 '12 at 15:22
  • 1
    @stefaanv even if this process would happen out of order, in the end m_bIsReady would still be true, so memory barrier also escapes me. – xcrypt May 11 '12 at 15:23
  • No worries, if you feel it's overhead, don't do it, you can always add it if needed. – stefaanv May 11 '12 at 18:34
  • 1
    @xcrypt When would you want to stop hardware reordering? (And of course, according to Posix, it's undefined behavior anyway, if you do nothing.) – James Kanze May 14 '12 at 08:04
  • 1
    @JamesKanze for my specific case (see code in the link I gave) I don't need to stop the hardware from reordering. It is simply not necessary, because even with reordering I will get the expected result. I was able to fix my code by changing `bool m_bIsReady` into `volatile bool m_bIsReady`. Probably, this told my compiler not to optimise the while loop away. At least that's what I think after 2 days of researching the volatile keyword O.o – xcrypt May 14 '12 at 15:29
  • 1
    @xcrypt So you go from one symptom of undefined behavior to another. If you modify a variable in one thread, and access it in any way in another thread, you need something (at least according to Posix). – James Kanze May 14 '12 at 18:20
  • 1
    @JamesKanze could you link me to the place where I can find that information? Btw, I'm only accessing it from the main thread, not sure if that makes any difference. – xcrypt May 14 '12 at 20:45
  • 1
    @xcrypt It's in §4.10 of the Posix standard: "Applications shall ensure that access to any memory location by more than one thread of control (threads or processes) is restricted such that no thread of control can read or modify a memory location while another thread of control may be modifying it. Such access is restricted using functions that synchronize thread execution and also synchronize memory with respect to other threads." Use of membar, etc., aren't specified in Posix, but offer additional means of synchronizing memory, so can be used to the same effect. – James Kanze May 15 '12 at 07:26
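
A hedged C++11 sketch of the pattern debated in this comment thread; std::atomic<bool> provides both the compiler guarantee that volatile was being used for and the memory synchronization that Posix requires (the member name m_bIsReady comes from the comments, the rest of the code is invented):

#include <atomic>
#include <thread>

std::atomic<bool> m_bIsReady{false};   // instead of 'volatile bool m_bIsReady'

void loaderThread()
{
    // ... load resources in the background ...
    m_bIsReady.store(true);            // guaranteed visible to the main thread
}

void mainThread()
{
    std::thread loader(loaderThread);
    while (!m_bIsReady.load())         // never optimised away, properly synchronized
    {
        // renderLoadingScreen();      // hypothetical: keep drawing while waiting
    }
    loader.join();
    // continue to the next state
}
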
14
  • A memory barrier is a method to order memory accesses. Compilers and CPUs can reorder accesses to optimize, but in multithreaded environments this can be an issue. The main difference from the others is that threads are not stopped by a memory barrier.
  • A lock or mutex makes sure that a piece of code can only be entered by one thread at a time. Within this section, you can view the environment as single-threaded, so explicit memory barriers should not be needed (the lock itself provides them).
  • A semaphore is basically a counter that can be increased (v()) or decreased (p()). If the counter is 0, then p() blocks the thread until the counter is no longer 0. This is a way to synchronize threads, but I would prefer using mutexes or condition variables (controversial, but that's my opinion). When the initial counter is 1, the semaphore is called a binary semaphore and it is similar to a lock (see the sketch below).

A big difference between locks and semaphores is that the thread owns the lock, so no other thread should try to unlock it, while this is not the case for semaphores.
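
A small C++20 sketch of that ownership difference, assuming std::binary_semaphore is available (all names are invented for the example):

#include <mutex>
#include <semaphore>

std::mutex m;
std::binary_semaphore dataReady{0};   // starts at 0, so acquire() blocks

void mutexUser()
{
    m.lock();
    // ... critical section ...
    m.unlock();             // must be called by the thread that locked
}

void producer()
{
    // ... prepare the data ...
    dataReady.release();    // a semaphore may be released by a thread other
}                           // than the one that will acquire it

void consumer()
{
    dataReady.acquire();    // blocks until producer() signals
    // ... use the data ...
}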

stefaanv
  • 14,072
  • 2
  • 31
  • 53
  • 1
    Wait a second, I have been told a mutex was some kind of semaphore? – xcrypt May 11 '12 at 13:47
  • 1
    If you already know semaphores and mutexes are new to you, then you could regard mutexes as a sort of binary semaphore, but they are not exactly the same (see also the wikipedia links). However, if you already know mutexes and condition variables and you learn about semaphores, then you could consider a semaphore to be a condition variable with a counter and a condition that the thread should wait when trying to make the counter negative. – stefaanv May 11 '12 at 13:58
  • 1
    @xcrypt: Yes, "mutex" is short for "mutual exclusion semaphore". – Jerry Coffin May 11 '12 at 14:16
  • 1
    @JerryCoffin: I don't fully agree; even if this is the official name, it is only historical because semaphores are older, and I have also seen "mutual exclusion semaphore" used to mean binary semaphores. The main difference is that semaphores have a counter and mutexes don't, so semantically, mutexes lock, but the behavior of semaphores depends on their counter. (BTW, have you tried to use a semaphore with a condition variable instead of a mutex? I haven't, but I guess it doesn't work.) – stefaanv May 11 '12 at 14:35
  • Lock and mutex contain a memory barrier as mentioned in James Kanze's answer. This is also mentioned here https://stackoverflow.com/a/28771322/4692662 – Steve Chavez Sep 03 '22 at 22:43
2

Simple explanation for now.

Lock

Is an atomic test for whether this piece of code can proceed

lock (myObject)
{
    // Stuff to do when I acquire the lock
}

This is normally a single CPU instruction that tests and sets a variable as a single atomic operation. More here, http://en.wikipedia.org/wiki/Test-and-set#Hardware_implementation_of_test-and-set_2
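
A rough C++ sketch of a spinlock built on that test-and-set instruction; this is a teaching illustration, not a replacement for a real lock, which would involve the OS as described in the answers above:

#include <atomic>

class SpinLock
{
    std::atomic_flag flag = ATOMIC_FLAG_INIT;
public:
    void lock()
    {
        // test_and_set is the atomic test-and-set operation described above;
        // keep trying until the previous value was clear (i.e. we acquired the lock)
        while (flag.test_and_set(std::memory_order_acquire)) { }
    }
    void unlock()
    {
        flag.clear(std::memory_order_release);
    }
};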

Memory Barrier

Is an instruction telling the processor that it may not execute memory accesses out of order across the barrier. Without it, the surrounding reads and writes could be reordered; in double checked locking, for example, the store that publishes the instance could become visible before the object has been fully constructed, so the unlocked null check in another thread could see a partially constructed object.

using System.Threading;

public class Singleton
{
    private static Singleton _singletonInstance;
    private static readonly object myObject = new object();

    public static Singleton Instance()
    {
        if (_singletonInstance == null)
        {
            lock (myObject)
            {
                if (_singletonInstance == null)
                {
                    Singleton instance = new Singleton();
                    // Ensure the constructor's writes are visible before
                    // the instance becomes visible to other threads.
                    Thread.MemoryBarrier();
                    _singletonInstance = instance;
                }
            }
        }
        return _singletonInstance;
    }
}

These are also implemented as CPU instructions; a memory barrier instruction explicitly tells the CPU that it may not execute memory accesses out of order across it.
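
For comparison with the C++ paper linked in the question, a hedged C++11 sketch of the same double-checked pattern; here std::atomic supplies the required ordering instead of an explicit barrier call (the class layout is invented for the example):

#include <atomic>
#include <mutex>

class Singleton
{
    static std::atomic<Singleton*> instance;
    static std::mutex mtx;
public:
    static Singleton* Instance()
    {
        Singleton* p = instance.load(std::memory_order_acquire);   // first, unlocked check
        if (p == nullptr)
        {
            std::lock_guard<std::mutex> guard(mtx);
            p = instance.load(std::memory_order_relaxed);          // second check, under the lock
            if (p == nullptr)
            {
                p = new Singleton();
                instance.store(p, std::memory_order_release);      // publish only after construction
            }
        }
        return p;
    }
};

std::atomic<Singleton*> Singleton::instance{nullptr};
std::mutex Singleton::mtx;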

Semaphores

Are similar to locks, except that they normally allow more than one thread in at a time. I.e. if you can handle 10 concurrent disk reads, for example, you'd use a semaphore. Depending on the processor this is either its own instruction, or a test-and-set instruction with a lot more work around interrupts (e.g. on ARM).
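
A short C++20 sketch of that disk-read example, assuming std::counting_semaphore is available; the read function is hypothetical:

#include <semaphore>

std::counting_semaphore<10> diskSlots{10};   // at most 10 readers at once

void readFromDisk(const char* path)
{
    diskSlots.acquire();        // an 11th caller blocks here until a slot frees up
    // performRead(path);       // hypothetical I/O call
    diskSlots.release();
}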

M Afifi
  • 4,645
  • 2
  • 28
  • 48