8

Looking at several videos and the documentation example, we unlock the mutex before calling notify_all(). Would it be better to call it after instead?

The common way:

Inside the Notifier thread:

//prepare data for several worker-threads;

//and now, awaken the threads:
std::unique_lock<std::mutex> lock2(sharedMutex);
_threadsCanAwaken = true;

lock2.unlock(); 
_conditionVar.notify_all(); //awaken all the worker threads;

//wait until all threads completed;

//cleanup:
_threadsCanAwaken = false;

//prepare new batches once again, etc, etc

Inside one of the worker threads:

while(true){
    // wait for the next batch:

    std::unique_lock<std::mutex> lock1(sharedMutex);
    _conditionVar.wait(lock1, [](){ return _threadsCanAwaken; });
    lock1.unlock(); //let sibling worker-threads work on their part as well

    //perform the final task

    //signal the notifier that one more thread has completed;

    //loop back and wait until the next task
}

Notice how the lock2 is unlocked before we notify the condition variable - should we instead unlock it after the notify_all() ?

Edit

From my comment below: my concern is, what if the worker spuriously wakes, sees that the mutex is unlocked, super-quickly completes the task, and loops back to the start of the while loop? Now the slow-poke Notifier finally calls notify_all(), causing the worker to loop an additional time (excessive and undesired).
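For reference, a minimal self-contained version of the pattern above (a sketch: the completion counter _completedCount and the second condition variable _allDoneVar are placeholders for the parts the real code elides, and the worker handles a single batch for brevity):

#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <thread>
#include <vector>

std::mutex sharedMutex;
std::condition_variable _conditionVar;
std::condition_variable _allDoneVar;   // placeholder: "all workers finished" signal
bool _threadsCanAwaken = false;
int _completedCount = 0;               // placeholder: completion counter

constexpr int kNumWorkers = 4;

void worker(int id)
{
    // wait for the batch:
    std::unique_lock<std::mutex> lock1(sharedMutex);
    _conditionVar.wait(lock1, []{ return _threadsCanAwaken; });
    lock1.unlock(); // let sibling worker-threads work on their part as well

    // perform the final task:
    std::printf("worker %d done\n", id);

    // signal the notifier that one more thread has completed:
    lock1.lock();
    ++_completedCount;
    _allDoneVar.notify_one();
} // lock1's destructor unlocks here

int main()
{
    std::vector<std::thread> workers;
    for (int i = 0; i < kNumWorkers; ++i)
        workers.emplace_back(worker, i);

    // prepare data, then awaken the threads:
    std::unique_lock<std::mutex> lock2(sharedMutex);
    _threadsCanAwaken = true;
    lock2.unlock();
    _conditionVar.notify_all(); // the ordering in question

    // wait until all threads have completed:
    lock2.lock();
    _allDoneVar.wait(lock2, []{ return _completedCount == kNumWorkers; });
    _threadsCanAwaken = false; // cleanup for the next batch

    for (auto& t : workers)
        t.join();
}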

Kari
  • The destructor call of `lock2` will automatically unlock the mutex after `_conditionVar.notify_all();`, so you don't need to call it explicitly at all; that's the common idiom, IIRC. BTW, don't use a leading underscore for your identifiers; such names are reserved for compiler and standard library implementations. – πάντα ῥεῖ Sep 25 '18 at 17:08
  • Thanks! My concern is: what if the consumer is super quick? The consumer spuriously wakes, sees that the mutex is unlocked, completes the task and loops back to the start of `while`. Now the slow-poke Producer finally calls `notify_all()`, causing the consumer to loop an additional time. – Kari Sep 25 '18 at 17:08

5 Answers

8

There are no advantages to unlocking the mutex before signaling the condition variable unless your implementation is unusual. There are two disadvantages to unlocking before signaling:

  1. If you unlock before you signal, the signal may wake a thread that chose to block on the condition variable after you unlocked. This can lead to a deadlock if you use the same condition variable to signal more than one logical condition. This kind of bug is hard to create, hard to diagnose, and hard to understand. It is trivially avoided by always signaling before unlocking. This ensures that the change of shared state and the signal are an atomic operation and that race conditions and deadlocks are impossible.

  2. There is a performance penalty for unlocking before signaling that is avoided by unlocking after signaling. If you signal before you unlock, a good implementation will know that your signal cannot possibly render any thread ready-to-run, because the mutex is held by the calling thread and any thread affected by the condition variable necessarily cannot make forward progress without the mutex. This permits a significant optimization (often called "wait morphing") that is not possible if you unlock first.

So signal while holding the lock unless you have some unusual reason to do otherwise.
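In code, the recommended ordering looks like this (a minimal sketch with placeholder names, not the asker's exact code):

#include <condition_variable>
#include <mutex>

std::mutex m;
std::condition_variable cv;
bool ready = false;

void notifier()
{
    {
        std::lock_guard<std::mutex> lock(m);
        ready = true;     // change the shared state...
        cv.notify_all();  // ...and signal while still holding the mutex
    }                     // mutex released here, after the signal
}

void waiter()
{
    std::unique_lock<std::mutex> lock(m);
    // no thread can slip in between the state change and the signal above
    cv.wait(lock, []{ return ready; });
}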

David Schwartz
  • This is a bit at odds with the common advice that signaling before unlocking is a pessimization. Could you elaborate and counter this claim in particular? Point 1 that you mention seems like a good safety measure, but I don't understand the performance penalty you mention in point 2. I can imagine that signal -> lock can be optimized, but what exactly is the penalty that occurs from unlocking first? – Jake1234 Jun 20 '22 at 23:36
  • If you unlock first, the unlock can make a thread ready-to-run and the signal can make a thread ready-to-run. If you unlock after signalling, the signal cannot possibly make any thread ready-to-run, because any thread blocked on the signal would still need to acquire the mutex to make forward progress and the calling thread holds the mutex. This makes the signal operation cheaper because a significant part of what it would otherwise have to do can be skipped. Essentially, the penalty is that the wait-morphing optimization cannot take place. – David Schwartz Jun 23 '22 at 18:20
  • I don't see how the deadlock you mention in 1 is avoided if you signal before unlocking. The scheduler could always decide to start another thread waiting for the same lock, even if it's not waiting on the condition variable. If your program is subject to 1, then you just have a bug, independently of whether you signal then unlock or the other way around. – Giovanni Mascellani Aug 04 '22 at 11:29
  • @GiovanniMascellani If you signal before unlocking the mutex, then the signal can only wake a thread that chose to block before it saw the change the signaling thread made to the shared state. If you unlock the mutex first, you may wake a thread that *just* chose to block on the condition variable after seeing your change, a thread that will definitely just go back to sleep. – David Schwartz Nov 02 '22 at 22:28
  • @Jake1234 I think you are right. The unlock-notify sequence does not require any special optimizations, while the notify-unlock sequence *requires* wait morphing to be efficient. And wait morphing currently does **not** exist on Linux. See my answer for more info. – Yongwei Wu Dec 02 '22 at 04:01
  • @DavidSchwartz After digesting your 2022-11-02 comment, I understood what deadlock you're talking about. Although specifying a slightly different API, POSIX correctly states that the wait call should be wrapped in a loop that checks for the condition, and waits again **only** if the condition is not satisfied (unless, for example, an error occurred, which itself may be considered part of the condition in some cases). The same recommendations are made elsewhere for resolving such a so-called deadlock (I find "sleep paralysis" a better term for condvars). – DannyNiu Jan 30 '23 at 04:23
  • @DannyNiu That doesn't fix the deadlock. You can still have a case where a thread is blocked on a condition variable, the state changes, another thread blocks on the condition variable, then the condition variable is signaled and the *second* thread wakes. The second thread just goes back to sleep and the first thread never gets woken up. That's the deadlock I'm talking about. It can only occur if you use the same condition variable with more than one predicate (for example, "queue full" and "queue empty") and if you signal rather than broadcast. – David Schwartz Jan 30 '23 at 05:54
  • @DavidSchwartz Yes, that's also the deadlock I'm referring to as sleep paralysis. One condvar must be associated with exactly one condition (it may be complex, but exactly one condition), and thus with exactly one mutex. What I'm objecting to is that signaling before or after unlock doesn't matter; it's the testing of the condition before calling wait that solves it. The definitions of condvar wait and signal and mutex unlock don't conflict with each other, and it's perfectly valid to signal two different condvars associated with one mutex (David Butenhof refers to these "big" mutexes in his book on Pthreads). – DannyNiu Jan 30 '23 at 10:35
  • @DannyNiu It is perfectly safe to associate one condition variable with more than one condition (for example, "queue full" and "queue empty") so long as you either broadcast the condition variable rather than signalling it or signal while holding the mutex. Only if you do all three things can you deadlock: 1) One c.v. for more than one condition. 2) You signal the c.v. and not broadcast. 3) You signal after releasing the mutex. You just can't do all three of those things. – David Schwartz Jan 30 '23 at 18:39
  • @DavidSchwartz Sure. But for 1), the "queue full" and "queue empty" conditions can be combined into "queue *uncontinuable* state change": the reader cannot continue when the queue is empty and the writer cannot continue when the queue is full. For 2), `signal` is a special "optimized" substitute for `broadcast`, usable when only one other thread uses the CV (that is, `broadcast` should be the default choice). For 3), I'm uncertain, but that feels a bit like cargo-cult programming; maybe you can explain how 3) avoids the problem caused by 1) and 2) by editing the answer? – DannyNiu Jan 31 '23 at 01:38
  • @DannyNiu 3) avoids the problem because there is no window in which a thread can block on the condition variable between the state change and the signal. If you change the predicate, unlock the mutex, and then signal the condition variable, you might wake a thread that went to sleep after you unlocked the mutex, and that thread will go right back to sleep, causing a missed wakeup. – David Schwartz Jan 31 '23 at 01:43
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/251508/discussion-between-dannyniu-and-david-schwartz). – DannyNiu Jan 31 '23 at 01:51
3

should we instead unlock it after the notify_all() ?

It is correct to do it either way, but you may get different behavior in different situations. It is quite difficult to predict how it will affect the performance of your program; I have seen both positive and negative effects in different applications. So it is better to profile your program and make the decision for your particular situation based on the profiling results.
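To make the two candidates concrete, these are the variants one would profile against each other (a sketch with placeholder names; both are correct, only the timing differs):

#include <condition_variable>
#include <mutex>

std::mutex m;
std::condition_variable cv;
bool ready = false;

// Variant A: notify while still holding the lock.
void notify_locked()
{
    std::lock_guard<std::mutex> lock(m);
    ready = true;
    cv.notify_all();
}

// Variant B: unlock first, then notify.
void notify_unlocked()
{
    {
        std::lock_guard<std::mutex> lock(m);
        ready = true;
    }
    cv.notify_all();
}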

Slava
  • Did I understand correctly that notify_all() releases the mutex from the currently held lock? It's not very clear to me, because `notify_all()` doesn't take our current lock as an argument (unlike `wait`). I know that our lock will be destroyed when the local scope ends, but is it unlocked immediately after `notify_all`? – Kari Sep 25 '18 at 18:39
  • No, neither `notify_all` nor `notify_one` unlocks the mutex; that is your job, either manually or through a RAII mechanism. You would just get different behavior (in terms of timing) depending on whether you unlock before or after notifying. – Slava Sep 25 '18 at 19:12
1

As mentioned on cppreference.com:

The notifying thread does not need to hold the lock on the same mutex as the one held by the waiting thread(s); in fact doing so is a pessimization, since the notified thread would immediately block again, waiting for the notifying thread to release the lock.

That said, the documentation for wait says:

At the moment of blocking the thread, the function automatically calls lck.unlock(), allowing other locked threads to continue.

Once notified (explicitly, by some other thread), the function unblocks and calls lck.lock(), leaving lck in the same state as when the function was called. Then the function returns (notice that this last mutex locking may block again the thread before returning).

So, when notified, wait will re-attempt to acquire the lock, and in that process it can get blocked again until the original notifying thread releases the lock. So I suggest releasing the lock before calling notify, as done in the example on cppreference.com, and most importantly:

Don't be Pessimistic.
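A minimal sketch of the suggested order, in the spirit of the cppreference example (names are placeholders, not the library's or the asker's):

#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>

std::mutex m;
std::condition_variable cv;
bool ready = false;

void workerThread()
{
    std::unique_lock<std::mutex> lock(m);
    cv.wait(lock, []{ return ready; });
    std::cout << "worker running\n";
}

int main()
{
    std::thread t(workerThread);
    {
        std::lock_guard<std::mutex> lock(m);
        ready = true;
    }                // release the lock first...
    cv.notify_all(); // ...then notify, so the woken thread can
                     // acquire the mutex without blocking again
    t.join();
}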

  • The statement at cppreference.com is wrong. The notified thread would *not* immediately block again because it would never become ready-to-run. Any sensible implementation would know that the thread can't make forward progress because the mutex is held by the calling thread. (See my answer.) – David Schwartz Jun 24 '22 at 23:58
  • @DavidSchwartz Except that the "sensible implementation" does not exist? (See my answer.) – Yongwei Wu Dec 02 '22 at 04:03
  • @YongweiWu I wouldn't write code around the quirks of any particular implementation, even if it were the only one I was targeting, unless there was some significant demonstrated need to do so. – David Schwartz Dec 02 '22 at 05:24
  • @DavidSchwartz I wouldn't write code around the quirks of some special corner cases. In some cases there is no reason to even lock on the notifier side. Simpler code is better, and it does not pessimize. – Yongwei Wu Dec 02 '22 at 06:22
1

David's answer seems wrong to me.

First, assuming the simple case of two threads, one waiting on a condition variable for the other, unlocking first in the notifier will not wake the waiting thread, since the signal has not arrived yet. The subsequent notify call will then immediately wake the waiting thread. No special optimizations are needed.

On the other hand, signalling first has the potential of waking up a thread only for it to go back to sleep immediately, as it cannot acquire the lock, unless wait morphing is implemented.

Wait morphing does not exist on Linux, at least according to the answer under this StackOverflow question: Which OS / platforms implement wait morphing optimization?

The cppreference example also unlocks first before signalling: https://en.cppreference.com/w/cpp/thread/condition_variable/notify_all

It explicitly says:

The notifying thread does not need to hold the lock on the same mutex as the one held by the waiting thread(s). Doing so may be a pessimization, since the notified thread would immediately block again, waiting for the notifying thread to release the lock, though some implementations recognize the pattern and do not attempt to wake up the thread that is notified under lock.

Yongwei Wu
0

should we instead unlock it after the notify_all() ?

After reading several related posts, I've formed the opinion that it's purely a performance issue. If the OS supports "wait morphing", unlock after; otherwise, unlock before.

I'm adding an answer here to augment @DavidSchwartz's. In particular, I'd like to clarify his point 1:

If you unlock before you signal, the signal may wake a thread that chose to block on the condition variable after you unlocked. This can lead to a deadlock if you use the same condition variable to signal more than one logical condition. This kind of bug is hard to create, hard to diagnose, and hard to understand. It is trivially avoided by always signaling before unlocking. This ensures that the change of shared state and the signal are an atomic operation and that race conditions and deadlocks are impossible.

  1. The first thing I said is that, because it's a CV and not a mutex, a better term for the so-called "deadlock" might be "sleep paralysis": a mistake some programs make is that a thread that's supposed to wake went back to sleep because it did not recheck the condition it had been waiting on before waiting again.

  2. The second thing is that, when waking some other thread(s),

    • the default choice should be broadcast/notify_all (broadcast is the POSIX term, which is equivalent to its C++ counterpart);

    • signal/notify_one is an optimized special case for when only one other thread is waiting.

  3. Finally, the third thing: David is adamant that

    • it's better to unlock after notifying,

    • because it avoids the "deadlock" which I've been referring to as "sleep paralysis".

If you unlock and then notify, there's a window in which another thread (call it the "wrong" thread) may (i) acquire the mutex, (ii) go into wait, and (iii) wake up. If steps (i), (ii), and (iii) happen quickly enough, the "wrong" thread consumes the signal, leaving the intended ("correct") thread asleep.

I discussed this extensively with David; he clarified that the "sleep paralysis" occurs only when all three points are violated (1. a condvar is associated with several separate conditions and/or the condition isn't rechecked before waiting again; 2. only one thread is signaled/notified when more than one other thread uses the condvar; 3. unlocking before notifying creates a window for the race condition).

Finally, my recommendation is that points 1 and 2 are essential for the correctness of the program, and fixing issues associated with 1 and 2 should be prioritized over 3, which should only be an augmentative "last resort".
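As a sketch of points 1 and 2 (a hypothetical queue example; the predicate-recheck loop is what prevents the "sleep paralysis"):

#include <condition_variable>
#include <mutex>
#include <queue>

std::mutex m;
std::condition_variable cv; // point 1: one condvar, one condition, one mutex
std::queue<int> q;

void producer(int value)
{
    {
        std::lock_guard<std::mutex> lock(m);
        q.push(value);
    }
    cv.notify_all(); // point 2: broadcast/notify_all is the safe default;
                     // signal/notify_one is an optimization for one waiter
}

int consumer()
{
    std::unique_lock<std::mutex> lock(m);
    while (q.empty())  // point 1: recheck the condition on every wake-up,
        cv.wait(lock); // spurious or not, before proceeding
    int value = q.front();
    q.pop();
    return value;
}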

For reference, the man pages for signal/broadcast and wait contain some information from version 3 of the Single Unix Specification that explains points 1 and 2, and partly point 3. Although specified for POSIX/Unix/Linux in C, its concepts are applicable to C++.

As of this writing (2023-01-31), the 2018 edition of version 4 of the Single Unix Specification has been released, and the drafting of version 5 is underway.

DannyNiu