Why does a condition variable re-lock the mutex after being notified?

The following piece of code deadlocks if the unique_lock is not scoped or if the mutex is not explicitly unlocked:

#include <chrono>
#include <condition_variable>
#include <future>
#include <iostream>
#include <mutex>
#include <thread>

using namespace std;

int main()
{
    std::mutex mtx;
    std::condition_variable cv;

    //simulate another working thread sending notification
    auto as = std::async([&cv](){   std::this_thread::sleep_for(std::chrono::seconds(2));
                                    cv.notify_all();});

    //uncomment scoping (or unlock below) to prevent deadlock 
    //{

    std::unique_lock<std::mutex> lk(mtx);

    //Spurious wake-up prevention not addressed in this short sample
    //UNLESS it is part of the answer / reason to lock again
    cv.wait(lk);

    //}

    std::cout << "CV notified\n" << std::flush;

    //uncomment unlock (or scoping  above) to prevent deadlock 
    //mtx.unlock();

    mtx.lock();
    //do something
    mtx.unlock();

    std::cout << "End may never be reached\n" << std::flush;

    return 0;
}

Even after re-reading some documentation and examples, I still do not find this obvious.

Most examples that can be found over the net are small code samples that have inherent scoping of the unique_lock.

Should we use different mutexes for critical sections (mutex 1) and for condition variable wait and notify (mutex 2)?

Note: Debugging shows that after the end of the waiting phase, the "internal" "mutex count" (I think the __count field of struct __pthread_mutex_s) goes from 1 to 2. It goes back to 0 after unlock.

NGI
  • https://stackoverflow.com/questions/2763714/why-do-pthreads-condition-variable-functions-require-a-mutex?rq=1 – Zan Lynx Jan 24 '20 at 00:36

3 Answers

You're trying to lock the mutex twice: once with the unique_lock and again with the explicit mtx.lock() call. Locking a std::mutex that the calling thread already owns is undefined behavior; in practice, a non-recursive mutex typically deadlocks on the re-lock attempt, which tells you that you have a bug.

std::unique_lock<std::mutex> lk(mtx);   // This locks for the lifetime of the unique_lock object

cv.wait(lk);  // this will unlock while waiting, but relock on return

std::cout << "CV notified\n" << std::flush;

mtx.lock();  // This attempts to lock the mutex again, but will deadlock since unique_lock has already invoked mutex.lock() in its constructor.

The fix is pretty close to what you have with those curly braces uncommented. Just make sure you only have one lock active at a time on the mutex.
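A minimal sketch of "one lock active at a time", assuming a hypothetical helper `wait_then_lock_again()` and a `ready` flag that are not part of the original code. It releases the mutex through the `unique_lock` itself (`lk.unlock()`), so the lock object knows it no longer owns the mutex, rather than calling `mtx.unlock()` behind its back.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical helper (not from the question): waits for a notification,
// releases the lock, then safely locks the same mutex a second time.
bool wait_then_lock_again()
{
    std::mutex mtx;
    std::condition_variable cv;
    bool ready = false; // guarded by mtx

    // Simulate another working thread sending the notification.
    auto notifier = std::async(std::launch::async, [&] {
        std::this_thread::sleep_for(std::chrono::milliseconds(50));
        {
            std::lock_guard<std::mutex> guard(mtx);
            ready = true;
        }
        cv.notify_all();
    });

    std::unique_lock<std::mutex> lk(mtx);
    while (!ready)
    {
        cv.wait(lk); // unlocks while waiting, relocks before returning
    }
    assert(lk.owns_lock()); // the mutex is held again at this point
    lk.unlock();            // release it before locking mtx anywhere else

    // Second critical section: fine, because this thread holds no lock on mtx.
    std::lock_guard<std::mutex> guard(mtx);
    return ready;
}
```

Calling `wait_then_lock_again()` from `main` should return `true` without hanging, because the thread never holds two locks on `mtx` at the same time.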

Also, as you have it, your code is prone to spurious wake-ups. Here are some adjustments for you. You should always stay in the wait loop until the condition or state (usually guarded by the mutex itself) has actually occurred. For a simple notification, a bool will do.

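#include <chrono>
#include <condition_variable>
#include <future>
#include <iostream>
#include <mutex>
#include <thread>

int main()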
{
    std::mutex mtx;
    std::condition_variable cv;
    bool condition = false;

    //simulate another working thread sending notification
    auto as = std::async([&cv, &mtx, &condition](){   
                                    std::this_thread::sleep_for(std::chrono::seconds(2));
                                    mtx.lock();
                                    condition = true;
                                    mtx.unlock();
                                    cv.notify_all();});

    std::unique_lock<std::mutex> lk(mtx); // acquire the mutex lock
    while (!condition)
    {
        cv.wait(lk);
    }

    std::cout << "CV notified\n" << std::flush;

    //do something - while still under the lock

    return 0;
}
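The same loop can also be written with the predicate overload of `wait`, which re-tests the predicate with the lock held after every wake-up. The sketch below is only an illustration: the helper name `wait_for_value()` and the `value` payload are assumptions added here, not part of the answer above.

```cpp
#include <cassert>
#include <condition_variable>
#include <future>
#include <mutex>

// Hypothetical helper: waits until another thread publishes a value.
int wait_for_value()
{
    std::mutex mtx;
    std::condition_variable cv;
    bool condition = false; // guarded by mtx
    int value = 0;          // guarded by mtx

    auto as = std::async(std::launch::async, [&] {
        {
            // lock_guard unlocks automatically, even if an exception is thrown
            std::lock_guard<std::mutex> guard(mtx);
            value = 42;
            condition = true;
        }
        cv.notify_all();
    });

    std::unique_lock<std::mutex> lk(mtx);
    // Behaves like: while (!condition) cv.wait(lk);
    cv.wait(lk, [&] { return condition; });
    assert(lk.owns_lock()); // still under the lock, so reading value is safe
    return value;
}
```

Because `value` is read while the lock is held, the waiter sees the value written by the notifying thread.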
selbie
  • Do you have examples of the need for relocking on return. What would be the consequences of not relocking on return ? – NGI Jan 24 '20 at 00:03
  • I just updated my answer with some suggestions to your code. When you invoke `cv.wait` it expects that you have passed it a `unique_lock` instance whose constructor has locked the mutex. `cv.wait` will immediately unlock the mutex so as to allow the other thread to acquire the mutex and make changes to program state. When the other thread notifies, the thread blocked on `cv.wait` will attempt to acquire the mutex again before returning. So you can't "not relock". You just need to structure your code such that the lock's lifetime is properly maintained. – selbie Jan 24 '20 at 00:12
  • Thanks for your update. I had shortened the sample code. I usually use this construction or the second overload of wait(), which takes a predicate as its second argument. So yes! Is the need to relock dictated only by the need to test the condition again (to guard against spurious wake-ups)? Is there any case where you do not care, like atomic condition testing on some platform (OK, not portable) or a periodic (not hard-timed) task where you just don't care? – NGI Jan 24 '20 at 00:33
  • @NGI - If you are asking "why does cv.wait() relock on return", it's exactly as you suggested - so you can retest the condition. The predicate version of wait is simply a loop that runs the predicate (with the lock in effect) and calls the non-predicate version of wait for as long as the predicate returns false. – selbie Jan 24 '20 at 01:16

Because the condition wait might return for reasons other than being notified, such as a signal, or just because someone else wrote onto the same 64-byte cache line. Or it might have been notified, but the condition is no longer true because another thread already handled it.

So the mutex is locked so that your code can check its condition (the shared state whose changes the condition variable signals) while holding the mutex. Maybe that's just a boolean value saying it's ready to go.

Do NOT skip that part. If you do, you will regret it.
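The "another thread handled it" case can be sketched with two consumers competing for a single item. Everything named here (`consume_single_item()`, the `items` queue, the `done` flag) is a hypothetical illustration, not code from this thread. Both consumers may be woken by `notify_all`, but each one re-checks the shared state under the mutex, so only one of them takes the item.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical demo: returns how many consumers actually obtained an item.
int consume_single_item()
{
    std::mutex mtx;
    std::condition_variable cv;
    std::queue<int> items; // guarded by mtx
    bool done = false;     // guarded by mtx
    int consumed = 0;      // guarded by mtx

    auto consumer = [&] {
        std::unique_lock<std::mutex> lk(mtx);
        for (;;)
        {
            // Re-check the shared state under the lock after every wake-up.
            cv.wait(lk, [&] { return !items.empty() || done; });
            assert(lk.owns_lock());
            if (items.empty())
                return; // woken because done is set; no item is left for this thread
            items.pop();
            ++consumed;
        }
    };

    std::thread a(consumer);
    std::thread b(consumer);

    {
        std::lock_guard<std::mutex> guard(mtx);
        items.push(1);
    }
    cv.notify_all(); // both consumers may wake; only one finds the item

    {
        std::lock_guard<std::mutex> guard(mtx);
        done = true;
    }
    cv.notify_all(); // let both consumers exit

    a.join();
    b.join();
    return consumed; // safe to read: both threads have finished
}
```

Whichever consumer gets the mutex first takes the item; the other finds the queue empty and exits once `done` is set, so the function returns 1.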

Zan Lynx

Let's temporarily imagine that the mutex is not locked on return from wait:

Thread 1:

Locks mutex, checks predicate (whatever that may be), and upon finding the predicate not in an acceptable form, waits for some other thread to put it in an acceptable form. The wait atomically puts thread 1 to sleep and unlocks the mutex. With the mutex unlocked, some other thread will have permission to put the predicate in the acceptable state (the predicate is not naturally thread safe).

Thread 2:

Simultaneously, this thread is trying to lock the mutex and put the predicate in a state that is acceptable for thread 1 to continue past its wait. It must do this with the mutex locked. The mutex protects the predicate from being accessed (either read or written) by more than one thread at a time.

Once thread 2 puts the predicate in an acceptable state, it notifies the condition_variable and unlocks the mutex (the order of these two actions is not relevant to this argument).

Thread 1:

Now thread 1 has been notified and we presume the hypothetical that the mutex isn't locked on return from wait. The first thing thread 1 has to do is check the predicate to see if it is actually acceptable (this could be a spurious wakeup). But it shouldn't check the predicate without the mutex being locked. Otherwise some other thread could change the predicate right after this thread checks it, invalidating the result of that check.

So the very first thing this thread has to do upon waking is lock the mutex, and then check the predicate.

So it is really more of a convenience that the mutex is locked upon return from wait. Otherwise the waiting thread would have to manually lock it 100% of the time.


Let's look again at the events as thread 1 is entering the wait: I said that the sleep and the unlock happen atomically. This is very important. Imagine if thread 1 has to manually unlock the mutex and then call wait: In this hypothetical scenario, thread 1 could unlock the mutex, and then be interrupted while another thread obtains the mutex, changes the predicate, unlocks the mutex and signals the condition_variable, all before thread 1 calls wait. Now thread 1 sleeps forever, because no thread is going to see that the predicate needs changing, and the condition_variable needs signaling.

So it is imperative that the unlock/enter-wait happen atomically. And it makes the API easier to use if the lock/exit-wait also happens atomically.
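A related, simpler case can be sketched in a single thread, assuming a hypothetical helper `wait_after_early_notify()` and a `ready` flag as the predicate. Here the notification is sent before the wait even starts. Because the predicate is tested with the mutex held, and `wait` only sleeps (releasing the mutex atomically) when the predicate is false, the early notification does no harm.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical demo: notify first, wait afterwards.
bool wait_after_early_notify()
{
    std::mutex mtx;
    std::condition_variable cv;
    bool ready = false; // the predicate, guarded by mtx

    {
        std::lock_guard<std::mutex> guard(mtx);
        ready = true; // put the predicate in the acceptable state
    }
    cv.notify_one(); // nobody is waiting yet, so nobody is woken

    std::unique_lock<std::mutex> lk(mtx);
    // The predicate is checked under the lock before sleeping.
    // It is already true, so wait() returns without blocking.
    cv.wait(lk, [&] { return ready; });
    assert(lk.owns_lock());
    return ready;
}
```

A bare `cv.wait(lk)` with no predicate would block forever in this situation, which is the single-threaded analogue of the lost notification described above.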

Howard Hinnant