Should condition_variable.notify_all be covered by mutex lock?

Question

I have implemented a class that allows me to synchronize threads with a condition variable. I have found conflicting information as to whether the notify_all should be done within the lock, or outside of the lock. I have found examples constructed both ways.

The argument for releasing the lock first is to prevent the waiting thread from blocking on the mutex right after being released by the notify.

The argument against releasing the lock first is an assertion that notifications could be missed by waiting thread(s).

The two versions of the release function are here:

// version 1 - unlock then notify.
void release(int address = 1)
{
    {
        std::lock_guard<std::mutex> lk(_address_mutex);
        _address = address;
    }
    _cv.notify_all();
}

// version 2 - notify then unlock
void release(int address = 1)
{
    std::lock_guard<std::mutex> lk(_address_mutex);
    _address = address;
    _cv.notify_all();
}

For reference, the wait code looks like this:

bool wait(const std::chrono::microseconds dur, int address = 1)
{
    std::unique_lock<std::mutex> lk(_address_mutex);
    if (_cv.wait_for(lk, dur, [&] {return _address == address; }))
    {
        _address = 0;
        return true;
    }
    return false;
}

Is there a risk of waiting threads missing notifications in version 1, where the mutex is allowed to go out of scope before the notify_all? If so, how does it happen? (It is not obvious to me how this causes missed notifications.)

I can see clearly how keeping the mutex locked during the notify causes the waiting threads to immediately go into a wait. But this is a small price to pay if it prevents missed notifies.

In theory it is more efficient to release the lock before calling *notify* but in practice most compilers optimize away the potential inefficiency that might occur if you don't. — Galik, Feb 24 '18 at 02:17
There is some interesting doscussion here: https://github.com/isocpp/CppCoreGuidelines/issues/925 — Galik, Feb 24 '18 at 02:36
Both is ok, see manual [pthread_cond_broadcast](http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_cond_broadcast.html), and see similar question https://stackoverflow.com/questions/17101922/do-i-have-to-acquire-lock-before-calling-condition-variable-notify-one/ you needn't hold the lock when you broadcast. — superK, Feb 24 '18 at 03:12

score 8 · Accepted Answer · answered Feb 24 '18 at 12:54

8

There is no risk releasing the lock if the mutex was held in some interval between the state of the condition test changing and thr notification.

{
    std::lock_guard<std::mutex> lk(_address_mutex);
    _address = address;
}
_cv.notify_all();

here, the mutex was unlocked after _address was changed. So no risk.

If we modify _address to be atomic, naively this looks correct:

{
    std::lock_guard<std::mutex> lk(_address_mutex);
}
_address = address;
_cv.notify_all();

but it is not; here, the mutex is released for the entire period between modifying the condition test and notification,

_address = address;
{
    std::lock_guard<std::mutex> lk(_address_mutex);
}
_cv.notify_all();

The above, however, becomes correct again (if more than a little strange).

The risk is that the condition test will be evaluated with the mutex active (as false), then changed, then notification sent, then the waiting thread waits on a notification and releases the mutex.

waiting|signalling
lock
test
        test changed
        notification
listen+unlock

the above is an example of a missed notification.

So long as we have the mutex held anywhere after the test change and before the notification it cannot happen.

answered Feb 24 '18 at 12:54

Yakk - Adam Nevraumont

262,606
27
330
524

I edited your post to show some timing examples to see if I understand exactly what you are saying. – ttemple Feb 24 '18 at 18:43
1

@ttemple that is the wrong spot. In any case, you seem to not understand that the mutex is locked when the test is run. The mutex is then unlocked atomically with waiting for a signal (nothing can happen between those two in another thread). – Yakk - Adam Nevraumont Feb 24 '18 at 19:56
So before 'test condition' the mutex will lock? Then release if the condition fails? – ttemple Feb 24 '18 at 20:26
1

@ttemple the mutex is locked while the lambda test is run. If it fails, it atomically releases the mutex and waits on the signal. cpp reference has a good description of how the lambda test works. – Yakk - Adam Nevraumont Feb 24 '18 at 20:34
When (if) the waiting thread is released, it automatically re-locks the mutex, correct? – ttemple Feb 24 '18 at 20:45
@ttemple Except exception triggered release depending on standard version. – Yakk - Adam Nevraumont Feb 25 '18 at 00:06

score 2 · Answer 2 · answered Jul 11 '22 at 18:37

There is a scenario where it is crucial that the lock is held while notify_all is called: when the condition_variable is destroyed after waiting, a spurious wake-up can cause the wait to be ended and the condition_variable destroyed before notify_all is called on the already destroyed object.

Should condition_variable.notify_all be covered by mutex lock?

2 Answers2

Linked