9

So I just found out that it's legal to signal a condition variable if you're not holding the lock in c++11. That seems to open the door to some nasty race condition:

std::mutex m_mutex;
std::condition_variable m_cv;

T1: 
  std::unique_lock<std::mutex> lock(m_mutex);
  m_cv.wait(lock, []{ return !is_empty(); });

T2:
  generate_data();
  m_cv.notify();

Is it guaranteed that T1 will never end up in a situation where we check is_empty() first (it returning true), then getting preempted by T2 which creates some data and signals the condition variable before we can actually wait on it?

If this is guaranteed to work (I'd guess so, otherwise it would seem like an intentionally bad API design), how is this actually implemented for say linux and stdlibc++? Seems we'd need another lock to avoid this situation.

Voo
  • 29,040
  • 11
  • 82
  • 156

2 Answers2

12

Checking the predicate and waiting are not performed atomically in std::condition_variable::wait (unlocking the lock and sleeping are performed atomically). If it is possible for another thread to change the value of the predicate while this thread holds the mutex, then it is possible for notifications to occur between the predicate check and going to sleep, and effectively be lost.

In your example, if generate_data() in T2 can alter the result of is_empty() without holding m_mutex, it's possible for a notification to happen between T1 checking is_empty() and sleeping on m_cv. Holding the mutex at any time between the change to the predicate and the notification is sufficient to guarantee the atomicity of the predicate check and wait call in the other thread. That could look like:

{
  std::lock_guard<std::mutex> lk(m_mutex);
  generate_data();
}
m_cv.notify();

or even

generate_data();
std::lock_guard<std::mutex>(m_mutex); // Lock the mutex and drop it immediately
m_cv.notify();
Casey
  • 41,449
  • 7
  • 95
  • 125
  • I was a bit sloppy with the formulation here and assumed everybody would be familiar with the general idiom where you lock data publishing and notifying. But you're right that just locking the actual publishing is enough, which I guess is a good reason in some situations (I doubt though that it makes any difference for the shown code whether notify is in the lock too or outside) to allow this scenario. – Voo Jan 29 '14 at 19:00
  • @Voo Notifying inside vs. outside the lock doesn't affect correctness, notifying inside the lock wastes a bit of performance if the waiting thread (or several threads in the case of `notify_all`) wakes up only to immediately block on the mutex. – Casey Jan 29 '14 at 20:35
  • @Casey: Why will your first solution not work? i.e. taking a lock before making the call to notify()? If T2 gets the lock, T1 won't enter the wait, and when it does, the predicate is true, and hence it doesn't wait which is the expected behavior. – user1715122 Dec 23 '14 at 08:02
  • You are correct - taking the lock any time between `generate_data` and the `notify` assures that the `notify` cannot happen between the other threads check of the predicate and `wait`. – Casey Dec 23 '14 at 17:28
  • So, is "lock and immediate release" makes sure that T1 (consumer) is waiting (on m_cv)? Also consider, if #0) T2 generates data #1) T2 Locks and releases lock #2) T1 gets spurious wake-up #3) Even though notify is not called by T2 yet, T1 uses generated data and exits #4) T2 calls notify (which will be no-op as T1 is not waiting). And this is a false notify as data is already used (empty again). Is this valid scenario and will there be any issue in your code @casey? (I don't think any issue, just want to verify). – Shriganesh Shintre Jul 26 '18 at 16:38
  • Your last scenario seems incorrect. If `generate_data()` is called without holding the mutex, the changes (presumably what `is_empty()` is looking at) are not synchronized. That is undefined behavior. Acquiring and releasing the lock after that does not make it valid. – LWimsey Jan 16 '19 at 23:30
  • @Casey, I have similar question: https://stackoverflow.com/questions/54225720/is-it-necessary-to-use-a-mutex-before-cond-var-notify. It seems that the mutex is actually not needed in case you do not need to transfer the data between threads, right? – o2gy Jan 17 '19 at 11:32
  • @LWimsey I'm assuming that `generate_data` handles any necessary synchronization or requires none since the sample code in the question doesn't depict any locking around the call to `generate_data` in T2. – Casey Jan 18 '19 at 01:01
  • @o2gy If you simply want to wake up any thread that happens to be waiting, the mutex is unnecessary. I can't tell you if that solves your problem without knowing what problem you're solving. – Casey Jan 18 '19 at 01:23
3

It is not guaranteed - if you don't want to miss the signal then you must lock the mutex prior to notifying. Some applications may be agnostic about missing signals.

From man pthread_signal:

The pthread_cond_signal() or pthread_cond_broadcast() functions may be called by a thread whether or not it currently owns the mutex that threads calling pthread_cond_wait() or pthread_cond_timedwait() have associated with the condition variable during their waits; however, if predictable scheduling behaviour is required, then that mutex is locked by the thread calling pthread_cond_signal() or pthread_cond_broadcast().

Coral Kashri
  • 3,436
  • 2
  • 10
  • 22
markd
  • 218
  • 1
  • 5
  • Locking the mutex prior to signaling is not sufficient to guarantee that signals cannot be missed; it's necessary to ensure that the value of the condition (the associated predicate) cannot change while the lock is held. – Casey Jan 29 '14 at 18:51
  • 1
    What is the predicate is based on atomic counters? Doesn't it mean that now we have to take a lock before modifying atomic variables thereby defeating the whole purpose of them being atomic? – user1715122 Dec 23 '14 at 08:08