There are several reasons that, taken together, are compelling.
1. The notifier needs to take a lock
Pretend that Condition.notifyUnlocked() exists.
The standard producer/consumer arrangement requires taking locks on both sides:
    def unlocked(qu,cv):  # qu is a thread-safe queue
        qu.push(make_stuff())
        cv.notifyUnlocked()

    def consume(qu,cv):
        with cv:
            while True:  # vs. other consumers or spurious wakeups
                if qu: break
                cv.wait()
            x=qu.pop()
        use_stuff(x)
This fails because both the push() and the notifyUnlocked() can intervene between the if qu: and the wait(), in which case the consumer goes to sleep waiting for a notification that has already been sent.
Writing either of
    def lockedNotify(qu,cv):
        qu.push(make_stuff())
        with cv: cv.notify()

    def lockedPush(qu,cv):
        x=make_stuff()  # don't hold the lock here
        with cv: qu.push(x)
        cv.notifyUnlocked()
works (which is an interesting exercise to demonstrate). The second form has the advantage of removing the requirement that qu be thread-safe, but at that point holding the lock across the call to notify() as well costs no additional lock acquisitions.
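As a rough, runnable counterpart to the second form: the real threading.Condition has no notifyUnlocked(), so this sketch notifies while still holding the lock. The names locked_push and run, the squaring make_stuff, and the use of collections.deque as the queue are my own assumptions for illustration.

```python
import threading
from collections import deque

def make_stuff(i):
    # Hypothetical stand-in for the answer's make_stuff().
    return i * i

def locked_push(qu, cv, i):
    x = make_stuff(i)        # don't hold the lock while producing
    with cv:
        qu.append(x)         # the deque is only touched under the lock
        cv.notify()          # the real notify() must be called with the lock held

def consume(qu, cv, out, count):
    for _ in range(count):
        with cv:
            while not qu:    # re-check: other consumers or spurious wakeups
                cv.wait()
            x = qu.popleft()
        out.append(x)        # use the item outside the lock

def run(n=100):
    qu, cv, out = deque(), threading.Condition(), []
    consumer = threading.Thread(target=consume, args=(qu, cv, out, n), daemon=True)
    consumer.start()
    for i in range(n):
        locked_push(qu, cv, i)
    consumer.join(timeout=5)
    return list(out)
```

Because the emptiness check, the wait(), the append and the notify() all happen under the same lock, the producer cannot slip in between the consumer's check and its wait().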
It remains to explain the preference for doing so, especially given that (as you observed) CPython does wake up the notified thread to have it switch to waiting on the mutex (rather than simply moving it to that wait queue).
2. The condition variable itself needs a lock
The Condition has internal data that must be protected in case of concurrent waits/notifications. (Glancing at the CPython implementation, I see the possibility that two unsynchronized notify()s could erroneously target the same waiting thread, which could cause reduced throughput or even deadlock.) It could protect that data with a dedicated lock, of course; since we need a user-visible lock already, using that one avoids additional synchronization costs.
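For reference, the actual threading.Condition enforces this rather than leaving it to convention: calling notify() without owning the lock raises RuntimeError. A small check (the two function names are mine, purely for illustration):

```python
import threading

def notify_without_lock():
    cv = threading.Condition()
    try:
        cv.notify()          # the lock is not held here
    except RuntimeError as e:
        return str(e)        # CPython reports the un-acquired lock
    return None

def notify_with_lock():
    cv = threading.Condition()
    with cv:
        cv.notify()          # no waiters, so this is a harmless no-op
    return True
```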
3. Multiple wake conditions can need the lock
(Adapted from a comment on the blog post linked below.)
    def setSignal(box,cv):
        signal=False
        with cv:
            if not box.val:
                box.val=True
                signal=True
        if signal: cv.notifyUnlocked()

    def waitFor(box,v,cv):
        v=bool(v)  # to use ==
        while True:
            with cv:
                if box.val==v: break
                cv.wait()
Suppose box.val is False and thread #1 is waiting in waitFor(box,True,cv). Thread #2 calls setSignal; when it releases cv, #1 is still blocked on the condition. Thread #3 then calls waitFor(box,False,cv), finds that box.val is True, and waits. Then #2 calls notifyUnlocked(), waking #3, which is still unsatisfied and blocks again. Now #1 and #3 are both waiting, despite the fact that one of them must have its condition satisfied.
If the notification is instead sent while the lock is still held:

    def setTrue(box,cv):
        with cv:
            if not box.val:
                box.val=True
                cv.notify()
Now that situation cannot arise: either #3 arrives before the update and never waits, or it arrives during or after the update and has not yet waited, guaranteeing that the notification goes to #1, which returns from waitFor.
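A runnable sketch of the locked version, for anyone who wants to experiment with it. The Box class, the demo harness and the timeouts are assumptions of mine, and I use Condition.wait_for() (which re-checks the predicate after every wakeup) in place of the explicit loop.

```python
import threading

class Box:
    # Hypothetical holder for the shared flag; only a .val attribute is assumed.
    def __init__(self):
        self.val = False

def set_true(box, cv):
    with cv:
        if not box.val:
            box.val = True
            cv.notify()      # sent before the lock is released

def wait_for(box, v, cv, timeout=5.0):
    v = bool(v)
    with cv:
        # Returns the predicate's final value (False if the timeout expires).
        return cv.wait_for(lambda: box.val == v, timeout=timeout)

def demo():
    box, cv, results = Box(), threading.Condition(), {}

    def waiter(name, v):
        results[name] = wait_for(box, v, cv)

    t1 = threading.Thread(target=waiter, args=("t1", True), daemon=True)
    t1.start()
    set_true(box, cv)
    t1.join(timeout=10)
    return results
```

Whether t1 reaches its wait before or after set_true() runs, it ends up observing box.val as True: either it is already waiting and receives the notification, or it acquires the lock afterwards and never waits.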
4. The hardware might need a lock
With wait morphing and no GIL (in some alternate or future implementation of Python), the memory ordering (cf. Java's rules) imposed by the lock-release after notify() and the lock-acquire on return from wait() might be the only guarantee of the notifying thread's updates being visible to the waiting thread.
5. Real-time systems might need it
Immediately after the POSIX text you quoted we find:
however, if predictable scheduling behavior is required, then that mutex
shall be locked by the thread calling pthread_cond_broadcast() or
pthread_cond_signal().
One blog post contains further discussion of the rationale and history of this recommendation (as well as of some of the other issues here).