Imagine if you need a thread to wait for another thread to do something that it may or may not even currently be actively working on. For example, a thread that's waiting for a job to do may need to wait until another thread has put a job on the list of jobs it should do if that list is empty. How would you do this?
You can't just use some form of mutual exclusion. There may be long periods of time when there's no work to do and not thread holds any lock on the queue. There may just not be any work to do right now. The thread that does work needs to wait, without holding any lock, until another thread has given it some work to do.
So somewhere, there's a thread that does something like this:
- Acquire the lock that protects some shared state that another thread might be waiting for a change to. (In this case, the job queue.)
- Change the shared state to reflect the fact that the thing a thread might need to wait for has happened. (That is, put a job on the queue.)
- Release the lock and let any waiting thread(s) know that the thing has happened.
So what could our code to wait look like? Perhaps:
- Acquire the lock that protects the shared state.
- Check whether we need to wait or not. (Is there a job on the queue?)
- If we need to wait, wait. (If not, wait for a job to be placed on the queue.)
...
Oops, we have a problem. The thing we're waiting for can't happen because we hold the lock. No other thread can change the shared state. (Our thread to put a job on the queue can't touch the queue until we release the lock we acquired in step 1.)
Let's try it again:
- Acquire the lock that protects the shared state.
- Check whether we need to wait or not. (Is there a job on the queue?)
- If we don't need to wait, exit this algorithm. (If there's a job, take it off the queue, release the lock, and do it.)
- Release the lock. (So another thread can put a job on the queue.)
- Wait for the thing to happen.
...
Oops, we have another problem. What if the thing we're waiting for happens after step 4 but before step 5. Since the lock has been released, the thing we're waiting for can happen. We can't check again because we don't hold the lock. How can we ensure we don't wait for something that has already happened, which may mean waiting forever?
To solve this, we need an atomic "unlock and wait" operation. That's what wait
does. And we also need some operation that can end this wait that can be called by the thread that changed the shared state so that we no longer need to wait. That's what notify
does.