
I am going to explain my understanding of this OS construct and would appreciate some polite correction.

Here is my understanding of thread safety, stated simply.

If there is some setup where

  • X: some condition
  • Y: do something

and

if X
   do Y

is not atomic, then at some exact moment in time we could be

  • doing Y
  • while not X

and that is a problem.

By my understanding, the lowest-level solution to this is to use shared objects (mutexes). As an example, in the solution to the "Too Much Milk" Problem

    Thread A    |     Thread B
-------------------------------------
leave Note A    |  leave Note B
while Note B    |  if no Note A 
  do nothing    |    if no milk
if no milk      |      buy milk
  buy milk      |  remove Note B
remove Note A   |

Note A and Note B would be the shared objects, i.e. some piece of memory accessible by both threads A and B.

This can be generalized (beyond milk) to the 2-thread case like so:

    Thread A    |     Thread B
-------------------------------------
leave Note A    |  leave Note B
while Note B    |  if no Note A 
  do nothing    |    if X
if X            |      do Y
  do Y          |  remove Note B
remove Note A   |

and there is some way to generalize it for the N-thread case (so I'll continue referring to the 2-thread case for simplicity).
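The note-based protocol in the tables above can be sketched in runnable form. This is only an illustration in Python (the thread and variable names are mine); it works here because CPython's GIL gives plain variable reads and writes the sequentially consistent behavior that a real implementation would need memory barriers to guarantee:

```python
import threading

milk = 0
note_a = False
note_b = False

def thread_a():
    global milk, note_a
    note_a = True            # leave Note A
    while note_b:            # busy-wait: spin while Note B is up
        pass                 # do nothing
    if milk == 0:            # if X (no milk)
        milk += 1            # do Y (buy milk)
    note_a = False           # remove Note A

def thread_b():
    global milk, note_b
    note_b = True            # leave Note B
    if not note_a:           # if no Note A
        if milk == 0:        # if X (no milk)
            milk += 1        # do Y (buy milk)
    note_b = False           # remove Note B

a = threading.Thread(target=thread_a)
b = threading.Thread(target=thread_b)
a.start(); b.start()
a.join(); b.join()
print(milk)                  # 1: exactly one thread bought milk
```

Whichever way the two threads interleave, at most one of them gets past its milk check while milk is still 0, so milk ends up exactly 1.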

Possibly incorrect assumption #1: This is the lowest-level solution known (possible?).

Now, one of the deficiencies of this solution is the spinning, or busy-wait:

while Note B
  do nothing

because if do Y is an expensive task, then the thread scheduler will keep switching to Thread A to perform this check; i.e., the thread is still "awake" and consuming processing power even though we "know" all it will do is perform a check that will keep failing for some time.

The question then becomes: Is there some way we could make Thread A "sleep", so that it isn't scheduled to run until Note B is gone, and only then "wake up"?

The Condition Variable design pattern provides a solution, and it is built on top of mutexes.
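For concreteness, here is a sketch of the same milk example rewritten with a condition variable, using Python's threading.Condition purely as an illustration (the names are mine). The check is still written as a loop, but the body of that loop sleeps instead of spinning:

```python
import threading

cond = threading.Condition()   # a condition variable; it wraps its own mutex
note_b = True                  # Thread B's note is up initially
milk = 0

def thread_a():
    global milk
    with cond:                 # acquire the condition's mutex
        while note_b:          # same check as the busy-wait version, but...
            cond.wait()        # ...wait() releases the mutex and sleeps:
                               # the thread is descheduled, not spinning
        if milk == 0:
            milk += 1
                               # mutex released on leaving the with-block

def thread_b():
    global note_b, milk
    with cond:
        if milk == 0:
            milk += 1
        note_b = False         # remove Note B
        cond.notify()          # wake Thread A so the scheduler can run it again

a = threading.Thread(target=thread_a); a.start()
b = threading.Thread(target=thread_b); b.start()
a.join(); b.join()
print(milk)                    # 1: milk was bought exactly once
```

Note the idiom: wait() is always called inside a loop that re-checks the condition, because a wakeup only means "the condition may have changed", not "the condition now holds".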

Possibly incorrect assumption #2: Then, isn't there still some spinning under the hood? Is the average amount of spinning somehow reduced?

I could use the kind of logical explanation that only S.O. can provide ;)

3 Answers

1

Isn't there still some spinning under the hood?

No. That's the whole point of condition variables: It's to avoid the need for spinning.

An operating system scheduler creates a private object to represent each thread, and it keeps these objects in containers which, for purpose of this discussion, we will call queues.

Simplistic explanation:

When a thread calls condition.await(), that invokes a system call. The scheduler handles it by removing the calling thread from whatever CPU it was running on, and by putting its proxy object into a queue. Specifically, it puts it into the queue of threads that are waiting to be notified about that particular condition.

There usually is a separate queue for every different thing that a thread could wait for. If you create a mutex, the OS creates a queue of threads that are waiting to acquire the mutex. If you create a condition variable, the OS creates a queue of threads that are waiting to be notified.

Once the thread's proxy object is in that queue, nothing will wake it up until some other thread notifies the condition variable. That notification also is a system call. The OS handles it (simplest case) by moving all of the threads that were in the condition variable's queue into the global run queue. The run queue holds all of the threads that are waiting for a CPU to run on.

On some future timer tick, the OS will pick the formerly waiting thread from the run queue and set it up on a CPU.
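The queue-shuffling described above can be modeled in a few lines. This is a toy model, not real kernel code: thread proxy objects are just strings, and the two "system calls" are plain functions moving them between queues:

```python
from collections import deque

run_queue = deque()    # proxy objects for threads waiting for a CPU
cond_queue = deque()   # proxy objects for threads waiting on one condition variable

def cond_await(thread):
    # "system call": take the thread off its CPU and park its proxy
    # object on this condition variable's queue
    cond_queue.append(thread)

def cond_notify_all():
    # "system call": move every waiter to the global run queue;
    # none of the waiting threads ever executed an instruction meanwhile
    while cond_queue:
        run_queue.append(cond_queue.popleft())

cond_await("A")
cond_await("C")
cond_notify_all()
print(list(run_queue))   # ['A', 'C'] — both threads are now runnable
```

While a proxy object sits in cond_queue, its thread consumes zero CPU; that is the sense in which there is no spinning.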


Extra credit:

Bad news! The first thing the thread does after being awakened, while it's still inside the condition.await() call, is try to re-lock the mutex. But there's a chance that the thread that signalled the condition still has the mutex locked. Our victim is going to go right back to sleep again, this time waiting in the queue for the mutex.

A more sophisticated system might be able to optimize the situation by moving the thread directly from the condition variable's queue to the mutex queue without ever needing to wake it up and then put it back to sleep.

Solomon Slow
  • But doesn't this queue need to be a thread-safe queue and therefore require a lock of its own? – Brocolli Rob Dec 05 '20 at 15:38
  • Isn't this answer wrong, then? "Note that to use a conditional variable, two other elements are needed: * a condition (typically implemented by checking a flag or a counter) * a mutex that protects the condition" https://stackoverflow.com/questions/3513045/conditional-variable-vs-semaphore – Brocolli Rob Dec 05 '20 at 16:26
  • @BrocolliRob, It's been a long time since I actually worked in the code of an operating system—since before multi-CPU systems were a thing. I don't know how they coordinate kernel code between different CPUs, but in general, the rules that kernel code must obey are different from the rules that application code must obey. In the systems that I worked on, there could be many application processes, each with potentially many threads, but the kernel code itself effectively was single-threaded. – Solomon Slow Dec 07 '20 at 14:05
0

Yes: at the lowest, hardware level, instructions like compare-and-set / compare-and-swap are used in a loop that spins until the condition is met, and only then performs the set (assignment). This spin is required each time we put a thread on a queue, be it the queue for a mutex, for a condition variable, or for a processor.
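The shape of that low-level loop can be sketched as follows. Python has no user-visible CAS instruction, so this illustration (my own naming) uses a non-blocking lock.acquire(blocking=False), which behaves like an atomic test-and-set: it either atomically takes the lock and returns True, or returns False:

```python
import threading

flag = threading.Lock()   # acquire(blocking=False) acts as an atomic test-and-set

def spin_acquire():
    """Retry the atomic test-and-set until it succeeds; return the spin count."""
    spins = 0
    while not flag.acquire(blocking=False):  # the CAS-style attempt
        spins += 1                           # failed: someone else got there first
    return spins

print(spin_acquire())   # 0: with no contention the first attempt succeeds
flag.release()
```

This also illustrates the comment below: with no contention the loop body never runs at all, so the "spin" is typically just one successful atomic instruction.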

Alexei Kaigorodov
  • Technically, yes. However, that is unlikely to spin for more than a half a dozen instructions unless there is **extreme** contention on the condition involving multiple cores. – Stephen C Dec 05 '20 at 01:52
  • Yes, everything depends on contention. And extreme contention happens often for inexperienced programmers. One of the most popular questions on S.O. is "why does my parallel program run slower than the sequential one" — the answer is that the synchronization operations take more time than the computation does. – Alexei Kaigorodov Dec 05 '20 at 03:23
0

Then, isn't there still some spinning under the hood? Is the average amount of spinning somehow reduced?

That's a decision for the implementation to make. If spinning works best on the platform, then spinning can be used. But almost no spinning is required.

Typically, there's a lock somewhere at the lowest level of the implementation that protects system state. That lock is only held by any thread for a tiny split second as it manipulates that system state. Typically, you do need to spin while waiting for that inner lock.

A block on a mutex might look like this:

  1. Atomically try to acquire the mutex.
  2. If that succeeds, stop, you are done. (This is the "fast path".)
  3. Acquire the inner lock that no thread holds for more than a few instructions.
  4. Mark yourself as waiting for that mutex to be acquired.
  5. Atomically release the inner lock and set your thread as not ready-to-run.

Notice that the only place there is any spinning here is in step 3, and that's not on the fast path. No further spinning is needed: the call in step 5 does not return to this thread until the lock is conveyed to it by the thread that held it.

When a thread releases the lock, it checks the count of threads waiting for the lock. If that count is greater than zero, then instead of releasing the lock it acquires the inner lock protecting system state, picks one of the threads recorded as waiting, conveys the lock directly to that thread, and tells the scheduler to run it. That thread then sees the call in step 5 return, now holding the lock.

Again, the only waiting is on that inner lock that is used just to track what thread is waiting for what.
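The five steps above can be sketched as a class. This is my own illustrative code, not any real implementation: the class name is invented, a threading.Event stands in for the scheduler descheduling a thread in step 5, and the inner lock approximates both the atomic fast-path attempt of step 1 and the brief system-state lock of step 3:

```python
import threading
from collections import deque

class SketchMutex:
    def __init__(self):
        self._inner = threading.Lock()   # inner lock: held only for a few instructions
        self._held = False               # is the mutex currently owned?
        self._waiters = deque()          # bookkeeping: who is blocked on this mutex

    def lock(self):
        if self._try_acquire():          # steps 1-2: the fast path, no waiting
            return
        me = threading.Event()
        with self._inner:                # step 3: may spin briefly on the inner lock
            if not self._held:           # re-check: it may have been released meanwhile
                self._held = True
                return
            self._waiters.append(me)     # step 4: record ourselves as waiting
        me.wait()                        # step 5: sleep; returns only once the lock
                                         # has been conveyed to us by the releaser

    def _try_acquire(self):
        with self._inner:
            if not self._held:
                self._held = True
                return True
            return False

    def unlock(self):
        with self._inner:
            if self._waiters:                    # convey the lock to one waiter:
                self._waiters.popleft().set()    # it wakes up already owning the mutex
            else:
                self._held = False               # nobody waiting: really release it
```

Because both the waiter-list update and the release happen under the inner lock, a wakeup can never be lost between steps 4 and 5, and the conveyed lock is never released in between.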

David Schwartz