Why is there no std:: equivalent to pthread_spinlock_t like there is for pthread_mutex_t & std::mutex?

Question

I've used pthreads a fair bit for concurrent programs, mainly utilising spinlocks, mutexes, and condition variables.

I started looking into multithreading using std::thread and using std::mutex, and I noticed that there doesn't seem to be an equivalent to spinlock in pthreads.

Anyone know why this is?

Look at the example for `std::atomic_flag` at [cppreference](https://en.cppreference.com/w/cpp/atomic/atomic_flag) that might answer your question. — 463035818_is_not_an_ai, Apr 30 '20 at 15:18
@idclev463035818 That example is often criticised for being naive. 2 common optimizations are: `pause` instruction between retries and speculative loads on failure before retrying CAS. — Maxim Egorushkin, Apr 30 '20 at 15:30
@MaximEgorushkin good to know. It was just the first I found and it gave the impression that a spinlock can be implemented easily — 463035818_is_not_an_ai, Apr 30 '20 at 15:33
Someone correct me if I'm wrong, but I'm under the impression that modern tuning makes it so that in a 0-contention scenario, mutexes and spinlocks are not THAT different performance-wise, and that the substantial difference lies in the "very short contention windows" scenarios, which are not exactly common. In the cases where you really want that slim margin in the 0-contention case, you'll probably want something extremely fine-tuned anyway. — , Apr 30 '20 at 15:34
Spinlocks in user space code are generally considered a Bad Idea. — Shawn, Apr 30 '20 at 17:14
@Shawn: Is the standard library only for "user space code", or should we be able to implement libraries on top of it? — Nicol Bolas, Apr 30 '20 at 17:43
@NicolBolas There aren't that many OS kernels written in C++, and I'd be surprised if they use much of the C++ standard library. — Shawn, Apr 30 '20 at 18:03
@curiousguy See the answers. Basically, because the kernel isn't aware of them, you run into scheduling issues and potential poor performance. — Shawn, May 01 '20 at 02:22

Maxim Egorushkin · Accepted Answer · 2020-05-01T16:42:40.570

> there doesn't seem to be an equivalent to spinlock in pthreads.

Spinlocks are often considered the wrong tool in user space because, unlike in the kernel, there is no way to disable thread preemption while the spinlock is held. A thread can acquire a spinlock and then get preempted, causing all other threads trying to acquire the spinlock to spin unnecessarily. If those threads have a higher priority, that may cause a deadlock (threads waiting for I/O may get a priority boost on wake-up). This reasoning also applies to all lockless data structures, unless the data structure is truly wait-free (there aren't many practically useful ones, apart from boost::lockfree::spsc_queue).

In the kernel, a thread that has locked a spinlock cannot be preempted or interrupted before it releases the spinlock. That is why spinlocks are appropriate there (when RCU cannot be used).

On Linux, one can prevent preemption (not sure if completely, but there have been recent kernel changes towards such a desirable effect) by using isolated CPU cores and FIFO real-time threads pinned to those isolated cores. But that requires a deliberate kernel/machine configuration and an application designed to take advantage of that configuration. Nevertheless, people do use such a setup for business-critical applications along with lockless (but not wait-free) data structures in user space.

On Linux, there is an adaptive mutex, PTHREAD_MUTEX_ADAPTIVE_NP, which spins for a limited number of iterations before blocking in the kernel (similar to InitializeCriticalSectionAndSpinCount). However, that mutex cannot be used through the std::mutex interface because there is no option to customise the non-portable pthread_mutexattr_t before initialising pthread_mutex_t.

Nor can one enable process sharing, robustness, error checking, or priority-inversion prevention through the std::mutex interface. In practice, people write their own wrappers of pthread_mutex_t that allow setting the desired mutex attributes, along with a corresponding wrapper for condition variables. Standard locks like std::unique_lock and std::lock_guard can be reused.
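A minimal sketch of such a wrapper, assuming a POSIX platform. The class name `attr_mutex` and the helper `count_with_attr_mutex` are hypothetical and only illustrate the idea; the default type here is `PTHREAD_MUTEX_ERRORCHECK` because it is portable across POSIX systems, while `PTHREAD_MUTEX_ADAPTIVE_NP` could be passed instead on glibc.

```cpp
#include <pthread.h>

#include <cassert>
#include <cerrno>
#include <mutex>
#include <system_error>
#include <thread>
#include <vector>

// Hypothetical wrapper: owns a pthread_mutex_t and sets its type attribute
// before initialisation, which the std::mutex interface does not allow.
class attr_mutex {
public:
    // `type` is any value accepted by pthread_mutexattr_settype, e.g.
    // PTHREAD_MUTEX_ERRORCHECK (POSIX) or PTHREAD_MUTEX_ADAPTIVE_NP (glibc).
    explicit attr_mutex(int type = PTHREAD_MUTEX_ERRORCHECK) {
        pthread_mutexattr_t attr;
        check(pthread_mutexattr_init(&attr));
        int rc = pthread_mutexattr_settype(&attr, type);
        if (rc == 0) rc = pthread_mutex_init(&m_, &attr);
        pthread_mutexattr_destroy(&attr);
        check(rc);
    }
    ~attr_mutex() {
        int rc = pthread_mutex_destroy(&m_);
        assert(rc == 0);  // destructors should not throw
        (void)rc;
    }
    attr_mutex(const attr_mutex&) = delete;
    attr_mutex& operator=(const attr_mutex&) = delete;

    // lock/unlock/try_lock make the class usable with the standard lock types.
    void lock() { check(pthread_mutex_lock(&m_)); }
    void unlock() {
        int rc = pthread_mutex_unlock(&m_);
        assert(rc == 0);
        (void)rc;
    }
    bool try_lock() {
        int rc = pthread_mutex_trylock(&m_);
        if (rc == EBUSY) return false;
        check(rc);
        return true;
    }
    pthread_mutex_t* native_handle() { return &m_; }

private:
    static void check(int rc) {
        if (rc != 0) throw std::system_error(rc, std::generic_category());
    }
    pthread_mutex_t m_;
};

// Usage: std::lock_guard works unchanged with the wrapper.
long count_with_attr_mutex(int threads, int iterations) {
    attr_mutex m;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<attr_mutex> guard(m);
                ++counter;
            }
        });
    for (auto& th : pool) th.join();
    return counter;
}
```

A matching condition variable wrapper would be needed as well, since std::condition_variable only works with std::mutex; std::condition_variable_any accepts any lockable type but adds overhead.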

IMO, the std:: APIs could make provisions for setting desirable mutex and condition variable properties, such as a protected constructor that lets derived classes initialize the native_handle, but there aren't any. native_handle looks like a good way to do platform-specific things; however, a derived class needs a constructor through which it can initialize the handle appropriately. Once the mutex or condition variable is initialized, native_handle is pretty much useless, unless the idea was only to be able to pass it to (C language) APIs that expect a pointer or reference to an initialized pthread_mutex_t.

There is another example of Boost/C++ standard not accepting semaphores on the basis that they are too much of a rope to hang oneself, and that mutex (a binary semaphore, essentially) and condition variable are more fundamental and more flexible synchronisation primitives, out of which a semaphore can be built.

From the point of view of the C++ standard those are probably right decisions because educating users to use spinlocks and semaphores correctly with all the nuances is a difficult task. Whereas advanced users can whip out a wrapper for pthread_spinlock_t with little effort.
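As a rough illustration of how little code such a wrapper takes, here is a sketch assuming a platform that provides the optional POSIX spinlock API (Linux with glibc does; macOS does not). The names `spin_mutex` and `count_with_spin_mutex` are made up for this example.

```cpp
#include <pthread.h>

#include <cassert>
#include <mutex>
#include <system_error>
#include <thread>
#include <vector>

// Hypothetical wrapper around pthread_spinlock_t that satisfies the
// Lockable requirements, so std::lock_guard and std::unique_lock work.
class spin_mutex {
public:
    spin_mutex() {
        int rc = pthread_spin_init(&s_, PTHREAD_PROCESS_PRIVATE);
        if (rc != 0) throw std::system_error(rc, std::generic_category());
    }
    ~spin_mutex() {
        int rc = pthread_spin_destroy(&s_);
        assert(rc == 0);  // destructors should not throw
        (void)rc;
    }
    spin_mutex(const spin_mutex&) = delete;
    spin_mutex& operator=(const spin_mutex&) = delete;

    void lock() {
        int rc = pthread_spin_lock(&s_);
        if (rc != 0) throw std::system_error(rc, std::generic_category());
    }
    bool try_lock() { return pthread_spin_trylock(&s_) == 0; }
    void unlock() {
        int rc = pthread_spin_unlock(&s_);
        assert(rc == 0);
        (void)rc;
    }

private:
    pthread_spinlock_t s_;
};

// Usage: protect a very short critical section with the standard guard.
long count_with_spin_mutex(int threads, int iterations) {
    spin_mutex m;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<spin_mutex> guard(m);
                ++counter;
            }
        });
    for (auto& th : pool) th.join();
    return counter;
}
```

The wrapper does nothing about the preemption problem described above; it only gives pthread_spinlock_t the interface that the standard lock types expect.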

C++20 has added `counting_semaphore` and `binary_semaphore` to the standard. I am quite sure, that a future version will also add spinlocks, as they are very useful in protecting small code segments, that only update two or three values. — Kai Petzke, Apr 12 '22 at 16:41

greywolf82 · Answer 2 · 2020-04-30T18:53:02.983

2

You are right, there's no spin lock implementation in the std namespace. A spin lock is a great concept, but in user space it generally performs quite poorly. The OS doesn't know your process wants to spin, and you can often get worse results than with a mutex. Note that several platforms implement optimistic spinning, so a mutex can do a really good job. In addition, choosing how long to "pause" between loop iterations is neither trivial nor portable, and it requires fine-tuning. TL;DR: don't use a spinlock in user space unless you are really, really sure about what you are doing.

C++ Thread discussion

Article explaining how to write a spin lock with benchmark

Reply by Linus Torvalds about the above article explaining why it's a bad idea

edited Apr 30 '20 at 18:53

answered Apr 30 '20 at 17:26

greywolf82


"*OS doesn't know your process wants to spin*" That's not a bug; that's a *feature*: keeping the OS from stealing your timeslice. If you need millisecond precision, minimizing timeslice theft is really important. And lock-free coding requires having "mutexes" that keep the OS's grubby little hands off of my CPU. You use spinlocks in places where the lock won't need to be maintained for a significant period of time or where there won't be much contention (task queues, etc). That a tool can be misused is not a good reason to disallow the tool itself. – Nicol Bolas Apr 30 '20 at 17:42
@NicolBolas Did you read Linus's post that was linked? He brings up the scheduling thing... – Shawn Apr 30 '20 at 18:08
@NicolBolas: A spinlock may be necessary but is not sufficient for that kind of work, is the thing. Using a spinlock is *not* going to prevent the scheduler from terminating your timeslice; its grubby hands will intervene anyway. You have to additionally ensure the spinning thread (and the thread currently holding the lock) both remain scheduled, which requires more guarantees than the typical C++ runtime has. I think it's still a reasonable question whether C++ should provide these utilities anyway and let the developer decide, but generally it would not be the right choice. – GManNickG Apr 30 '20 at 18:48
@GManNickG: "*I think it's still a reasonable question of if C++ should provide these utilities anyway and let the developer decide*" I'm fairly sure the decision [has already been made](https://en.cppreference.com/w/cpp/thread/counting_semaphore). – Nicol Bolas Apr 30 '20 at 20:19
@NicolBolas: Unlike a spinlock, semaphores have reasonable "default" uses. You can pick a mutex or semaphore off the shelf and use its API to do the right things. You cannot pick a spinlock off a shelf and expect it to do the right thing. (Where "right" here includes more than just literally providing exclusion, but also some sense of quality and usefulness.) It wouldn't surprise me or go against any sort of expectation if C++ included a spinlock, but I imagine in response we would find many SO questions of the form "so I used a spinlock..." answered with "well, use mutex instead". :) – GManNickG Apr 30 '20 at 20:32

score 0 · Answer 3 · answered Apr 12 '22 at 18:01

Spin locks have two advantages:

They require much less storage than a std::mutex, because they do not need a queue of threads waiting for the lock. On my system, sizeof(pthread_spinlock_t) is 4, while sizeof(std::mutex) is 40.
They perform much better than std::mutex if the protected code region is small and the contention level is low to moderate.

On the downside, a poorly implemented spin lock can hog the CPU. For example, a tight loop around a compare-and-set instruction will spam the cache system with loads and loads of unnecessary writes. But that's what we have libraries for: they implement best practice and avoid common pitfalls. That most user implementations of spin locks are poor is not a reason to keep spin locks out of the library. Rather, it is a reason to put them there, to stop users from trying it themselves.
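One common way to avoid the write spam is a "test and test-and-set" loop: waiters poll with plain loads and attempt the atomic write only when the lock looks free. The sketch below is illustrative, not a tuned implementation; `ttas_spinlock`, `cpu_relax`, and `count_with_ttas` are hypothetical names, and the pause hint assumes GCC or Clang (on other architectures or compilers the helper compiles to nothing).

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hint to the CPU that this is a spin-wait loop (assumes GCC/Clang builtins).
inline void cpu_relax() noexcept {
#if defined(__x86_64__) || defined(__i386__)
    __builtin_ia32_pause();
#elif defined(__aarch64__)
    __asm__ __volatile__("yield");
#endif
}

class ttas_spinlock {
public:
    void lock() noexcept {
        for (;;) {
            // One atomic write attempt; succeeds if the lock was free.
            if (!locked_.exchange(true, std::memory_order_acquire)) return;
            // While the lock is held, wait on read-only loads so the cache
            // line can stay shared instead of bouncing between cores.
            while (locked_.load(std::memory_order_relaxed)) cpu_relax();
        }
    }
    bool try_lock() noexcept {
        // Check with a load first so a failed attempt causes no write.
        return !locked_.load(std::memory_order_relaxed) &&
               !locked_.exchange(true, std::memory_order_acquire);
    }
    void unlock() noexcept {
        assert(locked_.load(std::memory_order_relaxed));  // must be held
        locked_.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> locked_{false};
};

// Usage with the standard guard type.
long count_with_ttas(int threads, int iterations) {
    ttas_spinlock m;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<ttas_spinlock> guard(m);
                ++counter;
            }
        });
    for (auto& th : pool) th.join();
    return counter;
}
```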

There is a second problem, which arises from the scheduler: if thread A acquires the lock and then gets preempted by the scheduler before it finishes executing the critical section, another thread B could spin "forever" (or at least for many milliseconds, until thread A gets scheduled again) on that lock.

Unfortunately, there is no way for userland code to tell the kernel "please don't preempt me in this critical code section". But if we know that, under normal circumstances, the critical code section executes within 10 ns, we could at least tell thread B: "preempt yourself voluntarily if you have been spinning for over 30 ns". This is not guaranteed to return control directly to thread A, but it will stop the waste of CPU cycles that would otherwise take place. And in most scenarios where threads A and B run in the same process at the same priority, the scheduler will usually schedule thread A before thread B if B called std::this_thread::yield().

So, I am thinking about a template spin lock class that takes a single unsigned integer as a parameter: the number of memory reads in the critical section. The library then uses this parameter to calculate the appropriate number of spins before a yield() is performed. With a zero count, yield() would never be called.
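A rough sketch of what such a class might look like. The names `yielding_spinlock` and `count_with` are hypothetical, and the conversion from "memory reads" to a spin budget (64 polls per read here) is an invented placeholder, since the answer does not say how the library would calculate it.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Reads: expected number of memory reads in the critical section.
// Reads == 0 means "spin without ever yielding".
template <unsigned Reads>
class yielding_spinlock {
    // Assumption for illustration only: allow 64 polls per expected read.
    static constexpr unsigned kSpinsPerRead = 64;
    static constexpr unsigned kBudget = Reads * kSpinsPerRead;

public:
    void lock() noexcept {
        unsigned spins = 0;
        while (locked_.exchange(true, std::memory_order_acquire)) {
            // Poll with loads; once the budget is used up, give the
            // (possibly preempted) lock holder a chance to run.
            while (locked_.load(std::memory_order_relaxed)) {
                if (kBudget != 0 && ++spins >= kBudget) {
                    std::this_thread::yield();
                    spins = 0;
                }
            }
        }
    }
    void unlock() noexcept {
        assert(locked_.load(std::memory_order_relaxed));  // must be held
        locked_.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> locked_{false};
};

// Usage: the budget is chosen per lock type at compile time.
template <unsigned Reads>
long count_with(int threads, int iterations) {
    yielding_spinlock<Reads> m;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<yielding_spinlock<Reads>> guard(m);
                ++counter;
            }
        });
    for (auto& th : pool) th.join();
    return counter;
}
```

A real implementation would probably calibrate the budget against measured time rather than a fixed multiplier, because the cost of one poll varies between machines.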
