
thread1: lock, sleep, unlock; thread2: lock, unlock;

But thread2 can never acquire the lock, while thread1 acquires it again and again.

I've tried many times under Ubuntu 16.04 (gcc 5.4):

#include <pthread.h>
#include <unistd.h>

#include <iostream>
#include <thread>

int main() {
    pthread_mutex_t mutex;
    pthread_mutex_init(&mutex, NULL);

    // Thread 1 sleeps while holding the mutex, so the lock is held for
    // almost the entire second of every iteration.
    std::thread t1([&mutex]() {
        while (true) {
            pthread_mutex_lock(&mutex);
            std::cerr << "thread 1" << std::endl;
            sleep(1);
            pthread_mutex_unlock(&mutex);
        }
    });

    // Thread 2 holds the mutex only long enough to print.
    std::thread t2([&mutex]() {
        while (true) {
            pthread_mutex_lock(&mutex);
            std::cerr << "thread 2" << std::endl;
            pthread_mutex_unlock(&mutex);
        }
    });

    t1.join();
    t2.join();
    return 0;
}

The output is only:

thread 1 thread 1 thread 1 thread 1 thread 1 thread 1 thread 1 ...

  • Please post a **complete** example that someone else can use to try to reproduce your results. – Andrew Henle Jul 22 '19 at 09:25
  • Sounds like a normal starvation issue. Without seeing some realistic [mcve] it's going to be hard to tell why it happens. – Some programmer dude Jul 22 '19 at 09:26
  • Also, considering you use `std::thread` and lambdas, why not use the standard C++ locking primitives instead? – Some programmer dude Jul 22 '19 at 09:26
  • I don't know what else you expect from this code. There's no reason that thread 1 should wait for thread 2 to acquire and release the lock before immediately relocking it. – trent Jul 22 '19 at 09:45
  • try adding a sleep (or a [yield](https://en.cppreference.com/w/cpp/thread/yield)) *after* the unlock, and you should see a difference. – Sander De Dycker Jul 22 '19 at 09:47
  • run above code wrapped in a main to reproduce: #include <pthread.h> #include <unistd.h> #include <iostream> #include <thread> using namespace std; int main(int argc, char* argv[]) { {{ above codes }} return 0; } – user3015856 Jul 24 '19 at 14:40
  • @Someprogrammerdude it also reproduces with `std::mutex`; it doesn't matter. – user3015856 Jul 24 '19 at 14:48
  • @trentcl Obviously thread1 doesn't need to wait for thread2. But the phenomenon that thread2 can never acquire the lock is also extremely weird. – user3015856 Jul 24 '19 at 14:58
  • @SanderDeDycker adding a sleep or a yield will lead to different results, of course; I know that. I just want to find the root cause of the abnormal result with my code as written. I think it's related to how the pthread library schedules the waking of threads. – user3015856 Jul 24 '19 at 15:02
  • *But the phenomenon that thread2 can never acquire the lock is also extremely weird* -- It really, really isn't. Think about what would happen if the `sleep` wasn't there -- you'd expect to see long runs of `thread 1` and long runs of `thread 2`, not alternating `thread 1 thread 2 thread 1 thread 2`, because thread 2 will only get control when it tries to lock during the exact instant thread1 has released it, and vice versa -- so once a thread has obtained the lock it will tend to re-lock it repeatedly. The `sleep` just makes thread 2 *vastly* less likely to try the lock at the right time. – trent Jul 24 '19 at 15:14
  • And by "thread 2 tries to lock during the exact instant thread 1 has released it", I mean of course that the OS just happened to decide to suspend thread 1 and start running thread 2 after thread 1 unlocked the mutex and before it locked it. Solomon Slow's answer elaborates on this. – trent Jul 24 '19 at 15:23
  • @SanderDeDycker I tested it by adding sched_yield after thread1 releases the lock, but it doesn't work; the result is still just "thread 1" printed on and on. – user3015856 Jul 24 '19 at 15:25
  • `sched_yield` does not guarantee that another thread will be run. If the OS determines that running thread 2 is a waste of resources, *perhaps because all thread 2 ever does is wait for a lock that it never obtains*, then thread 2 may be considered low priority and thread 1 will just be woken up again. `sched_yield` should be considered a hint to the OS, not a means of synchronization. – trent Jul 24 '19 at 15:28
  • @trentcl : true, but assuming the scheduler is the default `SCHED_OTHER` and the niceness wasn't manually adjusted, only contrived examples would not actually yield. Admittedly, this might be such a contrived example due to the sleep *inside* the critical section, which is hopefully something no-one ever actually does. – Sander De Dycker Jul 25 '19 at 06:53
  • @Sander I assume you are correct; I wouldn't normally get into this level of detail so I'm unfamiliar with how all the schedulers work. However, I also see that [sched_yield() is intended for use with real-time scheduling policies (i.e., SCHED_FIFO or SCHED_RR). Use of sched_yield() with nondeterministic scheduling policies such as SCHED_OTHER is unspecified and very likely means your application design is broken.](http://man7.org/linux/man-pages/man2/sched_yield.2.html) – trent Jul 25 '19 at 10:44

2 Answers


There's a name for that problem. It's called starvation, and it often can be a problem when any thread keeps a mutex locked for an extended period of time. Each of the two threads in your example tries to keep the mutex locked all the time. (Yeah, they unlock the mutex each time around the loop, but then they re-lock it in the very next instruction.)

The best way to fix starvation is, don't keep the mutex locked any longer than necessary. One of the threads in your example calls sleep() while the mutex is locked. That is virtually always a bad idea in any real program.
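Here is a minimal sketch of that fix, using the question's own thread-1 loop with the sleep() moved outside the critical section:

// Thread 1's loop, reworked so the mutex is held only for the print.
// While thread 1 sleeps it no longer owns the mutex, so thread 2 has
// nearly a full second in which it can acquire the lock.
std::thread t1([&mutex]() {
    while (true) {
        pthread_mutex_lock(&mutex);
        std::cerr << "thread 1" << std::endl;
        pthread_mutex_unlock(&mutex);
        sleep(1);   // sleep *after* unlocking
    }
});

With this change the output interleaves instead of being all "thread 1". (The yield suggested in the comments is a weaker variant of the same idea; as noted there, sched_yield is only a hint under SCHED_OTHER, so it may not help.)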

Another way to fix it is to use a so-called "fair" mutex. Ownership of a fair mutex always is awarded to the thread that's been waiting the longest. It's less efficient than the regular kind of mutex, but it can be an easy fix for some starvation problems. Unfortunately, there may not be any standard way (or any way at all) to get a fair mutex on any given platform.
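As an illustration, here is a minimal ticket-lock sketch built from standard C++ primitives; the class name FairMutex and every detail below are ours for illustration, not a library API. Each locker takes the next ticket number and waits until that number is being served, so ownership is granted in strict FIFO order:

#include <mutex>
#include <condition_variable>

class FairMutex {
public:
    void lock() {
        std::unique_lock<std::mutex> lk(m_);
        unsigned long my_ticket = next_ticket_++;   // take a number
        cv_.wait(lk, [&] { return my_ticket == now_serving_; });
    }
    void unlock() {
        std::lock_guard<std::mutex> lk(m_);
        ++now_serving_;      // admit the next ticket holder
        cv_.notify_all();    // wake waiters to re-check their turn
    }
private:
    std::mutex m_;           // protects the counters only
    std::condition_variable cv_;
    unsigned long next_ticket_ = 0;
    unsigned long now_serving_ = 0;
};

Substituting something like this for the pthread mutex in the question would let thread 2 acquire the lock at thread 1's next unlock even with the sleep left inside the critical section, at the cost of a condition-variable handoff (and usually a context switch) on every lock transfer.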


FYI: What happens in your example is, Thread 1 wakes up from the sleep() call and releases the mutex. As soon as the mutex is released, it's a race between the two threads to see which one will get it next. Unfortunately, thread 2 was blocked awaiting the mutex when the gun went off. The OS moves thread 2 to a "run queue" where it must wait for a CPU to run on. Meanwhile, thread 1 is already running. Thread 1 grabs the mutex again, then thread 2 wakes up soon after, finds the mutex locked, and goes back to waiting.

Solomon Slow
  • If I remove the sleep from thread1, the two threads are woken up more fairly; even if thread1 acquires the lock first, thread2 can still acquire it later. By contrast, in the version with the sleep, thread2 can't acquire the lock at all. How do you explain that? – user3015856 Jul 22 '19 at 14:57
  • @user3015856 The `sleep()` isn't the real problem. The real problem is that the _very next thing_ that either thread does after releasing the mutex is, it locks the mutex again. That gives the other thread no real chance to compete because acquiring an un-fair mutex doesn't just happen: The thread itself has to execute instructions to _make_ it happen. But the "starved" thread isn't running at that moment, and the one that just released the mutex already is running, so the one that just released the mutex always wins. – Solomon Slow Jul 22 '19 at 17:11
  • Thanks for your explanations. But that still doesn't explain why, if I remove the sleep from thread1, thread1 doesn't keep winning forever once it has won (as you said, thread1 re-locks immediately after releasing the lock). – user3015856 Jul 24 '19 at 15:11
  • @user3015856 Thread 2 is _blocked_ while thread 1 is running. "Blocked" means, that the operating system has to do real work to get it running again. When thread 1 releases the lock, the OS does _not_ immediately do that work. All it does is make a note of the fact that thread 2 now is "runnable." Meanwhile, thread 1 keeps running. The very next thing thread 1 does is, it locks the lock again. _Eventually_ the OS completes the task of waking up thread 2, but when it wakes up, it finds that the lock is locked again. There's nothing else it can do at that point except go back to waiting. – Solomon Slow Jul 24 '19 at 15:20
  • If I remove the sleep from thread1, the result may look like: thread1 (repeated 20 times), thread2 (repeated 10 times), thread1 (repeated 15 times), ... But with the sleep in thread1, thread2 never acquires the lock, which is extremely weird in my view. Maybe the comment from @trentcl is the answer: the sleep just makes thread 2 vastly less likely to try the lock at the right time. – user3015856 Jul 24 '19 at 15:53

A pthread_mutex_t does not guarantee that the threads blocked on a lock() operation wait in a first-in-first-out queue.
As soon as thread1 unlocks the mutex, it goes on to the next iteration and locks it again, which succeeds!

For the scheduler it would be more work to block thread1 and wake up thread2 than just letting thread1 run after the unlock() operation.

This question is related: Implementing a FIFO mutex in pthreads
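For reference, the same ticket idea can be expressed directly with pthread primitives. This is an illustrative sketch in the spirit of the linked question, not the exact code from any answer there:

#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;       /* protects the counters */
    pthread_cond_t  cond;
    unsigned long   next_ticket;
    unsigned long   now_serving;
} fifo_mutex_t;

void fifo_mutex_init(fifo_mutex_t *m) {
    pthread_mutex_init(&m->lock, NULL);
    pthread_cond_init(&m->cond, NULL);
    m->next_ticket = 0;
    m->now_serving = 0;
}

void fifo_mutex_lock(fifo_mutex_t *m) {
    pthread_mutex_lock(&m->lock);
    unsigned long ticket = m->next_ticket++;   /* take a number */
    while (ticket != m->now_serving)           /* wait for our turn */
        pthread_cond_wait(&m->cond, &m->lock);
    pthread_mutex_unlock(&m->lock);
}

void fifo_mutex_unlock(fifo_mutex_t *m) {
    pthread_mutex_lock(&m->lock);
    m->now_serving++;                 /* hand off to the next waiter */
    pthread_cond_broadcast(&m->cond); /* broadcast: any waiter may hold the next ticket */
    pthread_mutex_unlock(&m->lock);
}

With this, thread1's re-lock on the next iteration would queue behind thread2's pending ticket instead of jumping ahead of it.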

prog-fh
  • Thanks for pointing out the mistake that we often expect it to behave like FIFO; the implementation of a FIFO mutex is also very interesting. – user3015856 Jul 24 '19 at 15:47