Currently I'm looking at a situation where an application gets stuck while two distinct instances of std::unique_lock in two separate threads simultaneously "own" a lock on the same std::mutex. Owning in this case means that owns_lock() returns true for both locks. I'm trying to make sense of this situation, which as far as I understand should not be possible. Upon inspection with gdb I found that while the _M_owns member is true for both locks, the __lock, __count and __owner members of the associated std::mutex are all zero at the same time. To give a clear picture I'll go through the situation with gdb step by step:
We have two threads, thread 1 and thread 33.
(gdb) info thread
Id Target Id Frame
* 1 Thread 0x7ffff7e897c0 (LWP 66233) "tests" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x612000002f28) at ./nptl/futex-internal.c:57
33 Thread 0x7fffe5ee0640 (LWP 66269) "tests" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x612000002f28) at ./nptl/futex-internal.c:57
Thread 1 and thread 33 each have their own instance l of std::unique_lock (at 0x7fffffffd240 and 0x7fffe5edf8c0, respectively) and access to the class members vec (std::vector, 0x612000002fb0), m (std::mutex, 0x612000002ed8) and cv (std::condition_variable, 0x612000002f00), which are shared between both threads.
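For reference, the shared state boils down to something like this; only the names vec, m and cv correspond to the members inspected below, the enclosing struct and everything else is just an illustrative sketch:

#include <condition_variable>
#include <future>
#include <mutex>
#include <vector>

// Sketch of the members shared between thread 1 and thread 33; only the
// names vec, m and cv are taken from the gdb session, the rest is illustrative.
struct SharedState {
    std::vector<std::shared_future<void>> vec;
    std::mutex m;
    std::condition_variable cv;
};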
(gdb) select 5
(gdb) p &m
$14 = (std::mutex *) 0x612000002ed8
(gdb) p &cv
$15 = (std::condition_variable *) 0x612000002f00
(gdb) p &vec
$16 = (std::__debug::vector<std::shared_future<void>, std::allocator<std::shared_future<void> > > *) 0x612000002fb0
(gdb) p &l
$17 = (std::unique_lock<std::mutex> *) 0x7fffffffd240
(gdb) t 33
[Switching to thread 33 (Thread 0x7fffe5ee0640 (LWP 67477))]
#0 __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x612000002f28) at ./nptl/futex-internal.c:57
57 in ./nptl/futex-internal.c
(gdb) select 5
(gdb) p &m
$18 = (std::mutex *) 0x612000002ed8
(gdb) p &cv
$19 = (std::condition_variable *) 0x612000002f00
(gdb) p &vec
$20 = (std::__debug::vector<std::shared_future<void>, std::allocator<std::shared_future<void> > > *) 0x612000002fb0
(gdb) p &l
$21 = (std::unique_lock<std::mutex> *) 0x7fffe5edf8c0
Thread 1 is stuck on this line:
94 cv.wait(l);
and should proceed once it has acquired the lock. Thread 33 is stuck on this line:
54 cv.wait(l);
and should also proceed once it has acquired the lock.
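Both wait sites follow essentially the same pattern; roughly sketched below, building on the SharedState sketch above (the function name and the wait predicate are assumptions, the real condition is not relevant here):

// Rough sketch of what each thread does around lines 54 and 94; the
// function name and the predicate are hypothetical.
void wait_for_work(SharedState& s)
{
    std::unique_lock<std::mutex> l(s.m);  // the unique_lock 'l' inspected in gdb
    while (s.vec.empty())                 // hypothetical predicate
        s.cv.wait(l);                     // wait() unlocks s.m while blocked and
                                          // relocks it before returning
}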
So let's take a look at who currently owns the lock, starting with thread 1.
(gdb) t 1
[Switching to thread 1 (Thread 0x7ffff7e897c0 (LWP 67445))]
#0 __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x612000002f28) at ./nptl/futex-internal.c:57
57 ./nptl/futex-internal.c: No such file or directory.
(gdb) select 5
(gdb) p l
$22 = {_M_device = 0x612000002ed8, _M_owns = true}
(gdb) p m
$23 = {<std::__mutex_base> = {_M_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 2, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
__size = '\000' <repeats 12 times>, "\002", '\000' <repeats 26 times>, __align = 0}}, <No data fields>}
Note how l._M_owns = true while the std::mutex m, despite __nusers being 2, does not appear to be locked.
Now looking at thread 33 we see the same situation:
(gdb) t 33
[Switching to thread 33 (Thread 0x7fffe5ee0640 (LWP 67477))]
#0 __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x612000002f28) at ./nptl/futex-internal.c:57
57 in ./nptl/futex-internal.c
(gdb) select 5
(gdb) p l
$24 = {_M_device = 0x612000002ed8, _M_owns = true}
(gdb) p m
$25 = {<std::__mutex_base> = {_M_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 2, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
__size = '\000' <repeats 12 times>, "\002", '\000' <repeats 26 times>, __align = 0}}, <No data fields>}
Can anybody explain how both instances of std::unique_lock can simultaneously "own" the lock (i.e. l.owns_lock() == true) while the associated std::mutex does not in fact appear to be locked at all?