
Currently I'm looking at a situation where an application gets stuck while two distinct instances of `std::unique_lock` in two separate threads simultaneously "own" a lock on the same `std::mutex`. Owning in this case means that `owns_lock()` returns `true` for both locks. I'm trying to make sense of this situation, which as far as I understand should not be possible. Upon inspection with gdb I found that while the `_M_owns` property is `true` for both locks, the `__lock`, `__count` and `__owner` properties of the associated `std::mutex` are all zero at the same time. To give a clear picture, I'll go through the situation with gdb step by step:

We have two threads, thread 1 and thread 33.

(gdb) info thread
  Id   Target Id                                         Frame 
* 1    Thread 0x7ffff7e897c0 (LWP 66233) "tests" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x612000002f28) at ./nptl/futex-internal.c:57
  33   Thread 0x7fffe5ee0640 (LWP 66269) "tests" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x612000002f28) at ./nptl/futex-internal.c:57

Thread 1 and thread 33 each have their own instance `l` of `std::unique_lock` (at `0x7fffffffd240` and `0x7fffe5edf8c0` respectively) and access to the class members `vec` (`std::vector`, `0x612000002fb0`), `m` (`std::mutex`, `0x612000002ed8`) and `cv` (`std::condition_variable`, `0x612000002f00`), which are shared between both threads.

(gdb) select 5
(gdb) p &m
$14 = (std::mutex *) 0x612000002ed8
(gdb) p &cv
$15 = (std::condition_variable *) 0x612000002f00
(gdb) p &vec
$16 = (std::__debug::vector<std::shared_future<void>, std::allocator<std::shared_future<void> > > *) 0x612000002fb0
(gdb) p &l
$17 = (std::unique_lock<std::mutex> *) 0x7fffffffd240
(gdb) t 33
[Switching to thread 33 (Thread 0x7fffe5ee0640 (LWP 67477))]
#0  __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x612000002f28) at ./nptl/futex-internal.c:57
57      in ./nptl/futex-internal.c
(gdb) select 5
(gdb) p &m
$18 = (std::mutex *) 0x612000002ed8
(gdb) p &cv
$19 = (std::condition_variable *) 0x612000002f00
(gdb) p &vec
$20 = (std::__debug::vector<std::shared_future<void>, std::allocator<std::shared_future<void> > > *) 0x612000002fb0
(gdb) p &l
$21 = (std::unique_lock<std::mutex> *) 0x7fffe5edf8c0

Thread 1 is stuck on this line:

94                      cv.wait(l);

and should proceed once it has acquired the lock. Thread 33 is stuck on this line:

54                      cv.wait(l);

and should also proceed once it has acquired the lock. So let's take a look at who currently owns the lock, starting with thread 1.

(gdb) t 1
[Switching to thread 1 (Thread 0x7ffff7e897c0 (LWP 67445))]
#0  __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x612000002f28) at ./nptl/futex-internal.c:57
57      ./nptl/futex-internal.c: No such file or directory.
(gdb) select 5
(gdb) p l
$22 = {_M_device = 0x612000002ed8, _M_owns = true}
(gdb) p m
$23 = {<std::__mutex_base> = {_M_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 2, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, 
      __size = '\000' <repeats 12 times>, "\002", '\000' <repeats 26 times>, __align = 0}}, <No data fields>}

Note how `l._M_owns == true` while the `std::mutex` `m`, despite having `__nusers = 2`, does not seem to be locked.

Now looking at thread 33 we see the same situation:

(gdb) t 33
[Switching to thread 33 (Thread 0x7fffe5ee0640 (LWP 67477))]
#0  __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x612000002f28) at ./nptl/futex-internal.c:57
57      in ./nptl/futex-internal.c
(gdb) select 5
(gdb) p l
$24 = {_M_device = 0x612000002ed8, _M_owns = true}
(gdb) p m
$25 = {<std::__mutex_base> = {_M_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 2, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, 
      __size = '\000' <repeats 12 times>, "\002", '\000' <repeats 26 times>, __align = 0}}, <No data fields>}

Can anybody explain how both instances of `std::unique_lock` can simultaneously "own" the lock (i.e. `l.owns_lock() == true`) while the associated `std::mutex` in fact does not seem to be locked at all?

norritt
    It's probably undefined to call `owns_lock` inside `cv.wait` since this isn't something your program can do, only something you can do in the debugger. It makes sense that you might get a nonsensical answer from this. Both threads are probably waiting for the condition variable and the mutex is probably not locked. – user253751 May 09 '23 at 12:46

1 Answer


I expect that the `_M_owns` variable is simply not updated during `cv.wait`: updating it would be a waste of time, since the program can't see it change while it's waiting. The standard library isn't required to help you cheat with the debugger.

With that in mind, it seems probable that both threads are waiting for the condition variable and the mutex is not locked.

user253751
  • Thank you for the reply, I'll accept this answer as correct unless someone can provide evidence that something else is going on here. Assuming the explanation is correct and we cannot count on `owns_lock` to provide current information, this raises the question of how we can accurately determine who currently owns a lock. Would you agree with the statement that inspecting the `__lock` and `__owner` properties of `std::mutex` will always provide up-to-date information on the state of a mutex and, consequently, the associated locks? – norritt May 09 '23 at 15:07