
This backtrace comes from a deadlock situation in a multi-threaded application. The other deadlocked threads are blocked inside calls to malloc(), and appear to be waiting on this thread.

I don't understand what creates this thread, since it deadlocks before calling any functions in my application:

Thread 6 (Thread 0x7ff69d43a700 (LWP 14191)):
#0  0x00007ff6a2932eec in __lll_lock_wait_private () from /usr/lib64/libc.so.6
#1  0x00007ff6a299460d in _L_lock_27 () from /usr/lib64/libc.so.6
#2  0x00007ff6a29945bd in arena_thread_freeres () from /usr/lib64/libc.so.6
#3  0x00007ff6a2994662 in __libc_thread_freeres () from /usr/lib64/libc.so.6
#4  0x00007ff6a3875e38 in start_thread () from /usr/lib64/libpthread.so.0
#5  0x00007ff6a292534d in clone () from /usr/lib64/libc.so.6

clone() is used to implement fork(), pthread_create(), and perhaps other functions. See here and here.

How can I find out if this trace comes from a fork(), pthread_create(), a signal handler, or something else? Do I just need to dig through the glibc code, or can I use gdb or some other tool? Why does this thread need the internal glibc lock? This would be useful in determining the cause of the deadlock.

Additional information and research:

malloc() is thread-safe, but not reentrant (recursive-safe) (see this and this), so malloc() is also not async-signal-safe. We don't define signal handlers for this process, so I know that we don't call malloc() from signal handlers. The deadlocked threads never call recursive functions, and callbacks are handled in a new thread, so I don't think we should need to worry about reentrancy here. (Maybe I'm wrong?)
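
To illustrate the async-signal-safety point: the classic self-deadlock looks roughly like the sketch below (hypothetical, nothing to do with Slurm; it only hangs if the signal happens to land while malloc() holds its internal lock, so it can take many iterations to trigger):

```c
/* Hypothetical sketch of the classic malloc-in-a-signal-handler deadlock.
 * If SIGALRM arrives while main() is inside malloc() and holds the arena
 * lock, the handler's malloc() blocks on that same lock in the same
 * thread, and the process hangs. */
#include <signal.h>
#include <stdlib.h>
#include <sys/time.h>

static void handler(int sig)
{
    (void)sig;
    free(malloc(64));               /* not async-signal-safe: may self-deadlock */
}

int main(void)
{
    struct itimerval it = { { 0, 1000 }, { 0, 1000 } };  /* SIGALRM every 1 ms */

    signal(SIGALRM, handler);
    setitimer(ITIMER_REAL, &it, NULL);

    for (;;)
        free(malloc(128));          /* main loop allocates constantly */
}
```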

This deadlock happens when many callbacks are being spawned to signal (ultimately kill) different processes. The callbacks are spawned in their own threads.

Are we possibly using malloc in an unsafe way?

Possibly related:

glibc malloc internals

Malloc inside of signal handler causes deadlock.

How are signal handlers delivered in a multi-threaded application?

glibc fork/malloc deadlock bug that was fixed in glibc-2.17-162.el7. This looks similar, but is NOT my bug - I'm on a fixed version of glibc.

(I've been unsuccessful in creating a minimal, complete, verifiable example. Unfortunately the only way to reproduce is with the application (Slurm), and it's quite difficult to reproduce.)

EDIT: Here's the backtrace from all the threads. Thread 6 is the trace I originally posted. Thread 1 is just waiting on a pthread_join(). Threads 2-5 are blocked inside a call to malloc(). Thread 7 is listening for messages and spawning the callbacks in new threads (threads 2-5); those callbacks eventually signal other processes.

Thread 7 (Thread 0x7ff69e672700 (LWP 12650)):
#0  0x00007ff6a291aa3d in poll () from /usr/lib64/libc.so.6
#1  0x00007ff6a3c09064 in _poll_internal (shutdown_time=<optimized out>, nfds=2,
    pfds=0x7ff6980009f0) at ../../../../slurm/src/common/eio.c:364
#2  eio_handle_mainloop (eio=0xf1a970) at ../../../../slurm/src/common/eio.c:328
#3  0x000000000041ce78 in _msg_thr_internal (job_arg=0xf07760)
    at ../../../../../slurm/src/slurmd/slurmstepd/req.c:245
#4  0x00007ff6a3875e25 in start_thread () from /usr/lib64/libpthread.so.0
#5  0x00007ff6a292534d in clone () from /usr/lib64/libc.so.6

Thread 6 (Thread 0x7ff69d43a700 (LWP 14191)):
#0  0x00007ff6a2932eec in __lll_lock_wait_private () from /usr/lib64/libc.so.6
#1  0x00007ff6a299460d in _L_lock_27 () from /usr/lib64/libc.so.6
#2  0x00007ff6a29945bd in arena_thread_freeres () from /usr/lib64/libc.so.6
#3  0x00007ff6a2994662 in __libc_thread_freeres () from /usr/lib64/libc.so.6
#4  0x00007ff6a3875e38 in start_thread () from /usr/lib64/libpthread.so.0
#5  0x00007ff6a292534d in clone () from /usr/lib64/libc.so.6

Thread 5 (Thread 0x7ff69e773700 (LWP 22471)):
#0  0x00007ff6a2932eec in __lll_lock_wait_private () from /usr/lib64/libc.so.6
#1  0x00007ff6a28af7d8 in _L_lock_1579 () from /usr/lib64/libc.so.6
#2  0x00007ff6a28a7ca0 in arena_get2.isra.3 () from /usr/lib64/libc.so.6
#3  0x00007ff6a28ad0fe in malloc () from /usr/lib64/libc.so.6
#4  0x00007ff6a3c02e60 in slurm_xmalloc (size=size@entry=24, clear=clear@entry=false,
    file=file@entry=0x7ff6a3c1f1f0 "../../../../slurm/src/common/pack.c",
    line=line@entry=152, func=func@entry=0x7ff6a3c1f4a6 <__func__.7843> "init_buf")
    at ../../../../slurm/src/common/xmalloc.c:86
#5  0x00007ff6a3b2e5b7 in init_buf (size=16384)
    at ../../../../slurm/src/common/pack.c:152
#6  0x000000000041caab in _handle_accept (arg=0x0)
    at ../../../../../slurm/src/slurmd/slurmstepd/req.c:384
#7  0x00007ff6a3875e25 in start_thread () from /usr/lib64/libpthread.so.0
#8  0x00007ff6a292534d in clone () from /usr/lib64/libc.so.6

Thread 4 (Thread 0x7ff6a4086700 (LWP 5633)):
#0  0x00007ff6a2932eec in __lll_lock_wait_private () from /usr/lib64/libc.so.6
#1  0x00007ff6a28af7d8 in _L_lock_1579 () from /usr/lib64/libc.so.6
#2  0x00007ff6a28a7ca0 in arena_get2.isra.3 () from /usr/lib64/libc.so.6
#3  0x00007ff6a28ad0fe in malloc () from /usr/lib64/libc.so.6
#4  0x00007ff6a3c02e60 in slurm_xmalloc (size=size@entry=24, clear=clear@entry=false,
    file=file@entry=0x7ff6a3c1f1f0 "../../../../slurm/src/common/pack.c",
    line=line@entry=152, func=func@entry=0x7ff6a3c1f4a6 <__func__.7843> "init_buf")
    at ../../../../slurm/src/common/xmalloc.c:86
#5  0x00007ff6a3b2e5b7 in init_buf (size=16384)
    at ../../../../slurm/src/common/pack.c:152
#6  0x000000000041caab in _handle_accept (arg=0x0)
    at ../../../../../slurm/src/slurmd/slurmstepd/req.c:384
#7  0x00007ff6a3875e25 in start_thread () from /usr/lib64/libpthread.so.0
#8  0x00007ff6a292534d in clone () from /usr/lib64/libc.so.6

Thread 3 (Thread 0x7ff69d53b700 (LWP 12963)):
#0  0x00007ff6a2932eec in __lll_lock_wait_private () from /usr/lib64/libc.so.6
#1  0x00007ff6a28af7d8 in _L_lock_1579 () from /usr/lib64/libc.so.6
#2  0x00007ff6a28a7ca0 in arena_get2.isra.3 () from /usr/lib64/libc.so.6
#3  0x00007ff6a28ad0fe in malloc () from /usr/lib64/libc.so.6
#4  0x00007ff6a3c02e60 in slurm_xmalloc (size=size@entry=24, clear=clear@entry=false,
    file=file@entry=0x7ff6a3c1f1f0 "../../../../slurm/src/common/pack.c",
    line=line@entry=152, func=func@entry=0x7ff6a3c1f4a6 <__func__.7843> "init_buf")
    at ../../../../slurm/src/common/xmalloc.c:86
#5  0x00007ff6a3b2e5b7 in init_buf (size=16384)
    at ../../../../slurm/src/common/pack.c:152
#6  0x000000000041caab in _handle_accept (arg=0x0)
    at ../../../../../slurm/src/slurmd/slurmstepd/req.c:384
#7  0x00007ff6a3875e25 in start_thread () from /usr/lib64/libpthread.so.0
#8  0x00007ff6a292534d in clone () from /usr/lib64/libc.so.6

Thread 2 (Thread 0x7ff69f182700 (LWP 19734)):
#0  0x00007ff6a2932eec in __lll_lock_wait_private () from /usr/lib64/libc.so.6
#1  0x00007ff6a28af7d8 in _L_lock_1579 () from /usr/lib64/libc.so.6
#2  0x00007ff6a28a7ca0 in arena_get2.isra.3 () from /usr/lib64/libc.so.6
#3  0x00007ff6a28ad0fe in malloc () from /usr/lib64/libc.so.6
#4  0x00007ff6a3c02e60 in slurm_xmalloc (size=size@entry=24, clear=clear@entry=false,
    file=file@entry=0x7ff6a3c1f1f0 "../../../../slurm/src/common/pack.c",
    line=line@entry=152, func=func@entry=0x7ff6a3c1f4a6 <__func__.7843> "init_buf")
    at ../../../../slurm/src/common/xmalloc.c:86
#5  0x00007ff6a3b2e5b7 in init_buf (size=16384)
    at ../../../../slurm/src/common/pack.c:152
#6  0x000000000041caab in _handle_accept (arg=0x0)
    at ../../../../../slurm/src/slurmd/slurmstepd/req.c:384
#7  0x00007ff6a3875e25 in start_thread () from /usr/lib64/libpthread.so.0
#8  0x00007ff6a292534d in clone () from /usr/lib64/libc.so.6

Thread 1 (Thread 0x7ff6a4088880 (LWP 12616)):
#0  0x00007ff6a3876f57 in pthread_join () from /usr/lib64/libpthread.so.0
#1  0x000000000041084a in _wait_for_io (job=0xf07760)
    at ../../../../../slurm/src/slurmd/slurmstepd/mgr.c:2219
#2  job_manager (job=job@entry=0xf07760)
    at ../../../../../slurm/src/slurmd/slurmstepd/mgr.c:1397
#3  0x000000000040ca07 in main (argc=1, argv=0x7fffacab93d8)
    at ../../../../../slurm/src/slurmd/slurmstepd/slurmstepd.c:172
mgarey
  • If off topic, may I ask where the appropriate place to ask this question is? I thought it was about programming, but I'm open to suggestions. – mgarey Apr 06 '18 at 22:47
  • Debugging these kinds of problems is probably some of the most difficult debugging you can do - I'm not sure there's enough information here that anyone can use to provide any pinpointed help - debugging this likely requires great insight into how your application works. (You hint about fork() and killing processes, so if you call fork() in a multithreaded program, and you do anything other than call exec() in your child process, that could be one source of this deadlock though). – nos Apr 06 '18 at 22:56
  • @nos Yes, I've been working on this particular problem for a couple months (off and on). I did read that about fork in a multithreaded program. We have many processes in our application, but this particular process that is deadlocking only ever forks/execs something once. I don't believe that is the source of the issue. Mainly I'd like to know if there's a way to tell what that backtrace originates from - do I have to search through the glibc source code? Or can I cleverly use gdb in a way I don't know how just yet? – mgarey Apr 06 '18 at 22:59
  • Notice that if you don't know if it is coming from `fork` or `pthread_create`, you're doing it wrong. You cannot **ever** safely fork a threaded program except if directly followed by `exec`/`exit`. – Antti Haapala -- Слава Україні Apr 06 '18 at 23:02
  • The backtrace appears to be from a pthread_create call. It just says your thread function is stuck waiting for the lock. Your code may not have any signal handler, but Slurm might have one? If you can strace the whole run under Slurm, you might be able to see if Slurm or something installs a signal handler. – P.P Apr 06 '18 at 23:04
  • Are ```fork``` and ```pthread_create``` the only two possible functions that could cause this kind of backtrace? I'm pretty confident it's not a ```fork``` but wanted to make sure. – mgarey Apr 06 '18 at 23:05
  • "My application" is Slurm. i.e., Slurm is deadlocking, not the application that Slurm is running, and I am trying to fix the bug in Slurm. Sorry about the confusion. I'm certain that Slurm doesn't install a signal handler in this process (the slurmstepd). Slurm does install signal handlers in other daemons. – mgarey Apr 06 '18 at 23:06
  • This particular stack trace calls start_thread(), so that's originating from pthread_create. I don't think there are many clever ways to use gdb to find the origin - you can inspect the process id if you aren't sure whether the stacktrace comes from the process you're debugging or if you're in a child process (which would be the result of fork()). Note that the stacktrace we see here could just be some innocent thread that gets deadlocked; the actual source of the issue could likely be somewhere else (i.e. the one that's holding the lock that this thread is blocked in). – nos Apr 06 '18 at 23:07
  • I'll edit the post and add the backtrace from the other threads. I meant to do that in the original post but forgot. – mgarey Apr 06 '18 at 23:10
  • Probably useful to determine which thread holds the mutex everybody is waiting on (https://stackoverflow.com/q/3483094/) – Nemo Apr 06 '18 at 23:18
  • slurm seems to call fork() many places, and it does lots of stuff in the child processes that will cause issues if slurm is a multithreaded program. (e.g. src/common/env.c , or src/slurmd/slurmstepd/step_terminate_monitor.c seems to call setenv() that could spell disaster, and container_g_add_pid() is also called in the child process which does all kinds of potential problematic calls - though afaik this will affect and cause potential deadlocks in the child processes - not the parent). – nos Apr 06 '18 at 23:20
  • @Nemo Wow, that seems incredibly useful. I didn't know that was possible. I'll definitely try that. – mgarey Apr 06 '18 at 23:21
  • Also perhaps of value is that your Thread 6 is a thread that's about to exit. It has already run your code/function, and that function has finished, causing the thread to terminate - and it's deadlocked in the internal cleanup handlers of glibc/pthreads. – nos Apr 06 '18 at 23:34
  • @nos Yes, I've noticed. Unfortunately we've inherited code that's been developed ever since 2002, some of which is potentially dangerous, as you point out. It looks like step_terminate_monitor() is called in the parent. The child (extern) just sleeps. (src/slurmd/slurmstepd/mgr.c _spawn_job_container()). Is that bad even if called in the parent and the child sleeps? Great eyes, by the way. I've been looking at this for so long I've gotten lost in some of the details. – mgarey Apr 06 '18 at 23:35
  • Thread 6 is about to exit? That's very interesting. I (naively) thought it was just starting. – mgarey Apr 06 '18 at 23:35
  • Any suggestions on a better title? I really tried hard not to make this a "plz debug for me" question, and was trying to search for additional tools to debug, but I really appreciate the suggestions and info y'all are giving. – mgarey Apr 06 '18 at 23:37
  • @Nemo I tried figuring out which thread holds the mutex. That SO answer, as well as [this](https://en.wikibooks.org/wiki/Linux_Applications_Debugging_Techniques/Deadlocks) page that talks about how to do this, all use the ```pthread_mutex_lock``` frame. I don't have a frame that calls ```pthread_mutex_lock``` - I can look at the _L_lock_1579 frame and the registers, but I don't know which one is the lock. How do I tell which register (or other) holds the lock? Should I make this a separate question instead? – mgarey Apr 11 '18 at 00:12
  • @mgarey: Those low-level locks don't record the owner. You could probably create a patched version of glibc that does so, though. – caf Apr 11 '18 at 14:00

1 Answer


The presence of start_thread() in the backtrace indicates that this is a pthread_create() thread.

__libc_thread_freeres() is a function that glibc calls at thread exit, which invokes a set of callbacks to free internal per-thread state. This indicates the thread you have highlighted is in the process of exiting.

arena_thread_freeres() is one of those callbacks. It is for the malloc arena allocator, and it moves the free list from the exiting thread's private arena to the global free list. To do this, it must take a lock that protects the global free list (this is the list_lock in arena.c).

It appears to be this lock that the highlighted thread (Thread 6) is blocked on.

The arena allocator installs pthread_atfork() handlers which lock the list lock at the start of fork() processing, and unlock it at the end. This means that while other pthread_atfork() handlers are running, all other threads will block on this lock.

Are you installing your own pthread_atfork() handlers? It seems likely that one of these may be causing your deadlock.
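
For illustration, user-installed fork handlers typically look like the sketch below (hypothetical names, not Slurm code). A lock taken in prepare() is held for the whole duration of fork() until parent()/child() release it, and anything prepare() blocks on stalls fork() itself; combined with the arena locks described above, that is a classic way to leave other threads queued up as in the backtraces.

```c
/* Hypothetical sketch of user pthread_atfork() handlers (not Slurm code).
 * prepare() runs in the thread calling fork() before the child exists;
 * parent() and child() run afterwards in the parent and child respectively. */
#include <pthread.h>

static pthread_mutex_t app_lock = PTHREAD_MUTEX_INITIALIZER;

static void prepare(void) { pthread_mutex_lock(&app_lock); }   /* before fork() */
static void parent(void)  { pthread_mutex_unlock(&app_lock); } /* after, in parent */
static void child(void)   { pthread_mutex_unlock(&app_lock); } /* after, in child */

/* Hypothetical init function; call once at startup. */
void install_fork_handlers(void)
{
    pthread_atfork(prepare, parent, child);
}
```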

caf
  • Excellent answer. Thanks for all the information about each of those internal glibc functions, and what it means for my application. Slurm does have a number of ```pthread_atfork()``` handlers. I don't recall if this process uses them, but I suspect it does. I'll check that. – mgarey Apr 09 '18 at 15:19
  • I've accepted this answer. It went above and beyond answering my original question and also helped me make progress in debugging the program. Even after taking all ```fork()``` calls out of the process (and also all ```pthread_atfork()``` handlers), the deadlock still happened (though less often). I'm only able to reproduce this on CentOS 7 glibc 2.17-*. I can't reproduce on Ubuntu 16.04 glibc 2.23, for example, even with the ```fork()``` call left in the process. I suspect a glibc bug. – mgarey Apr 23 '18 at 23:24
  • It turns out this bug was caused by pthread_cancel using asynchronous cancel type. Essentially, we believe that pthread_cancel was cancelling a thread while it was exiting, i.e. while it was holding the arena locks. Thus, all the other threads deadlock on the arena lock either when they call malloc or when they exit, since it is held by a thread that no longer exists. – mgarey May 02 '18 at 16:15
  • It is interesting that we couldn’t reproduce this on Debian. Something about RedHat’s patches to glibc 2.17 seems to have made this race condition much more likely. So, not a glibc bug. It's interesting because that asynchronous cancel code has been around since 2002. None of us have any idea why it was put there in the first place, but when we finally found it we knew it was the likely culprit. Getting rid of pthread_cancel and replacing it with good use of mutexes, pthread_cond_wait, and pthread_cond_signal fixes the problem. – mgarey May 02 '18 at 16:17
  • I agree with that approach. Thread cancellation - just say no. – caf May 03 '18 at 00:00
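
For reference, the replacement pattern described in the comments above - a shutdown flag plus a condition variable instead of pthread_cancel() - might look roughly like this (hypothetical sketch, not Slurm's actual fix):

```c
/* Hypothetical sketch of cooperative shutdown instead of pthread_cancel().
 * The worker checks a flag under a mutex and sleeps on a condition variable;
 * the owner sets the flag, signals, and joins.  The thread always exits
 * through its normal path, so it is never killed while holding malloc's
 * (or anyone else's) locks. */
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static bool shutdown_requested = false;

static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!shutdown_requested) {
        /* do periodic work here, then wait to be woken and re-check */
        pthread_cond_wait(&cond, &lock);
    }
    pthread_mutex_unlock(&lock);
    return NULL;                       /* normal exit: thread cleanup runs safely */
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, worker, NULL);

    /* ... later, instead of pthread_cancel(t): */
    pthread_mutex_lock(&lock);
    shutdown_requested = true;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);

    pthread_join(t, NULL);
    return 0;
}
```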