4

My OS is RHEL 7, and I run a simple Go program:

package main

import (
    "time"
)

func main() {
    time.Sleep(1000 * time.Second)
}

During its running, I check the thread count of process:

# cat /proc/13858/status | grep Thread
Threads:        5

While using the pstack command shipped on RHEL, it only prints one thread's stack:

# pstack 13858
Thread 1 (process 13858):
#0  runtime.futex () at /usr/local/go/src/runtime/sys_linux_amd64.s:307
#1  0x0000000000422580 in runtime.futexsleep (addr=0x4c7af8 <runtime.timers+24>, val=0, ns=999999997446) at /usr/local/go/src/runtime/os1_linux.go:57
#2  0x000000000040b07b in runtime.notetsleep_internal (n=0x4c7af8 <runtime.timers+24>, ns=999999997446, ~r2=255) at /usr/local/go/src/runtime/lock_futex.go:174
#3  0x000000000040b1e6 in runtime.notetsleepg (n=0x4c7af8 <runtime.timers+24>, ns=999999997446, ~r2=false) at /usr/local/go/src/runtime/lock_futex.go:206
#4  0x000000000043e5de in runtime.timerproc () at /usr/local/go/src/runtime/time.go:209
#5  0x0000000000451001 in runtime.goexit () at /usr/local/go/src/runtime/asm_amd64.s:1998
#6  0x0000000000000000 in ?? ()

Why does pstack only print one thread's content?

P.S.: The pstack script is here:

#!/bin/sh

if test $# -ne 1; then
    echo "Usage: `basename $0 .sh` <process-id>" 1>&2
    exit 1
fi

if test ! -r /proc/$1; then
    echo "Process $1 not found." 1>&2
    exit 1
fi

# GDB doesn't allow "thread apply all bt" when the process isn't
# threaded; need to peek at the process to determine if that or the
# simpler "bt" should be used.

backtrace="bt"
if test -d /proc/$1/task ; then
    # Newer kernel; has a task/ directory.
    if test `/bin/ls /proc/$1/task | /usr/bin/wc -l` -gt 1 2>/dev/null ; then
        backtrace="thread apply all bt"
    fi
elif test -f /proc/$1/maps ; then
    # Older kernel; go by it loading libpthread.
    if /bin/grep -e libpthread /proc/$1/maps > /dev/null 2>&1 ; then
        backtrace="thread apply all bt"
    fi
fi

GDB=${GDB:-/usr/bin/gdb}

# Run GDB, strip out unwanted noise.
# --readnever is no longer used since .gdb_index is now in use.
$GDB --quiet -nx $GDBARGS /proc/$1/exe $1 <<EOF 2>&1 |
set width 0
set height 0
set pagination no
$backtrace
EOF
/bin/sed -n \
    -e 's/^\((gdb) \)*//' \
    -e '/^#/p' \
    -e '/^Thread/p' 
Nan Xiao
  • 16,671
  • 18
  • 103
  • 164

2 Answers2

4

pstack uses gdb. This is a quote from golang doc (https://golang.org/doc/gdb):

GDB does not understand Go programs well. The stack management, threading, and runtime contain aspects that differ enough from the execution model GDB expects that they can confuse the debugger, even when the program is compiled with gccgo. As a consequence, although GDB can be useful in some situations, it is not a reliable debugger for Go programs, particularly heavily concurrent ones.

4 out of 5 threads that you see in /proc are created even before you program enters main. I assume that golang runtime creates them.

Why does pstack only print one thread's content?

Judging from output of strace for gdb I see that gdb actually tries to attach to them but after something goes wrong and gdb does not try to inspect these threads. These are syscalls that gdb issued for these runtime threads but due to unknown reason decided stopped investigating them immediately:

5072  ptrace(PTRACE_ATTACH, 5023, 0, 0) = 0
5072  --- SIGCHLD (Child exited) @ 0 (0) ---
5072  rt_sigreturn(0x11)                = 0
5072  ptrace(PTRACE_ATTACH, 5024, 0, 0) = 0
5072  --- SIGCHLD (Child exited) @ 0 (0) ---
5072  rt_sigreturn(0x11)                = 0
5072  ptrace(PTRACE_ATTACH, 5025, 0, 0) = 0
5072  --- SIGCHLD (Child exited) @ 0 (0) ---
5072  rt_sigreturn(0x11)                = 0

However you can inspect them yourself. It seems that these threads belong to golang runtime

$ pstack 5094
Thread 1 (process 5094):
#0  0x0000000000459243 in runtime.futex ()
#1  0x00000000004271e0 in runtime.futexsleep ()
#2  0x000000000040d55b in runtime.notetsleep_internal ()
#3  0x000000000040d64b in runtime.notetsleep ()
#4  0x0000000000435677 in runtime.sysmon ()
#5  0x000000000042e6cc in runtime.mstart1 ()
#6  0x000000000042e5d2 in runtime.mstart ()
#7  0x00000000004592b7 in runtime.clone ()
#8  0x0000000000000000 in ?? ()

$ pstack 5095
Thread 1 (process 5095):
#0  0x0000000000459243 in runtime.futex ()
#1  0x0000000000427143 in runtime.futexsleep ()
#2  0x000000000040d3f4 in runtime.notesleep ()
#3  0x000000000042f6eb in runtime.stopm ()
#4  0x0000000000430a79 in runtime.findrunnable ()
#5  0x00000000004310ff in runtime.schedule ()
#6  0x000000000043139b in runtime.park_m ()
#7  0x0000000000455acb in runtime.mcall ()
#8  0x000000c820021500 in ?? ()
#9  0x0000000000000000 in ?? ()

$ pstack 5096
Thread 1 (process 5096):
#0  0x0000000000459243 in runtime.futex ()
#1  0x0000000000427143 in runtime.futexsleep ()
#2  0x000000000040d3f4 in runtime.notesleep ()
#3  0x000000000042f6eb in runtime.stopm ()
#4  0x000000000042fff7 in runtime.startlockedm ()
#5  0x0000000000431147 in runtime.schedule ()
#6  0x000000000043139b in runtime.park_m ()
#7  0x0000000000455acb in runtime.mcall ()
#8  0x000000c820020000 in ?? ()

Update for gdb 8.0

pstack that uses gdb 8.0 correctly prints backtraces for all threas. The command looks like:

$ GDB=$HOME/bin/gdb pstack  $(pidof main)

And here is its output (shortened):

$ GDB=$HOME/bin/gdb pstack  $(pidof main) | egrep "^Thread"
Thread 4 (LWP 18335):
Thread 3 (LWP 18334):
Thread 2 (LWP 18333):
Thread 1 (LWP 18332):
  • Thanks very much for your answer! If possible, could u elaborate how to use `strace` to check `gdb`? Thx! – Nan Xiao May 27 '16 at 11:11
  • This way I see that gdb does: `strace -f -v -o gdb.txt gdb --batch -ex "thread apply all bt" test_pid `. test_pid is a golang program. By the way, Tom Tromey who is a developer of gdb answers questions on this site. So possibly wait for an answer from him –  May 27 '16 at 11:14
0

When you're passing LWP/thread id to pstack you get a stack of that thread only. Try to pass a PID of the process to pstack and you'll get stacks of all its threads. You may get a PID or Tgid (thread group id) of the process: cat /proc/13858/status | grep Tgid. To get all LWPs created by your process you may run ps -L <PID>

Mikhail Volskiy
  • 209
  • 2
  • 5
  • Yes, `13858` is just the PID, so I am very confused. Thx! – Nan Xiao May 27 '16 at 05:09
  • Yes, when GDB is started with a given PID of your Go application it somehow gets confused and `info threads` doesn't provide information about other threads. So your question actually about the issue with GDB and needs further investigation. – Mikhail Volskiy May 27 '16 at 05:52
  • @Mikhail Volskiy, Mikhail, nice to me you here. We used to work together in PS –  May 27 '16 at 14:10
  • @ Sergei Kurenkov, Me too, actually it's a small world. – Mikhail Volskiy May 27 '16 at 14:24