I am having problems debugging a multi-threaded C++ application on ARMv7 targets. The issue shows up on two different ARM targets, for which I use different toolchains. In both cases gdb reports:
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
I've checked some other threads on this topic, but (since I get the same issue with a minimal multi-threaded program) it seems that I don't have
- a corrupted stack
- any issues with virtual functions or function pointers
Mostly I'm using the target Toradex Colibri iMX6 which has an Angstrom Linux 2016.12 running on it.
Questions
- Is there something wrong with how I build the program?
- Is there something wrong with how I'm using gdbserver/gdb?
- Which options do I have to fix the debugger output?
I debug via gdbserver on the target and the toolchain's arm-linux-gnueabihf-gdb on my host.
There's no native gdb for any of the targets.
I can build the application for Linux x86, but can't reproduce the bug so far on the PC.
SW-problem
It seems that two of the threads get stuck, maybe due to a deadlock between two mutexes, or due to a thread trying to lock the same mutex a second time. The latter seems unlikely, although the bug showed up after I configured a mutex as recursive; I'll have to check for a second mutex used in that thread.
All other threads seem to keep running fine.
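To illustrate the first suspicion (two mutexes taken in opposite order by two threads), here is a minimal sketch. It is not code from the affected application; all names (lock_in_order, count_successful_second_locks) are made up for illustration, and it uses std::timed_mutex with a timeout so that the lock-order inversion is reported instead of hanging forever.

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical names throughout; this is not the application's code.
// Each worker locks `first`, waits until both workers hold their first
// mutex, then tries to lock `second` with a timeout instead of blocking.
static bool lock_in_order(std::timed_mutex& first, std::timed_mutex& second,
                          std::atomic<int>& holding, std::atomic<int>& finished)
{
    std::lock_guard<std::timed_mutex> hold_first(first);
    ++holding;
    while (holding.load() < 2) std::this_thread::yield();

    // With plain lock() this is the point where both threads would hang.
    bool got_second = second.try_lock_for(std::chrono::milliseconds(50));
    if (got_second) second.unlock();

    // Keep `first` locked until the other worker has finished its attempt.
    ++finished;
    while (finished.load() < 2) std::this_thread::yield();
    return got_second;
}

// Returns how many of the two workers managed to take their second mutex.
int count_successful_second_locks()
{
    std::timed_mutex a, b;
    std::atomic<int> holding(0), finished(0);
    bool got_ab = false, got_ba = false;
    std::thread t1([&] { got_ab = lock_in_order(a, b, holding, finished); });
    std::thread t2([&] { got_ba = lock_in_order(b, a, holding, finished); });
    t1.join();
    t2.join();
    return (got_ab ? 1 : 0) + (got_ba ? 1 : 0);
}
```

Calling count_successful_second_locks() from a main() and building with -std=c++11 -pthread should be enough to try it; neither worker is expected to get its second mutex, which is the situation that turns into a deadlock when lock() is used instead of try_lock_for().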
SW-build and debug configuration
Build settings:
I'm using a toolchain provided by Toradex with arm-linux-gnueabihf-g++ and the following flags:
-std=c++11 -Wall -Werror -Wextra -Wno-unused-result -Winit-self -Wmissing-include-dirs -Wpointer-arith -Wno-format-security -Wno-implicit-fallthrough -Wl,-Map=output.map -ggdb -g3 -fno-inline -O0
I pass the same program to both debuggers (i.e. to gdbserver on the target and to arm-linux-gnueabihf-gdb on the host):
(gdb) set sysroot </path/to/libs>
(gdb) file <binary>
(gdb) target remote IP:port
Shared libraries:
For shared libraries, I've copied /usr/lib and /lib from the target to the host. I then downloaded the debug libraries that are available for the target/distribution and replaced the original shared libs with those.
(gdb) info sharedlibrary
From        To          Syms Read   Shared Object Library
0x76fcf800  0x76feaa70  Yes         /path/to/libs/lib/ld-linux-armhf.so.3
0x76fb9700  0x76fbcd2c  Yes         /path/to/libs/lib/librt.so.1
0x76f940c0  0x76fa2e0c  Yes         /path/to/libs/lib/libpthread.so.0
0x76f01630  0x76f72a10  Yes (*)     /path/to/libs/usr/lib/libstdc++.so.6
0x76e14d38  0x76e48028  Yes         /path/to/libs/lib/libm.so.6
0x76e041b0  0x76e0e7ec  Yes         /path/to/libs/lib/libgcc_s.so.1
0x76cd1000  0x76dc2b10  Yes         /path/to/libs/lib/libc.so.6
0x7449c96c  0x744a29e4  Yes         /path/to/libs/lib/libnss_files.so.2
(*): Shared library is missing debugging information.
I could not find a debug library for libstdc++.so.6.
Debugging results
Debug simple single-threaded application with crash on target:
- works, i.e. does not report the error message from above
Debug simple multi-threaded application, with or without deadlock, on target:
(gdb) bt
#0 0x76d6cd44 in uname () at ../sysdeps/unix/syscall-template.S:84
#1 0x00000000 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
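For reference, a minimal multi-threaded program of this kind (the variant without a deadlock) could look roughly like the sketch below. This is an illustration only and not necessarily the exact test program; the names Gate, worker and run_parked_workers are made up. Each worker blocks on a condition variable, so interrupting the program should find every thread inside a blocking libpthread call with several frames above it.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a minimal test program; names are illustrative.
struct Gate {
    std::mutex m;
    std::condition_variable cv;
    int arrived = 0;
    bool open = false;
};

// Each worker registers itself and then blocks on the condition variable.
static void worker(Gate& g)
{
    std::unique_lock<std::mutex> lk(g.m);
    ++g.arrived;
    g.cv.notify_all();
    g.cv.wait(lk, [&g] { return g.open; });
}

// Starts `n` workers, waits until all of them are parked, releases them,
// and returns how many workers reached the gate.
int run_parked_workers(int n)
{
    Gate g;
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) threads.emplace_back(worker, std::ref(g));
    {
        std::unique_lock<std::mutex> lk(g.m);
        g.cv.wait(lk, [&g, n] { return g.arrived == n; });
        // At this point every worker is blocked in cv.wait(); a breakpoint
        // here is a convenient place to run "thread apply all bt".
        g.open = true;
    }
    g.cv.notify_all();
    for (auto& t : threads) t.join();
    return g.arrived;
}
```

A main() that calls run_parked_workers(4), built with the same flags as above plus -pthread, would be enough to compare the backtraces on the target and on the PC.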
Debug the same simple multi-threaded application, with or without deadlock, on Linux-x86:
- works
Debug buggy application on PC:
- seems to work, but we cannot reproduce the bug so far
Debug the affected application on target:
Thread 1 received signal SIGINT, Interrupt.
0x76f9facc in __lll_robust_lock_wait (futex=0x257b94 <namespace1::function()::su_place+20>, private=0)
at /usr/src/debug/glibc/2.24-r0/git/nptl/lowlevelrobustlock.c:46
46 /usr/src/debug/glibc/2.24-r0/git/nptl/lowlevelrobustlock.c: No such file or directory.
(gdb) thread apply all bt
Thread 6 (Thread 6606.6630):
#0 0x76d832c8 in __setreuid (ruid=8, euid=0)
at /usr/src/debug/glibc/2.24-r0/git/sysdeps/unix/sysv/linux/i386/setreuid.c:29
#1 0x7efff06c in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
Thread 5 (Thread 6606.6629):
#0 0x76d55d44 in uname () at ../sysdeps/unix/syscall-template.S:84
#1 0x00000000 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
Thread 4 (Thread 6606.6628):
#0 0x76d55d44 in uname () at ../sysdeps/unix/syscall-template.S:84
#1 0x00000000 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
Thread 3 (Thread 6606.6627):
#0 0x76d55d44 in uname () at ../sysdeps/unix/syscall-template.S:84
#1 0x00000000 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
Thread 2 (Thread 6606.6626):
#0 __lll_robust_lock_wait (
futex=0x25b950 <namespace_2::a_function()::a_static_member+152>, private=128)
at /usr/src/debug/glibc/2.24-r0/git/nptl/lowlevelrobustlock.c:31
#1 0x00000080 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
Thread 1 (Thread 6606.6606):
#0 0x76f9facc in __lll_robust_lock_wait (futex=0x257b94 <namespace1::function()::su_place+20>,
private=0) at /usr/src/debug/glibc/2.24-r0/git/nptl/lowlevelrobustlock.c:46
#1 0x00000002 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
Update
I was able to find the bug (a mutex deadlock) using valgrind with the PC build of the SW.
However, this question is about the problems with gdb, which I have not been able to understand or solve yet.