
Here is the sad story:

  1. I built an executable thedarkmod.x64 on my machine, with debug symbols moved off to a separate thedarkmod.x64.debug file.
  2. A user experienced a crash in this executable while running it under gdb. She saved a core dump, core.1600, using the gdb command generate-core-file.
  3. I downloaded this core file and opened it by starting gdb ./thedarkmod.x64 core.1600.
  4. I switched between threads and ran the bt command, but I saw garbage instead of proper stack traces.

Note: the user has my thedarkmod.x64.debug file, and when she runs bt just before saving the core dump, she sees a meaningful stack trace.


When I start gdb on the core dump, I see many warning messages like:

  • Unable to find libthread_db matching inferior's thread library, thread debugging will not be available
  • warning: .dynamic section for "libXXX.so" is not at the expected address (wrong library or version mismatch?)

According to this question, the first warning seems to imply that unless I have the same version of libthread_db.so.1 as the machine where the core dump was saved, I cannot see anything useful in a multithreaded program. I asked the user to find this file and send it to me, but that did not help. Then I asked her for the libpthread.so.0 file as well, and after some struggling with set solib-search-path, set sysroot, set auto-load safe-path, and set libthread-db-search-path I managed to get this warning replaced with "Thread debugging using libthread_db enabled", but the stack traces were still wrong.

So the questions are:

  1. Is there a way to properly inspect core dumps generated on a Linux machine with a vastly different environment? I mean a different kernel, pthreads, glibc, etc. Are there detailed instructions anywhere on how to achieve that?
  2. Is there any gdb command like generate-core-file includecode, which would embed all the code including the necessary .so libs into the core file, so that I could open it on my machine without additional hassle?

At the moment, I have to admit that Linux core dumps are pretty useless to me, since I am not ready to create a new Linux VM for every core dump submitted.


UPDATE: I managed to obtain a proper stack trace.

  1. The solution provided in the "duplicate question" did not work for me. No solib-related setting was enough, only set sysroot helped.
  2. In my particular case, the stack trace ended inside the free function of libc. Gdb could not walk through its call stack, presumably because libc was compiled with -fomit-frame-pointer, just like most libraries. Making sure gdb loads the libc.so.6 obtained from the user's machine helped.
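Gathering the libraries on the user's machine could look roughly like the sketch below. It assumes that ldd is available there and that library paths contain no spaces; the function names resolved_paths and collect_libs and the default directory name libs are made up for illustration. Note that ldd only reports libraries the executable links against, so libthread_db.so.1 (which is loaded by gdb, not by the program) and anything opened via dlopen still have to be copied by hand.

```shell
#!/bin/sh
# Sketch only: copy every shared library that ldd resolves for a binary
# into one flat directory, keeping the symlink names (libpthread.so.0 etc.).
# resolved_paths, collect_libs and "libs" are hypothetical names.

# Read ldd-style output on stdin and print the resolved file paths.
# Handles "libfoo.so.1 => /path/libfoo.so.1 (0x...)" and
# "/lib64/ld-linux-x86-64.so.2 (0x...)"; entries without a file
# (linux-vdso, "not found") are skipped.
resolved_paths() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3; next }
         $1 ~ /^\// { print $1 }'
}

# collect_libs BINARY [DEST_DIR]
collect_libs() {
    binary=$1
    dest=${2:-libs}
    mkdir -p "$dest" || return 1
    ldd "$binary" | resolved_paths | while read -r lib; do
        # cp -L follows the symlink: the copy has the real file's contents
        # but the symlink's name, which is what gdb will look for.
        cp -L "$lib" "$dest/" || echo "could not copy $lib" >&2
    done
}
```

On the user's side this would be called as `collect_libs ./thedarkmod.x64 libs`, after which the libs directory can be archived and sent along with the core file.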

Here is the full list of gdb commands which I use (each of them is necessary to make things work, except for bt of course):

# note: all the .so files obtained from user machine must be put into local directory.
#
# most importantly, the following files are necessary:
#   1. libthread_db.so.1 and libpthread.so.0: required for thread debugging.
#   2. other .so files are required if they occur in call stack.
#
# these files must also be renamed exactly as the symlinks
# i.e. libpthread-2.28.so should be renamed to libpthread.so.0

# load executable file
file ./thedarkmod.x64

# force gdb to forget about local system!
# load all .so files using local directory as root
set sysroot .

# drop dump-recorded paths to .so files
# i.e. load ./libpthread.so.0 instead of ./lib/x86_64-linux-gnu/libpthread.so.0
set solib-search-path .
# disable damn security protection
set auto-load safe-path /

# load core dump file
core core.6487

# print stacktrace
bt
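
Since I need the stack of every thread, not just the crashing one, the same commands can also be run non-interactively. The following is only a sketch: gdb_script and dump_all_threads are hypothetical helper names, and it assumes the collected .so files sit in the directory passed as the second argument. It relies on gdb's -batch and -x options and on the thread apply all bt command.

```shell
#!/bin/sh
# Sketch only: replay the gdb setup above in batch mode and print the
# backtrace of every thread. Helper names are hypothetical.

# gdb_script EXECUTABLE LIBDIR COREFILE
# Print the gdb commands, in the same order as the interactive session.
gdb_script() {
    cat <<EOF
file $1
set sysroot $2
set solib-search-path $2
set auto-load safe-path /
core $3
thread apply all bt
EOF
}

# dump_all_threads EXECUTABLE LIBDIR COREFILE
# Write the commands to a temporary file and let gdb execute them.
dump_all_threads() {
    script=$(mktemp) || return 1
    gdb_script "$1" "$2" "$3" > "$script"
    gdb -batch -x "$script"
    status=$?
    rm -f "$script"
    return $status
}
```

For example, `dump_all_threads ./thedarkmod.x64 . core.6487 > stacks.txt` would save all stack traces to a file for later reading.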
stgatilov
  • You probably need the same version of all shared libraries the program uses, not just libthread_db.so. – ssbssa Sep 15 '21 at 10:34
  • @ssbssa, I don't understand why I need them if I don't want to debug them. I only want to see stacktrace of each thread with proper names of functions from my executable, I don't care if library functions will be shown as meaningless addresses. – stgatilov Sep 15 '21 at 12:38
  • But it will not be able to get the correct return address of any functions of these libraries, so the stacktrace from them on will not make much sense. – ssbssa Sep 15 '21 at 15:24
  • @ssbssa, the concept of frame pointers allows fetching the call stack without having code or debug symbols. That's how things work on normal platforms, and I seriously doubt Linux x64 is any different in this regard. – stgatilov Sep 15 '21 at 16:27
  • So you built this application with `-fno-omit-frame-pointer`? – ssbssa Sep 15 '21 at 16:37
  • @ssbssa, Yes, `-fno-omit-frame-pointer` is included in the compilation options of the application. Do you suppose that glibc was compiled with `-fomit-frame-pointer`, so I cannot see the call stack because the crash is in malloc, and the user can see it because she has debug symbols for that version of glibc?... – stgatilov Sep 15 '21 at 16:50
  • Yes, I think that's very likely. – ssbssa Sep 15 '21 at 16:52
  • @ssbssa, your idea was correct. By taking libc.so.6 from user and more f_cking with gdb commands, I managed to get proper stacktrace. – stgatilov Sep 22 '21 at 04:33
