
Is it possible to obtain a backtrace of the location where a child process crashed on Linux, using C/C++ code? What I want to do is the following:

  1. fork a new child process and retrieve its PID
  2. wait for the forked child process to crash, probably using a signal handler for SIGCHLD, or using waitpid()/waitid()
  3. retrieve the stack trace of the child at the location where it crashed

This would make the parent process act like a debugger does when an attached process crashes. You can assume that the child process is compiled with debug symbols and the parent process has root permissions.

What is the simplest way to achieve such functionality?

Lisur
  • Possible duplicate of [How do I debug the child process after fork() in gdb?](https://stackoverflow.com/q/6199270/608639), [Debugging child process after fork (follow-fork-mode child configured)](https://stackoverflow.com/q/15126925/608639), [How can I switch between different processes fork() ed in gdb?](https://stackoverflow.com/q/6223786/608639), etc. – jww Sep 03 '18 at 11:41
  • Not a duplicate, since those questions use a debugger (gdb), and I am trying to achieve this without external applications such as a debugger. – Lisur Sep 03 '18 at 11:44
  • Then write that in your question ;) Any odds against simple printf debugging? – hellow Sep 03 '18 at 11:52
  • @hellow Edited the post to add that I am looking for a programmatic solution in C. Also, I'm looking for something more systematic than printf debugging. To be more specific, I am trying to avoid it because it's a large project and I want to change as few lines of code as possible. – Lisur Sep 03 '18 at 12:05

1 Answer


It is much simpler in Linux to use the libSegFault library provided as part of the GNU C library. On my system, it is installed in /lib/x86_64-linux-gnu/libSegFault.so.

All you need to do is set the SEGFAULT_SIGNALS environment variable to all (so you catch every crash cause the library supports), optionally set SEGFAULT_OUTPUT_NAME to the file the stack trace should be written to (the default is standard error), and set LD_PRELOAD to the path of the segfault library. As long as the process does not modify these environment variables, they apply to all child processes as well.

For example, if ./yourprog is the program that forks a child that crashes, and you want the stack trace written to ./yourprog.stacktrace, run

SEGFAULT_SIGNALS=all \
SEGFAULT_OUTPUT_NAME=./yourprog.stacktrace \
LD_PRELOAD=/lib/x86_64-linux-gnu/libSegFault.so \
  ./yourprog

or all in one line without the backslashes (\).

The only downside is that each crash overwrites the existing file, so you'll only see the latest one. If you have /proc mounted, then the crash dump includes both a backtrace and the memory map of the process at the crash moment.


If you insist on doing it in your own C program, I recommend you first take a look at the libSegFault sources.

The point is, the stack trace must be dumped by the process itself; it is not accessible to the parent. To do that, you inject code into the child process using e.g. the LD_PRELOAD environment variable (which is one of the dynamic linker control variables in Linux). (Note that the stack tracing etc. is done in a signal handler context, so only async-signal-safe functions should be used.)

For example, the parent process can create a pipe, and move its write end to a specific descriptor in the child process before executing the target process, with your helper preload library path in LD_PRELOAD.
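The parent side of that setup might look roughly like the following. This is only a sketch under assumptions: the function name spawn_with_dump_pipe() and the descriptor number 9 (DUMP_FD) are made up for illustration, and the preload path is whatever helper library you build yourself.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical descriptor number the preload helper is expected to write to. */
#define DUMP_FD 9

/* Fork and exec argv[0] with the write end of a new pipe placed at DUMP_FD.
 * If preload is non-NULL, it is exported as LD_PRELOAD in the child only.
 * Returns the child PID and stores the pipe's read end in *read_fd,
 * or returns -1 if pipe() or fork() fails. */
pid_t spawn_with_dump_pipe(char *const argv[], const char *preload, int *read_fd)
{
    int fds[2];

    assert(argv != NULL && argv[0] != NULL && read_fd != NULL);

    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: drop the read end, move the write end to DUMP_FD. */
        close(fds[0]);
        if (fds[1] != DUMP_FD) {
            if (dup2(fds[1], DUMP_FD) == -1)
                _exit(127);
            close(fds[1]);
        }
        if (preload != NULL && setenv("LD_PRELOAD", preload, 1) == -1)
            _exit(127);
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }

    /* Parent: keep only the read end; the dump arrives here. */
    close(fds[1]);
    *read_fd = fds[0];
    return pid;
}
```

The parent then reads from *read_fd until end-of-file and reaps the child with waitpid(). Setting LD_PRELOAD only after fork() keeps the helper library out of the parent's own environment.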

The helper preload library interposes signal(), sigaction(), and possibly sigprocmask(), sigwait(), sigwaitinfo(), and pthread_sigmask(), to ensure the helper library's crash dump signal handlers are executed when such a signal is delivered (SIGSEGV, SIGBUS, SIGILL, possibly SIGTRAP). The signal handler does the stack dump (and prints /proc/PID/maps), then sets the signal disposition to default, and re-raises the signal (using raise()).
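The core of such a handler could be sketched as below. This is a simplified illustration, not the libSegFault implementation: it assumes glibc's backtrace() and backtrace_symbols_fd() from <execinfo.h>, the names install_crash_handlers() and crash_handler() are made up, and it leaves out the interposing of signal()/sigaction() and the /proc/PID/maps dump. backtrace() is not formally listed as async-signal-safe; calling it once up front, as done here, is a common precaution so that any lazy initialization happens outside the handler.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <execinfo.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

#define MAX_FRAMES 64

/* Descriptor the dump is written to; set once before handlers are installed. */
static volatile sig_atomic_t dump_fd = STDERR_FILENO;

static void crash_handler(int signum)
{
    static const char header[] = "--- crash backtrace ---\n";
    void *frames[MAX_FRAMES];

    /* write() is async-signal-safe; its result is deliberately ignored. */
    ssize_t ignored = write(dump_fd, header, sizeof header - 1);
    (void)ignored;

    /* backtrace_symbols_fd() writes straight to the descriptor. */
    int count = backtrace(frames, MAX_FRAMES);
    backtrace_symbols_fd(frames, count, dump_fd);

    /* Restore the default disposition and re-raise, so the process still
     * dies from the original signal and the parent sees that in waitpid(). */
    signal(signum, SIG_DFL);
    raise(signum);
}

/* Install the handler for the crash signals; returns 0 on success, -1 on error. */
int install_crash_handlers(int fd)
{
    static const int signals[] = { SIGSEGV, SIGBUS, SIGILL };
    void *warmup[1];
    struct sigaction act;

    assert(fd >= 0);
    dump_fd = fd;

    /* Warm-up call (precaution, see the note above). */
    backtrace(warmup, 1);

    memset(&act, 0, sizeof act);
    act.sa_handler = crash_handler;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;

    for (size_t i = 0; i < sizeof signals / sizeof signals[0]; i++)
        if (sigaction(signals[i], &act, NULL) == -1)
            return -1;
    return 0;
}
```

In a preload library, install_crash_handlers() would typically be called from a constructor function so that it runs before the target program's main(). Symbol names in the output are only as good as the dynamic symbol table; resolving addresses to source lines is usually done afterwards with a tool such as addr2line.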

Essentially, it boils down to doing the same as libSegFault above, except with your own C code.


If you don't want to inject code into the child process, or if managing the signal handlers is too complicated, you can use ptrace instead.

When the tracee is killed by a signal (other than SIGKILL), the thread receiving the signal is stopped first ("signal-delivery-stop"), so the tracer can examine its stack (and memory map of the tracee), before letting the child process continue/die.
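A minimal single-threaded sketch of that tracer loop follows. The helper names fork_tracee() and trace_until_crash() are made up for illustration, the child opts in with PTRACE_TRACEME rather than being attached later, and the actual stack walk is left as a comment (a remote unwinder such as libunwind's ptrace support is one option, not shown here). Treating SIGTRAP as a ptrace notification rather than a crash is also a simplifying assumption.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Like fork(), but the child asks to be traced by its parent first. */
pid_t fork_tracee(void)
{
    pid_t pid = fork();
    if (pid == 0 && ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
        _exit(126);
    return pid;
}

/* Wait on a traced child. Returns the signal that killed it, 0 if it
 * exited normally, or -1 on error. If the child stops on a crash signal,
 * *info receives the siginfo describing that signal. */
int trace_until_crash(pid_t pid, siginfo_t *info)
{
    int status;

    assert(pid > 0 && info != NULL);
    memset(info, 0, sizeof *info);

    for (;;) {
        if (waitpid(pid, &status, 0) == -1)
            return -1;
        if (WIFEXITED(status))
            return 0;
        if (WIFSIGNALED(status))
            return WTERMSIG(status);
        if (!WIFSTOPPED(status))
            continue;

        int sig = WSTOPSIG(status);
        int deliver = sig;

        if (sig == SIGSEGV || sig == SIGBUS || sig == SIGILL || sig == SIGFPE) {
            /* Signal-delivery-stop: the tracee is frozen at the crash site.
             * This is where registers, stack and /proc/PID/maps would be
             * examined before the tracee is allowed to die. */
            if (ptrace(PTRACE_GETSIGINFO, pid, NULL, info) == -1)
                return -1;
        } else if (sig == SIGTRAP) {
            /* Stop caused by exec under PTRACE_TRACEME: do not deliver it. */
            deliver = 0;
        }

        /* Resume the tracee, passing the signal on so it takes effect. */
        if (ptrace(PTRACE_CONT, pid, NULL, (void *)(long)deliver) == -1)
            return -1;
    }
}
```

For a real fault (as opposed to a signal sent with raise() or kill()), info->si_addr holds the faulting address and info->si_code the cause.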

In practice, ptracing is more invasive, as there are many events that cause the tracee's threads to stop. It is also much more complicated for multithreaded processes than the LD_PRELOAD approach, because ptrace controls individual threads in the tracee; there are many more details to get right.

Nominal Animal