687

I am working on Linux with the GCC compiler. When my C++ program crashes I would like it to automatically generate a stacktrace.

My program is being run by many different users and it also runs on Linux, Windows and Macintosh (all versions are compiled using gcc).

I would like my program to be able to generate a stack trace when it crashes, and the next time the user runs it, ask them whether it is OK to send the stack trace to me so I can track down the problem. I can handle sending the info to me, but I don't know how to generate the trace string. Any ideas?

jww
KPexEA
  • 4
    backtrace and backtrace_symbols_fd are not async-signal-safe. You should not use these functions in a signal handler – Parag Bafna Jun 14 '12 at 09:08
  • 13
    backtrace_symbols calls malloc, and so must not be used in a signal handler. The other two functions (backtrace and backtrace_symbols_fd) do not have this problem, and are commonly used in signal handlers. – cmccabe Aug 02 '12 at 20:01
  • 3
    @cmccabe that is incorrect: backtrace_symbols_fd usually does not call malloc, but it may if something goes wrong in its catch_error block – Sam Saffron Dec 17 '13 at 22:24
  • 7
    It "may" in the sense that there is no POSIX spec for backtrace_symbols_fd (or any backtrace); however, GNU/Linux's backtrace_symbols_fd is specified to never call malloc, as per http://linux.die.net/man/3/backtrace_symbols_fd . Therefore, it is safe to assume that it will never call malloc on Linux. – codetaku Jul 17 '14 at 14:42
  • Not sure whether unhandled exceptions qualify as "program crash", but the method to print a stacktrace when exceptions are thrown described [in this answer](https://stackoverflow.com/a/11674810/651937) might also be interesting for you. – ingomueller.net Sep 06 '19 at 07:19
  • 1
    Are there any better solutions to this problem in the year 2021? I just want to print a stack trace like in Java or Python. – stackoverflowuser2010 Sep 17 '21 at 02:52

31 Answers

590

For Linux and I believe Mac OS X, if you're using gcc, or any compiler that uses glibc, you can use the backtrace() functions in execinfo.h to print a stacktrace and exit gracefully when you get a segmentation fault. Documentation can be found in the libc manual.

Here's an example program that installs a SIGSEGV handler and prints a stacktrace to stderr when it segfaults. The baz() function here causes the segfault that triggers the handler:

#include <stdio.h>
#include <execinfo.h>
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>


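void handler(int sig) {
  void *array[10];
  int size;   // backtrace() returns an int

  // get void*'s for all entries on the stack
  size = backtrace(array, 10);

  // print out all the frames to stderr
  // (fprintf is not async-signal-safe; tolerable in a demo, risky in production)
  fprintf(stderr, "Error: signal %d:\n", sig);
  backtrace_symbols_fd(array, size, STDERR_FILENO);
  _exit(1);   // exit() is not safe to call from a signal handler
}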

void baz() {
  int *foo = (int*)-1;    // make a bad pointer
  printf("%d\n", *foo);   // causes segfault
}

void bar() { baz(); }
void foo() { bar(); }


int main(int argc, char **argv) {
  signal(SIGSEGV, handler);   // install our handler
  foo(); // this will call foo, bar, and baz.  baz segfaults.
}

Compiling with -g -rdynamic gets you symbol info in your output, which glibc can use to make a nice stacktrace:

$ gcc -g -rdynamic ./test.c -o test

Executing this gets you this output:

$ ./test
Error: signal 11:
./test(handler+0x19)[0x400911]
/lib64/tls/libc.so.6[0x3a9b92e380]
./test(baz+0x14)[0x400962]
./test(bar+0xe)[0x400983]
./test(foo+0xe)[0x400993]
./test(main+0x28)[0x4009bd]
/lib64/tls/libc.so.6(__libc_start_main+0xdb)[0x3a9b91c4bb]
./test[0x40086a]

This shows the load module, offset, and function that each frame in the stack came from. Here you can see the signal handler on top of the stack, and the libc functions before main in addition to main, foo, bar, and baz.
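Several comments point out that a handler like the one above calls functions that are not async-signal-safe, and that exiting from the handler hides the core dump. Here is a minimal sketch of a more cautious variant, assuming Linux with glibc. The names format_signal_line, crash_handler and install_crash_handler are made up for this example. It writes with write(), uses backtrace_symbols_fd(), and relies on SA_RESETHAND plus raise() so the process still dies from the original signal.

```cpp
#include <execinfo.h>
#include <signal.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: writes "Error: signal N:\n" into buf using only plain
// loops (no stdio, no allocation). Returns the length, or 0 if buf is too small.
std::size_t format_signal_line(int sig, char* buf, std::size_t cap) {
    const char prefix[] = "Error: signal ";
    const char suffix[] = ":\n";
    char digits[12];
    std::size_t ndigits = 0;
    unsigned value = sig < 0 ? 0u : static_cast<unsigned>(sig);
    do {
        digits[ndigits++] = static_cast<char>('0' + value % 10);
        value /= 10;
    } while (value != 0);

    const std::size_t needed = (sizeof prefix - 1) + ndigits + (sizeof suffix - 1) + 1;
    if (cap < needed) return 0;

    std::size_t pos = 0;
    for (std::size_t i = 0; i + 1 < sizeof prefix; ++i) buf[pos++] = prefix[i];
    while (ndigits > 0) buf[pos++] = digits[--ndigits];
    for (std::size_t i = 0; i + 1 < sizeof suffix; ++i) buf[pos++] = suffix[i];
    buf[pos] = '\0';
    return pos;
}

extern "C" void crash_handler(int sig) {
    void* frames[64];
    const int depth = backtrace(frames, 64);

    char line[48];
    const std::size_t len = format_signal_line(sig, line, sizeof line);
    if (len > 0) {
        const ssize_t ignored = write(STDERR_FILENO, line, len);
        (void)ignored;
    }
    backtrace_symbols_fd(frames, depth, STDERR_FILENO);

    // SA_RESETHAND already restored the default action, so re-raising lets the
    // process die from the original signal (and dump core if that is enabled).
    raise(sig);
}

bool install_crash_handler(int sig) {
    assert(sig > 0);
    void* warmup[1];
    backtrace(warmup, 1);  // glibc may load libgcc lazily on first use; do it now

    struct sigaction sa;
    std::memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = crash_handler;
    sa.sa_flags = SA_RESETHAND;
    return sigaction(sig, &sa, nullptr) == 0;
}
```

Calling install_crash_handler(SIGSEGV) early in main() would be the intended use. This reduces the hazards raised in the comments but does not remove them all: backtrace() itself is not on the POSIX async-signal-safe list.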

Violet Giraffe
Todd Gamblin
  • 58
    There's also /lib/libSegFault.so which you can use with LD_PRELOAD. – CesarB Oct 23 '08 at 15:05
  • 7
    It looks like the first two entries in your backtrace output contain a return address inside the signal handler and probably one inside `sigaction()` in libc. While your backtrace appears to be correct, I have sometimes found that additional steps are necessary to ensure the actual location of the fault appears in the backtrace as it can be overwritten with `sigaction()` by the kernel. – jschmier Mar 27 '10 at 19:11
  • 9
    What would happen if the crash comes from inside malloc? Wouldn't you then hold a lock and then get stuck as "backtrace" tries to allocate memory? – Mattias Nilsson Apr 17 '12 at 06:39
  • 3
    You could then try some other stackwalking API, e.g.: DynInst's StackwalkerAPI http://www.dyninst.org/stackwalkerapi or http://www.nongnu.org/libunwind/. Generally if you expect to walk out of stack frames or interrupt frames inside malloc, you need to do special things to handle it. Many tools use their own arena allocator to avoid conflicting with the libc malloc in situations like this. – Todd Gamblin Apr 18 '12 at 00:19
  • 5
    backtrace and backtrace_symbols_fd are not async-signal-safe. You should not use these functions in a signal handler. – Parag Bafna Jun 14 '12 at 07:45
  • 2
    @ParagBafna, then **what** can we use for backtraces that is async-signal safe? – lurscher Oct 19 '12 at 02:16
  • 8
    `catchsegv` is not what the OP needs but is awesome for catching segmentation faults and getting all the information. – Matt Clarkson Jan 30 '13 at 10:45
  • 13
    For ARM, I had to also compile with -funwind-tables. Otherwise my stack depth was always 1 (empty). – jfritz42 Apr 10 '13 at 20:17
  • 2
    You can't call `exit()` safely from a signal handler. Use `_exit()` or `_Exit()`. – Carl Norum May 30 '13 at 17:35
  • @Olshansk - the same with me – ducin Jan 20 '14 at 10:27
  • 5
    Using an exit function prevents you from getting a core dump and masks the reason for exiting from the parents wait call. I usually set SA_RESETHAND to unset my sigaction signal handler after it runs, then call raise(sig) to re-raise the signal. If core dumps are enabled, then you'll get both the backtrace and the core dump. – Paul Coccoli Mar 05 '14 at 14:17
  • 3
    If you are planning to use the above code somewhere other than in a function that eventually calls `exit()`, please note that per `backtrace_symbols_fd(3)`, you need to free the `array` after you are done with it. – Marco83 May 22 '14 at 09:46
  • 4
    According to `man 7 signal`, Every single one of the functions that your handler calls is not async-signal-safe, leading to undefined behaviour. In practice, your process might deadlock (with `-lpthread`), or corrupt its memory (without). – mic_e Aug 19 '14 at 12:18
  • 5
    @mic_e: True! But this is a handler to call when your application crashes. In other words, your process has already bitten the dust, and you are in very unsafe territory. You call this in a signal handler because you want to know where the error happened, and it works in practice. If it fails (and I have not seen that happen often), you've lost nothing because your process was already dying to begin with. – Todd Gamblin Aug 22 '14 at 23:19
  • @anyone, how to generate stacktrace (in my logs) and coredump together? If we define a signal handler then we will not get coredump. I want both, suggest me a good solution. Thanks in advance. – uss Jul 24 '15 at 07:18
  • 1
    @sree: You need to add some code so that the handler unregisters itself and reverts to default signal handling, then kills itself. Adding "signal(sig, SIG_DFL); kill(getpid(), sig);" at the end of the handler should work. Example here: http://www.alexonlinux.com/how-to-handle-sigsegv-but-also-generate-core-dump – Todd Gamblin Jul 24 '15 at 08:22
  • @tgamblin: I tried this technique, but it points to a completely wrong location when I hit a crash! :( My backtrace() works fine, though; it points to the exact location. But I need a coredump with the exact information along with the backtrace printed into my logs. And what disadvantages would my program end up with by using signal(signum, SIG_DFL)? Thank you for your answer. – uss Jul 25 '15 at 12:33
  • Is there an alternative for windows? – Jack Feb 06 '16 at 20:29
  • `*(int*)0=1;` would suffice to generate a segfault instead of 2 lines – gjois Nov 17 '16 at 03:57
  • @sree for SIGSEGV, SIGILL, SIGFPE and SIGBUS, just returning from the signal handler will re-raise the signal at the original location, thus giving you a decent core dump. For SIGABRT the same will happen if and only if you received it as a result of `abort()`. As another option: try registering your signal handler with `sigaction` _without_ using `SA_NODEFER` (`signal` is usually equivalent to `SA_RESETHAND | SA_NODEFER`). The signal will then only be delivered _after_ returning from the signal handler. – Giel Jun 07 '18 at 14:41
  • `fprintf(stderr, "Error: signal %d:\n", sig);` in a `SIGSEGV` handler will deadlock if `fprintf()` calls `malloc()` or `free()` and the `SIGSEGV` occurred while also in a `malloc()` or `free()` (or similar) call - which is quite common. So no, it doesn't really "work in practice". – Andrew Henle Dec 19 '19 at 10:04
  • 1
    In addition to using `-rdynamic`, also check that your build system doesn't add `-fvisibility=hidden` option! (as it will completely discard the effect of `-rdynamic`) – Dima Litvinov May 13 '20 at 00:42
  • @ToddGamblin *True! But this is a handler to call when your application crashes.* No, this is a handler that gets called when your application is ***about to crash***. It hasn't crashed yet, but the most likely failure this crap code will intercept is a `SIGSEGV` caused by a corrupted heap, which will likely be discovered in a call to something like `malloc()` or `free()`. And since this code depends on code that will call `malloc()`, ***it's going to deadlock and leave your application stuck where it will NEVER exit***. – Andrew Henle Jan 21 '21 at 20:49
174

It's even easier than "man backtrace": there's a little-documented library (GNU specific) distributed with glibc as libSegFault.so, which I believe was written by Ulrich Drepper to support the program catchsegv (see "man catchsegv").

This gives us 3 possibilities. Instead of running "program -o hai":

  1. Run within catchsegv:

    $ catchsegv program -o hai
    
  2. Link with libSegFault at runtime:

    $ LD_PRELOAD=/lib/libSegFault.so program -o hai
    
  3. Link with libSegFault at compile time:

    $ gcc -g1 -lSegFault -o program program.cc
    $ program -o hai
    

In all 3 cases, you will get clearer backtraces with less optimization (gcc -O0 or -O1) and debugging symbols (gcc -g). Otherwise, you may just end up with a pile of memory addresses.

You can also catch more signals for stack traces with something like:

$ export SEGFAULT_SIGNALS="all"       # "all" signals
$ export SEGFAULT_SIGNALS="bus abrt"  # SIGBUS and SIGABRT

The output will look something like this (notice the backtrace at the bottom):

*** Segmentation fault Register dump:

 EAX: 0000000c   EBX: 00000080   ECX:
00000000   EDX: 0000000c  ESI:
bfdbf080   EDI: 080497e0   EBP:
bfdbee38   ESP: bfdbee20

 EIP: 0805640f   EFLAGS: 00010282

 CS: 0073   DS: 007b   ES: 007b   FS:
0000   GS: 0033   SS: 007b

 Trap: 0000000e   Error: 00000004  
OldMask: 00000000  ESP/signal:
bfdbee20   CR2: 00000024

 FPUCW: ffff037f   FPUSW: ffff0000  
TAG: ffffffff  IPOFF: 00000000  
CSSEL: 0000   DATAOFF: 00000000  
DATASEL: 0000

 ST(0) 0000 0000000000000000   ST(1)
0000 0000000000000000  ST(2) 0000
0000000000000000   ST(3) 0000
0000000000000000  ST(4) 0000
0000000000000000   ST(5) 0000
0000000000000000  ST(6) 0000
0000000000000000   ST(7) 0000
0000000000000000

Backtrace:
/lib/libSegFault.so[0xb7f9e100]
??:0(??)[0xb7fa3400]
/usr/include/c++/4.3/bits/stl_queue.h:226(_ZNSt5queueISsSt5dequeISsSaISsEEE4pushERKSs)[0x805647a]
/home/dbingham/src/middle-earth-mud/alpha6/src/engine/player.cpp:73(_ZN6Player5inputESs)[0x805377c]
/home/dbingham/src/middle-earth-mud/alpha6/src/engine/socket.cpp:159(_ZN6Socket4ReadEv)[0x8050698]
/home/dbingham/src/middle-earth-mud/alpha6/src/engine/socket.cpp:413(_ZN12ServerSocket4ReadEv)[0x80507ad]
/home/dbingham/src/middle-earth-mud/alpha6/src/engine/socket.cpp:300(_ZN12ServerSocket4pollEv)[0x8050b44]
/home/dbingham/src/middle-earth-mud/alpha6/src/engine/main.cpp:34(main)[0x8049a72]
/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe5)[0xb7d1b775]
/build/buildd/glibc-2.9/csu/../sysdeps/i386/elf/start.S:122(_start)[0x8049801]

If you want to know the gory details, the best source is unfortunately the source: See http://sourceware.org/git/?p=glibc.git;a=blob;f=debug/segfault.c and its parent directory http://sourceware.org/git/?p=glibc.git;a=tree;f=debug

dequis
jhclark
  • 1
    "Possibility 3. Link with libSegFault at compile time" does not work. – HHK Jan 23 '13 at 18:05
  • 5
    @crafter: What do you mean "does not work". What have you tried, on what language/compiler/toolchain/distribution/hardware ? Did it fail to compile ? To catch error ? To produce output at all ? To produce hard-to-use output ? Thank you for details it will help everyone. – Stéphane Gourichon Mar 31 '14 at 09:33
  • 2
    'best source is unfortunately the source' ... Hopefully, some day, the man page for catchsegv will actually mention SEGFAULT_SIGNALS. Until then, there's this answer to refer to. – greggo Jul 03 '14 at 16:06
  • I can't believe I've been programming C for 5 years and never heard of this :/ – DavidMFrey Mar 16 '16 at 12:44
  • 13
    @StéphaneGourichon @HansKratz To link with libSegFault you'll have to add `-Wl,--no-as-needed` to the compiler flags. Otherwise, `ld` will indeed *not* link against `libSegFault`, because it recognizes that the binary doesn't use any of its symbols. – Phillip Jul 28 '16 at 08:49
  • Does using this impact performance? – user2233706 Mar 14 '22 at 19:27
  • 4
    `catchsegv` and `libSegFault` have both been removed in `glibc` 2.35: https://savannah.gnu.org/forum/forum.php?forum_id=10111 It looks like `gdb -ex=r --args ` may be a partial substitute. – Will Chen Oct 08 '22 at 17:12
130

Linux

While the use of the backtrace() functions in execinfo.h to print a stacktrace and exit gracefully when you get a segmentation fault has already been suggested, I see no mention of the intricacies necessary to ensure the resulting backtrace points to the actual location of the fault (at least for some architectures - x86 & ARM).

The first two entries in the stack frame chain when you get into the signal handler contain a return address inside the signal handler and one inside sigaction() in libc. The stack frame of the last function called before the signal (which is the location of the fault) is lost.

Code

#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#ifndef __USE_GNU
#define __USE_GNU
#endif

#include <execinfo.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ucontext.h>
#include <unistd.h>

/* This structure mirrors the one found in /usr/include/asm/ucontext.h */
typedef struct _sig_ucontext {
 unsigned long     uc_flags;
 ucontext_t        *uc_link;
 stack_t           uc_stack;
 sigcontext_t      uc_mcontext;
 sigset_t          uc_sigmask;
} sig_ucontext_t;

void crit_err_hdlr(int sig_num, siginfo_t * info, void * ucontext)
{
 void *             array[50];
 void *             caller_address;
 char **            messages;
 int                size, i;
 sig_ucontext_t *   uc;

 uc = (sig_ucontext_t *)ucontext;

 /* Get the address at the time the signal was raised */
#if defined(__i386__) // gcc specific
 caller_address = (void *) uc->uc_mcontext.eip; // EIP: x86 specific
#elif defined(__x86_64__) // gcc specific
 caller_address = (void *) uc->uc_mcontext.rip; // RIP: x86_64 specific
#else
#error Unsupported architecture. // TODO: Add support for other arch.
#endif

 fprintf(stderr, "signal %d (%s), address is %p from %p\n", 
  sig_num, strsignal(sig_num), info->si_addr, 
  (void *)caller_address);

 size = backtrace(array, 50);

 /* overwrite sigaction with caller's address */
 array[1] = caller_address;

 messages = backtrace_symbols(array, size);

 /* skip first stack frame (points here) */
 for (i = 1; i < size && messages != NULL; ++i)
 {
  fprintf(stderr, "[bt]: (%d) %s\n", i, messages[i]);
 }

 free(messages);

 exit(EXIT_FAILURE);
}

int crash()
{
 char * p = NULL;
 *p = 0;
 return 0;
}

int foo4()
{
 crash();
 return 0;
}

int foo3()
{
 foo4();
 return 0;
}

int foo2()
{
 foo3();
 return 0;
}

int foo1()
{
 foo2();
 return 0;
}

int main(int argc, char ** argv)
{
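 struct sigaction sigact;

 memset(&sigact, 0, sizeof(sigact));  /* zero every field before use */
 sigemptyset(&sigact.sa_mask);        /* block no additional signals in the handler */
 sigact.sa_sigaction = crit_err_hdlr;
 sigact.sa_flags = SA_RESTART | SA_SIGINFO;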

 if (sigaction(SIGSEGV, &sigact, (struct sigaction *)NULL) != 0)
 {
  fprintf(stderr, "error setting signal handler for %d (%s)\n",
    SIGSEGV, strsignal(SIGSEGV));

  exit(EXIT_FAILURE);
 }

 foo1();

 exit(EXIT_SUCCESS);
}

Output

signal 11 (Segmentation fault), address is (nil) from 0x8c50
[bt]: (1) ./test(crash+0x24) [0x8c50]
[bt]: (2) ./test(foo4+0x10) [0x8c70]
[bt]: (3) ./test(foo3+0x10) [0x8c8c]
[bt]: (4) ./test(foo2+0x10) [0x8ca8]
[bt]: (5) ./test(foo1+0x10) [0x8cc4]
[bt]: (6) ./test(main+0x74) [0x8d44]
[bt]: (7) /lib/libc.so.6(__libc_start_main+0xa8) [0x40032e44]

All the hazards of calling the backtrace() functions in a signal handler still exist and should not be overlooked, but I find the functionality I described here quite helpful in debugging crashes.

It is important to note that the example I provided is developed/tested on Linux for x86. I have also successfully implemented this on ARM using uc_mcontext.arm_pc instead of uc_mcontext.eip.

Here's a link to the article where I learned the details for this implementation: http://www.linuxjournal.com/article/6391
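As the comments on this answer note, newer glibc versions expose the saved registers through uc_mcontext.gregs[] rather than a field named eip. Below is a minimal sketch of reading the interrupted program counter that way, assuming Linux with glibc on x86, x86_64 or AArch64. The function names fault_pc_from_context and probe_pc_via_signal are invented for this example, and it uses a harmless signal instead of a real crash so it can be run safely.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   // needed for REG_RIP / REG_EIP on glibc
#endif
#include <signal.h>
#include <ucontext.h>
#include <cassert>
#include <cstring>

// Returns the program counter saved in the signal context, or nullptr on
// platforms this sketch does not cover.
void* fault_pc_from_context(void* raw_context) {
    ucontext_t* uc = static_cast<ucontext_t*>(raw_context);
#if defined(__linux__) && defined(__x86_64__)
    return reinterpret_cast<void*>(uc->uc_mcontext.gregs[REG_RIP]);
#elif defined(__linux__) && defined(__i386__)
    return reinterpret_cast<void*>(uc->uc_mcontext.gregs[REG_EIP]);
#elif defined(__linux__) && defined(__aarch64__)
    return reinterpret_cast<void*>(uc->uc_mcontext.pc);
#else
    (void)uc;
    return nullptr;
#endif
}

static void* volatile g_captured_pc = nullptr;

extern "C" void record_pc_handler(int, siginfo_t*, void* context) {
    g_captured_pc = fault_pc_from_context(context);
}

// Installs a temporary SA_SIGINFO handler, raises the signal once, and
// returns the program counter the handler saw.
void* probe_pc_via_signal(int sig) {
    assert(sig > 0);
    struct sigaction sa, old_sa;
    std::memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = record_pc_handler;
    sa.sa_flags = SA_SIGINFO;

    g_captured_pc = nullptr;
    if (sigaction(sig, &sa, &old_sa) != 0) return nullptr;
    raise(sig);                        // handler runs before raise() returns
    sigaction(sig, &old_sa, nullptr);  // restore the previous disposition
    return g_captured_pc;
}
```

In a real crash handler, the value from fault_pc_from_context() would play the role of caller_address in the code above.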

Étienne
jschmier
  • 11
    On systems using GNU ld, remember to compile with `-rdynamic` to instruct the linker to add all symbols, not only used ones, to the dynamic symbol table. This allows `backtrace_symbols()` to convert addresses to function names – jschmier Mar 26 '10 at 20:00
  • The output in the example above was taken from a test program compiled using a gcc-3.4.5-glibc-2.3.6 cross-toolchain and executed on an ARMv6-based platform running Linux Kernel 2.6.22. – jschmier May 24 '10 at 23:54
  • enabling backtrace support is only meaningful when compiling for the Thumb mode in ARM – manav m-n Oct 17 '11 at 06:58
  • 2
    Also, you need to add "-mapcs-frame" option to GCC''s command line to generate stack frames on ARM platform – qehgt Feb 01 '12 at 15:53
  • 3
    This may be too late but can we use `addr2line` command somehow to get the exact line where the crash occurred? – enthusiasticgeek Oct 24 '12 at 18:26
  • 5
    On more recent builds of `glibc` `uc_mcontext` does not contain a field named `eip`. There is now an array that needs to be indexed, `uc_mcontext.gregs[REG_EIP]` is the equivalent. – mmlb Dec 14 '12 at 14:57
  • @enthusiasticgeek, I created a little bash script to feed the output of jschmier's 2nd answer into the addr2line utility. Thanks for drawing my attention to that tool! See: http://stackoverflow.com/a/15801966/1797414 – arr_sea Apr 04 '13 at 03:26
  • 7
    For ARM, my backtraces always had depth 1 until I added the -funwind-tables option to the compiler. – jfritz42 Apr 10 '13 at 16:10
  • 1
    @letmaik, it's been a while since I took a crack at this, but in x86_64, the instruction pointer is RIP, not EIP. Perhaps you need to index the `uc_mcontext` array as `uc_mcontext.gregs[REG_RIP]`. – jschmier Jan 18 '17 at 23:45
  • 1
    `struct sigcontext` cannot be found by macOS 10.14, which is POSIX-compliant, so this is a glibc-specific program. Also, on Ubuntu 16.04 (Linux 4.4.0 as kernel), the line containing the symbol of `crash()` appears twice, instead of only once as displayed in the answer. – Leedehai Oct 10 '18 at 07:02
  • On x86_64 (Ubuntu 18.04) This statement does not appear to apply? `The stack frame of the last function called before the signal (which is the location of the fault) is lost.` Without injecting the pointer obtained through `uc_mcontext`, the location of the segfault is shown as the fourth entry in the backtrace. – Steven Lu Nov 21 '19 at 00:41
95

Even though a correct answer has been provided that describes how to use the GNU libc backtrace() function [1] and I provided my own answer that describes how to ensure a backtrace from a signal handler points to the actual location of the fault [2], I don't see any mention of demangling C++ symbols output from the backtrace.

When obtaining backtraces from a C++ program, the output can be run through c++filt [1] to demangle the symbols, or the symbols can be demangled directly with abi::__cxa_demangle [1].

  • [1] Linux & OS X. Note that c++filt and __cxa_demangle are GCC specific.
  • [2] Linux

The following C++ Linux example uses the same signal handler as my other answer and demonstrates how c++filt can be used to demangle the symbols.

Code:

class foo
{
public:
    foo() { foo1(); }

private:
    void foo1() { foo2(); }
    void foo2() { foo3(); }
    void foo3() { foo4(); }
    void foo4() { crash(); }
    void crash() { char * p = NULL; *p = 0; }
};

int main(int argc, char ** argv)
{
    // Setup signal handler for SIGSEGV
    ...

    foo * f = new foo();
    return 0;
}

Output (./test):

signal 11 (Segmentation fault), address is (nil) from 0x8048e07
[bt]: (1) ./test(crash__3foo+0x13) [0x8048e07]
[bt]: (2) ./test(foo4__3foo+0x12) [0x8048dee]
[bt]: (3) ./test(foo3__3foo+0x12) [0x8048dd6]
[bt]: (4) ./test(foo2__3foo+0x12) [0x8048dbe]
[bt]: (5) ./test(foo1__3foo+0x12) [0x8048da6]
[bt]: (6) ./test(__3foo+0x12) [0x8048d8e]
[bt]: (7) ./test(main+0xe0) [0x8048d18]
[bt]: (8) ./test(__libc_start_main+0x95) [0x42017589]
[bt]: (9) ./test(__register_frame_info+0x3d) [0x8048981]

Demangled Output (./test 2>&1 | c++filt):

signal 11 (Segmentation fault), address is (nil) from 0x8048e07
[bt]: (1) ./test(foo::crash(void)+0x13) [0x8048e07]
[bt]: (2) ./test(foo::foo4(void)+0x12) [0x8048dee]
[bt]: (3) ./test(foo::foo3(void)+0x12) [0x8048dd6]
[bt]: (4) ./test(foo::foo2(void)+0x12) [0x8048dbe]
[bt]: (5) ./test(foo::foo1(void)+0x12) [0x8048da6]
[bt]: (6) ./test(foo::foo(void)+0x12) [0x8048d8e]
[bt]: (7) ./test(main+0xe0) [0x8048d18]
[bt]: (8) ./test(__libc_start_main+0x95) [0x42017589]
[bt]: (9) ./test(__register_frame_info+0x3d) [0x8048981]

The following builds on the signal handler from my original answer and can replace the signal handler in the above example to demonstrate how abi::__cxa_demangle can be used to demangle the symbols. This signal handler produces the same demangled output as the above example.

Code:

#include <cxxabi.h>   // abi::__cxa_demangle
#include <iostream>   // std::cerr

void crit_err_hdlr(int sig_num, siginfo_t * info, void * ucontext)
{
    sig_ucontext_t * uc = (sig_ucontext_t *)ucontext;

    void * caller_address = (void *) uc->uc_mcontext.eip; // x86 specific

    std::cerr << "signal " << sig_num 
              << " (" << strsignal(sig_num) << "), address is " 
              << info->si_addr << " from " << caller_address 
              << std::endl << std::endl;

    void * array[50];
    int size = backtrace(array, 50);

    array[1] = caller_address;

    char ** messages = backtrace_symbols(array, size);    

    // skip first stack frame (points here)
    for (int i = 1; i < size && messages != NULL; ++i)
    {
        char *mangled_name = 0, *offset_begin = 0, *offset_end = 0;

        // find parentheses and +address offset surrounding mangled name
        for (char *p = messages[i]; *p; ++p)
        {
            if (*p == '(') 
            {
                mangled_name = p; 
            }
            else if (*p == '+') 
            {
                offset_begin = p;
            }
            else if (*p == ')')
            {
                offset_end = p;
                break;
            }
        }

        // if the line could be processed, attempt to demangle the symbol
        if (mangled_name && offset_begin && offset_end && 
            mangled_name < offset_begin)
        {
            *mangled_name++ = '\0';
            *offset_begin++ = '\0';
            *offset_end++ = '\0';

            int status;
            char * real_name = abi::__cxa_demangle(mangled_name, 0, 0, &status);

            // if demangling is successful, output the demangled function name
            if (status == 0)
            {    
                std::cerr << "[bt]: (" << i << ") " << messages[i] << " : " 
                          << real_name << "+" << offset_begin << offset_end 
                          << std::endl;

            }
            // otherwise, output the mangled function name
            else
            {
                std::cerr << "[bt]: (" << i << ") " << messages[i] << " : " 
                          << mangled_name << "+" << offset_begin << offset_end 
                          << std::endl;
            }
            free(real_name);
        }
        // otherwise, print the whole line
        else
        {
            std::cerr << "[bt]: (" << i << ") " << messages[i] << std::endl;
        }
    }
    std::cerr << std::endl;

    free(messages);

    exit(EXIT_FAILURE);
}
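The demangling step on its own can be sketched as a small helper. This is only an illustration: demangle_or_keep is a made-up name, and the sample symbol uses the Itanium C++ ABI mangling produced by current GCC rather than the older scheme visible in the output above. Since abi::__cxa_demangle allocates memory, a helper like this is better suited to post-processing a saved trace than to running inside a signal handler.

```cpp
#include <cxxabi.h>
#include <cassert>
#include <cstdlib>
#include <memory>
#include <string>

// Returns the demangled form of `symbol`, or the input unchanged when it is
// not a valid mangled name.
std::string demangle_or_keep(const char* symbol) {
    assert(symbol != nullptr);
    int status = 0;
    // __cxa_demangle returns a malloc()ed buffer, so release it with free().
    std::unique_ptr<char, void (*)(void*)> demangled(
        abi::__cxa_demangle(symbol, nullptr, nullptr, &status), std::free);
    if (status == 0 && demangled) {
        return std::string(demangled.get());
    }
    return std::string(symbol);
}
```

For example, demangle_or_keep("_ZN3foo5crashEv") gives "foo::crash()".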
Community
jschmier
  • 1
    Thank you for this, jschmier. I created a little bash script to feed the output of this into the addr2line utility. See: stackoverflow.com/a/15801966/1797414 – arr_sea Apr 05 '13 at 19:02
  • 6
    Don't forget to #include – Bamaco Jul 07 '14 at 19:52
  • 1
    Good documentation, and a straightforward header file has been posted here since 2008... http://panthema.net/2008/0901-stacktrace-demangled/ very similar to your approach :) – Kevin Oct 23 '14 at 20:25
  • abi::__cxa_demangle seems to be not the async-signal-safe, so the signal handler can deadlock somewhere in malloc. – orcy Nov 27 '15 at 06:41
  • 2
    The use of `std::cerr`, `free()` and `exit()` all violate restrictions against calling non-async-signal-safe calls on POSIX systems. **This code will deadlock if your process fails in any call such as `free()`, `malloc()`, `new`, or `delete`.** – Andrew Henle Feb 24 '20 at 17:37
  • @AndrewHenle That's great! Now what do we do about that? – Andrew Nov 10 '20 at 06:47
  • I tried this anyways and the line numbers appear to be in hexadecimal but then converting them results in like 28k type numbers which are way too big for my files so I quit this approach... – Andrew Nov 10 '20 at 07:27
  • @Andrew *That's great! Now what do we do about that?* Doctor it hurts when I do this? So don't do that! **Don't use non-async-signal-safe calls such as `std::cerr`, `free()`, or `exit()`.** This isn't hard. Or even `backtrace_symbols()`. Use low-level `write()` to `STDERR_FILENO`. The [`backtrace_symbols()` man page](https://man7.org/linux/man-pages/man3/backtrace.3.html) even states "`backtrace_symbols_fd()` does not call malloc(3), and so can be employed in situations where the latter function might fail." – Andrew Henle Nov 10 '20 at 10:27
  • @AndrewHenle I think you're slightly losing track of the intended use of this code - it's to print a stack when a program crashes, so if in certain circumstances it might deadlock the program that isn't a big deal as the program has already failed. On a practical level, I'd suggest putting out some sort of message before making calls that might deadlock so that the developer knows what is happening (which the above code does do, I think). – Mike Moreton Feb 09 '21 at 08:20
  • @MikeMoreton *so if in certain circumstances it might deadlock the program that isn't a big deal as the program has already failed* Wonderful - now your critical process hangs and doesn't get restarted. Wow, that's robust. You're advocating an awfully low standard of reliability. Every single time you write code like that, you've reduced the reliability of your system. When you write crap code like that over thousands of lines of code, you get a crap product. But hey, if that's good enough for you... – Andrew Henle Feb 09 '21 at 10:06
34

Might be worth looking at Google Breakpad, a cross-platform crash dump generator and tools to process the dumps.

Simon Steele
  • It reports on stuff like segmentation faults, but it doesn't report any info on unhandled C++ exceptions. – DBedrenko Aug 02 '16 at 11:15
22

You did not specify your operating system, so this is difficult to answer. If you are using a system based on GNU libc, you might be able to use the libc function backtrace().

GCC also has two builtins that can assist you, but which may or may not be implemented fully on your architecture: __builtin_frame_address and __builtin_return_address. Both take an immediate integer level (by immediate, I mean it can't be a variable). If __builtin_frame_address for a given level is non-zero, it should be safe to grab the return address of the same level.
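A minimal sketch of the two builtins, assuming GCC or Clang. The FrameInfo struct and both function names are invented for illustration. Only level 0 (the current function) is used here, because higher levels are the ones that may not work on every architecture.

```cpp
#include <cassert>

struct FrameInfo {
    void* frame;
    void* return_address;
};

// Level 0 means "the current function". The level must be a literal constant.
__attribute__((noinline)) FrameInfo inspect_current_frame() {
    FrameInfo info;
    info.frame = __builtin_frame_address(0);
    // Only ask for the return address when the frame address is non-null.
    info.return_address = info.frame ? __builtin_return_address(0) : nullptr;
    return info;
}

// Hypothetical usage: for level 0 both values are expected to be set.
void check_current_frame() {
    const FrameInfo info = inspect_current_frame();
    assert(info.frame != nullptr);
    assert(info.return_address != nullptr);
}
```

The return address can then be fed to a symbolizer such as addr2line to find the calling function.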

user
Brian Mitchell
14

Thank you to enthusiasticgeek for drawing my attention to the addr2line utility.

I've written a quick and dirty script to process the output of jschmier's answer (many thanks to jschmier!) using the addr2line utility.

The script accepts a single argument: The name of the file containing the output from jschmier's utility.

The output should print something like the following for each level of the trace:

BACKTRACE:  testExe 0x8A5db6b
FILE:       pathToFile/testExe.C:110
FUNCTION:   testFunction(int) 
   107  
   108           
   109           int* i = 0x0;
  *110           *i = 5;
   111      
   112        }
   113        return i;

Code:

#!/bin/bash

LOGFILE=$1

NUM_SRC_CONTEXT_LINES=3

old_IFS=$IFS  # save the field separator           
IFS=$'\n'     # new field separator, the end of line           

for bt in `cat $LOGFILE | grep '\[bt\]'`; do
   IFS=$old_IFS     # restore default field separator 
   printf '\n'
   EXEC=`echo $bt | cut -d' ' -f3 | cut -d'(' -f1`  
   ADDR=`echo $bt | cut -d'[' -f3 | cut -d']' -f1`
   echo "BACKTRACE:  $EXEC $ADDR"
   A2L=`addr2line -a $ADDR -e $EXEC -pfC`
   #echo "A2L:        $A2L"

   FUNCTION=`echo $A2L | sed 's/\<at\>.*//' | cut -d' ' -f2-99`
   FILE_AND_LINE=`echo $A2L | sed 's/.* at //'`
   echo "FILE:       $FILE_AND_LINE"
   echo "FUNCTION:   $FUNCTION"

   # print offending source code
   SRCFILE=`echo $FILE_AND_LINE | cut -d':' -f1`
   LINENUM=`echo $FILE_AND_LINE | cut -d':' -f2`
   if ([ -f $SRCFILE ]); then
      cat -n $SRCFILE | grep -C $NUM_SRC_CONTEXT_LINES "^ *$LINENUM\>" | sed "s/ $LINENUM/*$LINENUM/"
   else
      echo "File not found: $SRCFILE"
   fi
   IFS=$'\n'     # new field separator, the end of line           
done

IFS=$old_IFS     # restore default field separator 
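For anyone who would rather do the same parsing inside the program that reads the saved trace, here is a rough C++ sketch of what the cut pipeline above extracts from each "[bt]" line. BtEntry, parse_bt_line and addr2line_command are hypothetical names, and the addr2line flags are simply the ones the script already uses.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct BtEntry {
    std::string executable;  // e.g. "./test"
    std::string address;     // e.g. "0x8c50"
    bool ok;
};

// Parses a line such as "[bt]: (1) ./test(crash+0x24) [0x8c50]".
BtEntry parse_bt_line(const std::string& line) {
    BtEntry entry{"", "", false};
    if (line.compare(0, 5, "[bt]:") != 0) return entry;

    // Third space-separated field, e.g. "./test(crash+0x24)".
    const std::size_t first_space = line.find(' ');
    if (first_space == std::string::npos) return entry;
    const std::size_t second_space = line.find(' ', first_space + 1);
    if (second_space == std::string::npos) return entry;
    const std::size_t field_end = line.find(' ', second_space + 1);
    const std::string field = line.substr(
        second_space + 1,
        field_end == std::string::npos ? std::string::npos
                                       : field_end - second_space - 1);
    entry.executable = field.substr(0, field.find('('));

    // Address between the last '[' and the ']' that follows it.
    const std::size_t lb = line.rfind('[');
    if (lb == std::string::npos || lb == 0) return entry;  // only "[bt]" found
    const std::size_t rb = line.find(']', lb);
    if (rb == std::string::npos) return entry;
    entry.address = line.substr(lb + 1, rb - lb - 1);

    entry.ok = !entry.executable.empty() && !entry.address.empty();
    return entry;
}

// Builds the same addr2line invocation the script runs for one frame.
std::string addr2line_command(const BtEntry& entry) {
    assert(entry.ok);
    return "addr2line -a " + entry.address + " -e " + entry.executable + " -pfC";
}
```

The resulting command string could be passed to popen() or printed for manual use.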
Community
arr_sea
14

ulimit -c <value> sets the core file size limit on Unix. On many systems the default core file size limit is 0, which disables core dumps. You can see your ulimit values with ulimit -a.
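The same limit can also be raised from inside the program, which may help when users do not launch it from a shell you control. A minimal sketch using the POSIX getrlimit/setrlimit calls follows. The function name is made up, and it can only raise the soft limit as far as the hard limit allows.

```cpp
#include <sys/resource.h>
#include <cassert>

// Raises the soft core-file size limit to the hard limit for this process.
// Returns false if the limits could not be read or changed.
bool raise_core_limit_to_hard_max() {
    struct rlimit limit;
    if (getrlimit(RLIMIT_CORE, &limit) != 0) return false;

    limit.rlim_cur = limit.rlim_max;  // soft limit may go up to the hard limit
    if (setrlimit(RLIMIT_CORE, &limit) != 0) return false;

    // Debug-only sanity check that the change took effect.
    struct rlimit check;
    const int rc = getrlimit(RLIMIT_CORE, &check);
    assert(rc == 0 && check.rlim_cur == check.rlim_max);
    (void)rc;
    return true;
}
```

If the hard limit itself is 0, this succeeds but core dumps stay disabled, so the hard limit still has to be raised by whoever starts the process.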

Also, if you run your program from within gdb, it will halt your program on "segmentation violations" (SIGSEGV, generally raised when you access memory that you are not allowed to access), and you can also set breakpoints.

ddd and nemiver are front-ends for gdb which make working with it much easier for the novice.

user
  • 6
    Core dumps are infinitely more useful than stack traces because you can load the core dump in the debugger and see the state of the whole program and its data at the point of the crash. – Adam Hawes Feb 04 '09 at 13:07
  • 1
    The backtrace facility that others have suggested is probably better than nothing, but it is very basic -- it doesn't even give line numbers. Using core dumps, on the other hand, lets you retroactively view the entire state of your application at the time it crashed (including a detailed stack trace). There *might* be practical issues with trying to use this for field debugging, but it is definitely a more powerful tool for analyzing crashes and asserts during development (at least on Linux). – Brent Bradburn Oct 26 '10 at 13:36
13

It looks like one of the recent Boost releases added a library that provides exactly what you want, and the code should be portable across platforms. It is boost::stacktrace, which you can use as in the Boost sample:

#include <filesystem>
#include <sstream>
#include <fstream>
#include <signal.h>     // ::signal, ::raise
#include <boost/stacktrace.hpp>

const char* backtraceFileName = "./backtraceFile.dump";

void signalHandler(int)
{
    ::signal(SIGSEGV, SIG_DFL);
    ::signal(SIGABRT, SIG_DFL);
    boost::stacktrace::safe_dump_to(backtraceFileName);
    ::raise(SIGABRT);
}

void sendReport()
{
    if (std::filesystem::exists(backtraceFileName))
    {
        std::ifstream file(backtraceFileName);

        auto st = boost::stacktrace::stacktrace::from_dump(file);
        std::ostringstream backtraceStream;
        backtraceStream << st << std::endl;

        // sending the code from st

        file.close();
        std::filesystem::remove(backtraceFileName);
    }
}

int main()
{
    ::signal(SIGSEGV, signalHandler);
    ::signal(SIGABRT, signalHandler);

    sendReport();
    // ... rest of code
}

On Linux, compile the code above with:

g++ --std=c++17 file.cpp -lstdc++fs -lboost_stacktrace_backtrace -ldl -lbacktrace

Example backtrace copied from boost documentation:

0# bar(int) at /path/to/source/file.cpp:70
1# bar(int) at /path/to/source/file.cpp:70
2# bar(int) at /path/to/source/file.cpp:70
3# bar(int) at /path/to/source/file.cpp:70
4# main at /path/to/main.cpp:93
5# __libc_start_main in /lib/x86_64-linux-gnu/libc.so.6
6# _start
baziorek
  • 2,502
  • 2
  • 29
  • 43
13

It's important to note that once you generate a core file you'll need to use the gdb tool to look at it. For gdb to make sense of your core file, you must tell gcc to instrument the binary with debugging symbols: to do this, you compile with the -g flag:

$ g++ -g prog.cpp -o prog

Then, you can either set "ulimit -c unlimited" to let it dump a core, or just run your program inside gdb. I like the second approach more:

$ gdb ./prog
... gdb startup output ...
(gdb) run
... program runs and crashes ...
(gdb) where
... gdb outputs your stack trace ...

I hope this helps.

Benson
  • 22,457
  • 2
  • 40
  • 49
  • 5
    You can also call `gdb` right from your crashing program. Set up a handler for SIGSEGV, SIGILL, SIGBUS and SIGFPE that will call gdb. Details: http://stackoverflow.com/questions/3151779/how-its-better-to-invoke-gdb-from-program-to-print-its-stacktrace The advantage is that you get a beautiful, annotated backtrace like in `bt full`, and you can also get stack traces of all threads. – Vi. Jun 30 '10 at 22:20
  • You can also get a backtrace more easily than in the answer: gdb -silent ./prog core --eval-command=backtrace --batch - it shows the backtrace and closes the debugger – baziorek Jan 02 '19 at 08:00
11

The new king in town has arrived https://github.com/bombela/backward-cpp

1 header to place in your code and 1 library to install.

Personally I call it using this function

#include "backward.hpp"

void stacker() {
    using namespace backward;
    StackTrace st;

    st.load_here(99);     // Limit the trace depth to 99 frames
    st.skip_n_firsts(3);  // Skip some backward-internal functions in the trace

    Printer p;
    p.snippet = true;
    p.object = true;
    p.color = true;
    p.address = true;
    p.print(st, stderr);
}
Roy
  • 322
  • 3
  • 6
  • 1
    Wow! That's finally how it should be done! I have just dumped my own solution in favor of this one. – tglas Dec 01 '19 at 20:09
  • 1
    I don't see how this could solve the issue. You have to call it in the same place where the exception is thrown, by catching it and rethrowing it after using this library (as their examples show). Please correct me if I'm wrong, but this isn't useful in the case of program crashes. – Mazen Ak Nov 09 '21 at 09:03
  • 1
    @MazenAk you can install a signal handler that catches SIGSEGV and SIGABRT; check out https://github.com/bombela/backward-cpp#signalhandling – Roy Nov 11 '21 at 13:15
  • Thanks man, I've been reading the README file over days and I didn't notice such part, will give it a try today. – Mazen Ak Nov 11 '21 at 13:35
11

I've been looking at this problem for a while.

Buried deep in the Google Performance Tools README

http://code.google.com/p/google-perftools/source/browse/trunk/README

is a mention of libunwind

http://www.nongnu.org/libunwind/

Would love to hear opinions of this library.

The problem with -rdynamic is that it can increase the size of the binary relatively significantly in some cases

Gregory
  • 1,479
  • 15
  • 22
  • 2
    On x86/64, I have not seen -rdynamic increase binary size much. Adding -g makes for a much bigger increase. – Dan Mar 24 '10 at 06:46
  • 1
    I noticed that libunwind does not have functionality to get the line number, and I guess (did not test) unw_get_proc_name returns the function symbol (which is obfuscated for overloading and such) instead of the original name. – Herbert Nov 24 '14 at 20:10
  • 1
    That's correct. It gets very tricky to do this correctly, but I've had excellent success with gaddr2line there is lots of practical information here http://blog.bigpixel.ro/2010/09/stack-unwinding-stack-trace-with-gcc/ – Gregory Nov 25 '14 at 20:53
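
The mangled-name issue raised in the comments above can usually be handled with the C++ ABI demangler that ships with GCC and Clang. The following is a minimal sketch, not something taken from libunwind itself; the helper name `demangle` is hypothetical. Note that it allocates memory, so it is only suitable outside a signal handler (for example when post-processing a saved trace).

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang C++ ABI)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper: returns the demangled form of `mangled`,
// or the input unchanged if demangling fails.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle mallocs the result,
    // which the caller must release with free().
    char* readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return mangled;
    }
    std::string result(readable);
    std::free(readable);
    return result;
}
```

A name such as the one returned by unw_get_proc_name could be passed through this helper before printing.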
10

You can use DeathHandler - a small C++ class which does everything for you, reliably.

markhor
  • 2,235
  • 21
  • 18
  • 2
    unfortunately it uses `execlp()` to perform addr2line calls... would be nice to fully stay in the own program (which is possible by including the addr2line code in some form) – example Aug 26 '14 at 14:39
10

Forget about changing your sources and hacking around with the backtrace() function or macros - these are just poor solutions.

As a properly working solution, I would advise:

  1. Compile your program with the "-g" flag to embed debug symbols in the binary (don't worry, this will not impact your performance).
  2. On Linux, run the following command: "ulimit -c unlimited" - to allow the system to write large crash dumps.
  3. When your program crashes, you will see a file named "core" in the working directory.
  4. Run the following command to print the backtrace to stdout: gdb -batch -ex "backtrace" ./your_program_exe ./core

This prints a readable backtrace of your program, with source file names and line numbers. Moreover, this approach gives you the freedom to automate your system: have a short script that checks whether the process created a core dump, and then sends the backtrace by email to the developers or logs it in some logging system.

loopzilla
  • 216
  • 5
  • 5
10

Some versions of libc contain functions that deal with stack traces; you might be able to use them:

http://www.gnu.org/software/libc/manual/html_node/Backtraces.html

I remember using libunwind a long time ago to get stack traces, but it may not be supported on your platform.

Stephen Deken
  • 3,665
  • 26
  • 31
8

As a Windows-only solution, you can get the equivalent of a stack trace (with much, much more information) using Windows Error Reporting. With just a few registry entries, it can be set up to collect user-mode dumps:

Starting with Windows Server 2008 and Windows Vista with Service Pack 1 (SP1), Windows Error Reporting (WER) can be configured so that full user-mode dumps are collected and stored locally after a user-mode application crashes. [...]

This feature is not enabled by default. Enabling the feature requires administrator privileges. To enable and configure the feature, use the following registry values under the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps key.

You can set the registry entries from your installer, which has the required privileges.

Creating a user-mode dump has the following advantages over generating a stack trace on the client:

  • It's already implemented in the system. You can either use WER as outlined above, or call MiniDumpWriteDump yourself, if you need more fine-grained control over the amount of information to dump. (Make sure to call it from a different process.)
  • Way more complete than a stack trace. Among others it can contain local variables, function arguments, stacks for other threads, loaded modules, and so on. The amount of data (and consequently size) is highly customizable.
  • No need to ship debug symbols. This both drastically decreases the size of your deployment, as well as makes it harder to reverse-engineer your application.
  • Largely independent of the compiler you use. Using WER does not even require any code. Either way, having a way to get a symbol database (PDB) is very useful for offline analysis. I believe GCC can either generate PDB's, or there are tools to convert the symbol database to the PDB format.

Take note, that WER can only be triggered by an application crash (i.e. the system terminating a process due to an unhandled exception). MiniDumpWriteDump can be called at any time. This may be helpful if you need to dump the current state to diagnose issues other than a crash.

Mandatory reading, if you want to evaluate the applicability of mini dumps:

Community
  • 1
  • 1
IInspectable
  • 46,945
  • 8
  • 85
  • 181
8
ulimit -c unlimited

sets a shell resource limit which allows a core dump to be created after your application crashes, in this case one of unlimited size. Look for a file called core in the very same directory. Make sure you compiled your code with debugging information enabled!

regards

mana
  • 6,347
  • 6
  • 50
  • 70
7

Look at:

man 3 backtrace

And:

#include <execinfo.h>
int backtrace(void **buffer, int size);

These are GNU extensions.
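
A minimal sketch of how these calls fit together, assuming glibc on Linux. The helper name `print_backtrace` and the 64-frame limit are illustrative choices, not part of the API.

```cpp
#include <execinfo.h>  // backtrace, backtrace_symbols_fd (GNU extensions)
#include <unistd.h>    // STDERR_FILENO

// Hypothetical helper: writes up to 64 frames of the current call stack
// to stderr and returns how many frames were captured.
int print_backtrace() {
    void* frames[64];
    // backtrace() fills the array with return addresses and returns the count.
    int count = backtrace(frames, 64);
    // backtrace_symbols_fd() writes one line per frame straight to the file
    // descriptor, so there is no malloc'ed string array to free afterwards.
    backtrace_symbols_fd(frames, count, STDERR_FILENO);
    return count;
}
```

Linking with -rdynamic is typically needed for the output to contain function names rather than bare addresses.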

Stéphane
  • 19,459
  • 24
  • 95
  • 136
  • 2
    There may be additional examples to help out on this page I created a while back: http://charette.no-ip.com:81/programming/2010-01-25_Backtrace/ – Stéphane Oct 10 '10 at 07:05
6

See the Stack Trace facility in ACE (ADAPTIVE Communication Environment). It's already written to cover all major platforms (and more). The library is BSD-style licensed so you can even copy/paste the code if you don't want to use ACE.

Adam Mitz
  • 6,025
  • 1
  • 29
  • 28
5

I can help with the Linux version: the functions backtrace, backtrace_symbols and backtrace_symbols_fd can be used. See the corresponding manual pages.

terminus
  • 13,745
  • 8
  • 34
  • 37
4

I have seen a lot of answers here performing a signal handler and then exiting. That's the way to go, but remember a very important fact: If you want to get the core dump for the generated error, you can't call exit(status). Call abort() instead!
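
A small sketch of the difference, assuming a POSIX system. The names `crash_handler` and `signal_that_killed_child` are hypothetical. The handler calls abort(), so the process still dies from a signal (SIGABRT) and the kernel can write a core file if the limits allow it; a handler that called exit(status) would end the process normally and no core would be produced. The crash is simulated in a forked child so the effect can be observed from the parent.

```cpp
#include <signal.h>    // sigaction, raise
#include <stdlib.h>    // abort
#include <sys/wait.h>  // waitpid, WIFSIGNALED, WTERMSIG
#include <unistd.h>    // fork, write, _exit

// Hypothetical crash handler: report, then abort() instead of exit().
extern "C" void crash_handler(int) {
    const char msg[] = "crashed, aborting\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)ignored;
    abort();  // terminates via SIGABRT, whose default action can dump core
}

// Runs a deliberately crashing child and returns the signal that
// terminated it, or -1 if it did not die from a signal.
int signal_that_killed_child() {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        struct sigaction sa = {};
        sa.sa_handler = crash_handler;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGSEGV, &sa, nullptr);
        raise(SIGSEGV);  // stand-in for a real crash
        _exit(0);        // not reached
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Calling signal_that_killed_child() from the parent reports SIGABRT, which shows that the child was terminated by a signal rather than exiting normally.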

jard18
  • 121
  • 1
  • 1
4

I found that @tgamblin's solution is not complete. It cannot handle a stack overflow. I think this is because, by default, the signal handler is called on the same (already exhausted) stack, so SIGSEGV is raised a second time. To protect against this, you need to register an independent stack for the signal handler.

You can check this with the code below. By default the handler fails. With the macro STACK_OVERFLOW defined, it works.

#include <iostream>
#include <execinfo.h>
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>
#include <string>
#include <cassert>

using namespace std;

//#define STACK_OVERFLOW

#ifdef STACK_OVERFLOW
static char stack_body[64*1024];
static stack_t sigseg_stack;
#endif

static struct sigaction sigseg_handler;

void handler(int sig) {
  cerr << "sig seg fault handler" << endl;
  const int asize = 10;
  void *array[asize];
  size_t size;

  // get void*'s for all entries on the stack
  size = backtrace(array, asize);

  // print out all the frames to stderr
  cerr << "stack trace: " << endl;
  backtrace_symbols_fd(array, size, STDERR_FILENO);
  cerr << "resend SIGSEGV to get core dump" << endl;
  signal(sig, SIG_DFL);
  kill(getpid(), sig);
}

void foo() {
  foo();
}

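int main(int argc, char **argv) {
#ifdef STACK_OVERFLOW
  // Run the handler on a separate stack, since the normal one is exhausted.
  sigseg_stack.ss_sp = stack_body;
  sigseg_stack.ss_flags = 0;  // SS_ONSTACK is only meaningful as a reported state; pass 0 when installing
  sigseg_stack.ss_size = sizeof(stack_body);
  // Keep the calls outside assert() so they still run when NDEBUG is defined.
  int stack_result = sigaltstack(&sigseg_stack, nullptr);
  assert(stack_result == 0);
  sigseg_handler.sa_flags = SA_ONSTACK;
#else
  sigseg_handler.sa_flags = SA_RESTART;
#endif
  sigseg_handler.sa_handler = &handler;
  int action_result = sigaction(SIGSEGV, &sigseg_handler, nullptr);
  assert(action_result == 0);
  cout << "sig action set" << endl;
  foo();
  return 0;
}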
Daniil Iaitskov
  • 5,525
  • 8
  • 39
  • 49
4

*nix: you can intercept SIGSEGV (usually this signal is raised before crashing) and save the info to a file (in addition to the core file, which you can debug using gdb, for example).

win: Check this from msdn.

You can also look at Google Chrome's code to see how it handles crashes. It has a nice exception handling mechanism.

INS
  • 10,594
  • 7
  • 58
  • 89
  • SEH does not help in producing a stack trace. While it could be part of a solution, that solution is harder to implement and provides less information at the expense of disclosing more information about your application than the *real* solution: Write a mini dump. And set up Windows to do this automatically for you. – IInspectable Feb 17 '18 at 14:28
3

If you still want to go it alone as I did you can link against bfd and avoid using addr2line as I have done here:

https://github.com/gnif/LookingGlass/blob/master/common/src/platform/linux/crash.c

This produces the output:

[E]        crash.linux.c:170  | crit_err_hdlr                  | ==== FATAL CRASH (a12-151-g28b12c85f4+1) ====
[E]        crash.linux.c:171  | crit_err_hdlr                  | signal 11 (Segmentation fault), address is (nil)
[E]        crash.linux.c:194  | crit_err_hdlr                  | [trace]: (0) /home/geoff/Projects/LookingGlass/client/src/main.c:936 (register_key_binds)
[E]        crash.linux.c:194  | crit_err_hdlr                  | [trace]: (1) /home/geoff/Projects/LookingGlass/client/src/main.c:1069 (run)
[E]        crash.linux.c:194  | crit_err_hdlr                  | [trace]: (2) /home/geoff/Projects/LookingGlass/client/src/main.c:1314 (main)
[E]        crash.linux.c:199  | crit_err_hdlr                  | [trace]: (3) /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xeb) [0x7f8aa65f809b]
[E]        crash.linux.c:199  | crit_err_hdlr                  | [trace]: (4) ./looking-glass-client(_start+0x2a) [0x55c70fc4aeca]
twodayslate
  • 2,803
  • 3
  • 27
  • 43
Geoffrey
  • 10,843
  • 3
  • 33
  • 46
3

I would use the code that generates a stack trace for leaked memory in Visual Leak Detector. This only works on Win32, though.

Jim Buck
  • 20,482
  • 11
  • 57
  • 74
  • And requires that you ship debug symbols with your code. In general not desirable. Write a mini dump and set up Windows to do it automatically for you on unhandled exceptions. – IInspectable Feb 17 '18 at 14:29
2

In addition to the above answers, here is how you make Debian Linux generate a core dump:

  1. Create a “coredumps” folder in the user's home folder
  2. Go to /etc/security/limits.conf. Below the ' ' line, type “* soft core unlimited”, and “root soft core unlimited” if enabling core dumps for root, to allow unlimited space for core dumps.
  3. NOTE: “* soft core unlimited” does not cover root, which is why root has to be specified in its own line.
  4. To check these values, log out, log back in, and type “ulimit -a”. “Core file size” should be set to unlimited.
  5. Check the .bashrc files (user, and root if applicable) to make sure that ulimit is not set there. Otherwise, the value above will be overwritten on startup.
  6. Open /etc/sysctl.conf. Enter the following at the bottom: “kernel.core_pattern = /home//coredumps/%e_%t.dump”. (%e will be the process name, and %t will be the system time)
  7. Exit and type “sysctl -p” to load the new configuration Check /proc/sys/kernel/core_pattern and verify that this matches what you just typed in.
  8. Core dumping can be tested by running a process on the command line (“ &”), and then killing it with “kill -11 ”. If core dumping is successful, you will see “(core dumped)” after the segmentation fault indication.
enthusiasticgeek
  • 2,640
  • 46
  • 53
2
gdb -ex 'set confirm off' -ex r -ex bt -ex q <my-program>
Oleksandr Kozlov
  • 697
  • 6
  • 11
1

On Linux/unix/MacOSX use core files (you can enable them with ulimit or compatible system call). On Windows use Microsoft error reporting (you can become a partner and get access to your application crash data).
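
The "compatible system call" on POSIX systems is setrlimit with RLIMIT_CORE. Below is a minimal sketch; the helper name `enable_core_dumps` is hypothetical. It raises the soft limit to the hard limit, which is the most an unprivileged process can do, so if the hard limit is 0 no core file will be written regardless.

```cpp
#include <sys/resource.h>  // getrlimit, setrlimit, RLIMIT_CORE

// Hypothetical helper: raises the soft core-file size limit to the hard
// limit for this process (the programmatic counterpart of `ulimit -c`).
// Returns true on success.
bool enable_core_dumps() {
    struct rlimit limit;
    if (getrlimit(RLIMIT_CORE, &limit) != 0) return false;
    // An unprivileged process cannot raise the soft limit above the hard limit.
    limit.rlim_cur = limit.rlim_max;
    return setrlimit(RLIMIT_CORE, &limit) == 0;
}
```

Calling this early in main() means users do not have to remember to run ulimit before starting the program.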

Kasprzol
  • 4,087
  • 22
  • 20
0

You are probably not going to like this. All I can say in its favour is that it works for me, and I have similar but not identical requirements: I am writing a compiler/transpiler for a 1970s Algol-like language which uses C as its output and then compiles the C, so that users are generally not aware of C being involved. Although you might call it a transpiler, it is effectively a compiler that uses C as its intermediate code.

The language being compiled has a history of providing good diagnostics and a full backtrace in the original native compilers. I've been able to find gcc compiler flags and libraries that allow me to trap most of the runtime errors that the original compilers did (with one glaring exception: unassigned variable trapping). When a runtime error occurs (e.g. arithmetic overflow, divide by zero, array index out of bounds), the original compilers output a backtrace to the console listing all variables in the stack frames of every active procedure call.

I struggled to get this effect in C, but eventually did so with what can only be described as a hack. When the program is invoked, the wrapper that supplies the C "main" looks at its argv, and if a special option is not present, it restarts itself under gdb with an altered argv containing both gdb options and the 'magic' option string for the program itself. This restarted version then hides those strings from the user's code by restoring the original arguments before calling the main block of the code written in our language. When an error occurs (as long as it is not one explicitly trapped within the program by user code), it exits to gdb, which prints the required backtrace.

Key lines of code in the startup sequence include:

  if ((argc >= 1) && (strcmp(origargv[argc-1], "--restarting-under-gdb")) != 0) {
    // initial invocation
    // the "--restarting-under-gdb" option is how the copy running under gdb knows
    // not to start another gdb process.

and

  char *gdb [] = {
    "/usr/bin/gdb", "-q", "-batch", "-nx", "-nh", "-return-child-result",
    "-ex", "run",
    "-ex", "bt full",
    "--args"
  };

The original arguments are appended to the gdb options above. That should be enough of a hint for you to do something similar for your own system. I did look at other library-supported backtrace options (eg libbacktrace, https://codingrelic.geekhold.com/2010/09/gcc-function-instrumentation.html, etc) but they only output the procedure call stack, not the local variables. However if anyone knows of any cleaner mechanism to get a similar effect, do please let us know. The main downside to this is that the variables are printed in C syntax, not the syntax of the language the user writes in. And (until I add suitable #line directives on every generated line of C :-() the backtrace lists the C source file and line numbers.
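
For readers who want to try the same trick, here is a rough sketch of how the restart command line could be assembled. The gdb options and the "--restarting-under-gdb" marker come from the fragments above; the helper names `running_under_gdb` and `build_gdb_command`, and the choice to place the marker last, are assumptions based on the check of the final argument shown earlier.

```cpp
#include <string>
#include <vector>

// Marker option taken from the answer above.
const char* const kRestartFlag = "--restarting-under-gdb";

// Hypothetical helper: true when this process is the copy already running
// under gdb, i.e. the marker is the last argument.
bool running_under_gdb(int argc, char** argv) {
    return argc >= 1 && std::string(argv[argc - 1]) == kRestartFlag;
}

// Hypothetical helper: gdb options, then the original command line,
// then the marker so the restarted copy does not start gdb again.
std::vector<std::string> build_gdb_command(int argc, char** argv) {
    std::vector<std::string> cmd = {
        "/usr/bin/gdb", "-q", "-batch", "-nx", "-nh", "-return-child-result",
        "-ex", "run",
        "-ex", "bt full",
        "--args"};
    for (int i = 0; i < argc; ++i) cmd.push_back(argv[i]);
    cmd.push_back(kRestartFlag);
    return cmd;
}
```

The resulting vector would then be converted to a null-terminated char* array and passed to execv; the restarted copy strips the marker before handing the arguments to user code.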

G

PS: The gcc compile options I use are:

 GCCOPTS=" -Wall -Wno-return-type -Wno-comment -g -fsanitize=undefined
 -fsanitize-undefined-trap-on-error -fno-sanitize-recover=all -frecord-gcc-switches
 -fsanitize=float-divide-by-zero -fsanitize=float-cast-overflow -ftrapv
 -grecord-gcc-switches -O0 -ggdb3 "
Graham Toal
  • 324
  • 1
  • 7
0

My best async signal safe attempt so far

Let me know if it is not actually safe. I could not yet find a way to show line numbers.

#include <execinfo.h>
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

#define TRACE_MAX 1024

void handler(int sig) {
    (void)sig;
    void *array[TRACE_MAX];
    size_t size;
    const char msg[] = "failed with a signal\n";

    size = backtrace(array, TRACE_MAX);
    write(STDERR_FILENO, msg, sizeof(msg) - 1); /* -1: do not write the terminating NUL */
    backtrace_symbols_fd(array, size, STDERR_FILENO);
    _Exit(1);
}

void my_func_2(void) {
    *((int*)0) = 1;
}

void my_func_1(double f) {
    (void)f;
    my_func_2();
}

void my_func_1(int i) {
    (void)i;
    my_func_2();
}

int main() {
    /* Make a dummy call to `backtrace` to load libgcc because man backtrace says:
     *    *  backtrace() and backtrace_symbols_fd() don't call malloc() explicitly, but they are part of libgcc,
     *       which gets loaded dynamically when first used.  Dynamic loading usually triggers a call to malloc(3).
     *       If you need certain calls to these two functions to not allocate memory (in signal handlers, for
     *       example), you need to make sure libgcc is loaded beforehand.
     */
    void *dummy[1];
    backtrace(dummy, 1);
    signal(SIGSEGV, handler);

    my_func_1(1);
}

Compile and run:

g++ -ggdb3 -O2 -std=c++11 -Wall -Wextra -pedantic -rdynamic -o stacktrace_on_signal_safe.out stacktrace_on_signal_safe.cpp
./stacktrace_on_signal_safe.out

-rdynamic is needed to get the function names:

failed with a signal
./stacktrace_on_signal_safe.out(_Z7handleri+0x6e)[0x56239398928e]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f04b1459520]
./stacktrace_on_signal_safe.out(main+0x38)[0x562393989118]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7f04b1440d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7f04b1440e40]
./stacktrace_on_signal_safe.out(_start+0x25)[0x562393989155]

We can then pipe it to c++filt to demangle:

./stacktrace_on_signal_safe.out |& c++filt

giving:

failed with a signal
/stacktrace_on_signal_safe.out(handler(int)+0x6e)[0x55b6df43f28e]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f40d4167520]
./stacktrace_on_signal_safe.out(main+0x38)[0x55b6df43f118]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7f40d414ed90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7f40d414ee40]
./stacktrace_on_signal_safe.out(_start+0x25)[0x55b6df43f155]

Several levels are missing due to optimizations, with -O0 we get a fuller:

/stacktrace_on_signal_safe.out(handler(int)+0x76)[0x55d39b68325f]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f4d8ffdd520]
./stacktrace_on_signal_safe.out(my_func_2()+0xd)[0x55d39b6832bb]
./stacktrace_on_signal_safe.out(my_func_1(int)+0x14)[0x55d39b6832f1]
./stacktrace_on_signal_safe.out(main+0x4a)[0x55d39b68333e]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7f4d8ffc4d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7f4d8ffc4e40]
./stacktrace_on_signal_safe.out(_start+0x25)[0x55d39b683125]

Line numbers are not present, but we can get them with addr2line. This requires building without -rdynamic:

g++ -ggdb3 -O0 -std=c++23 -Wall -Wextra -pedantic -o stacktrace_on_signal_safe.out stacktrace_on_signal_safe.cpp
./stacktrace_on_signal_safe.out |& sed -r 's/.*\(//;s/\).*//' | addr2line -C -e stacktrace_on_signal_safe.out -f

producing:

??
??:0
handler(int)
/home/ciro/stacktrace_on_signal_safe.cpp:14
??
??:0
my_func_2()
/home/ciro/stacktrace_on_signal_safe.cpp:22
my_func_1(i
/home/ciro/stacktrace_on_signal_safe.cpp:33
main
/home/ciro/stacktrace_on_signal_safe.cpp:45
??
??:0
??
??:0
_start
??:?

The sed command extracts the +<addr> offsets from the non -rdynamic output:

./stacktrace_on_signal_safe.out(+0x125f)[0x55984828825f]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f8644a1e520]
./stacktrace_on_signal_safe.out(+0x12bb)[0x5598482882bb]
./stacktrace_on_signal_safe.out(+0x12f1)[0x5598482882f1]
./stacktrace_on_signal_safe.out(+0x133e)[0x55984828833e]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7f8644a05d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7f8644a05e40]
./stacktrace_on_signal_safe.out(+0x1125)[0x559848288125]

If you also want to print the actual signal number, printf is not an option because it is not async-signal-safe; an async-signal-safe int-to-string implementation is described at: Print int from signal handler using write or async-safe functions.
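
As a rough idea of what such a conversion looks like, here is a sketch that formats an integer by hand and emits it with write(), which POSIX lists as async-signal-safe. The names `format_int` and `write_signal_number` are hypothetical, and this is not the code from the linked answer.

```cpp
#include <stddef.h>  // size_t
#include <unistd.h>  // write, STDERR_FILENO

// Hypothetical helper: formats `value` in decimal into `buf` without calling
// any library function. Returns the number of characters written (no
// terminating NUL), or 0 if `cap` is too small.
size_t format_int(long value, char* buf, size_t cap) {
    char tmp[24];  // enough for a 64-bit value plus sign
    size_t n = 0;
    // Work on the magnitude as unsigned so that LONG_MIN does not overflow.
    unsigned long magnitude = value < 0 ? 0UL - static_cast<unsigned long>(value)
                                        : static_cast<unsigned long>(value);
    do {
        tmp[n++] = static_cast<char>('0' + magnitude % 10);
        magnitude /= 10;
    } while (magnitude != 0);
    if (value < 0) tmp[n++] = '-';
    if (n > cap) return 0;
    // Digits were produced least significant first, so copy them reversed.
    for (size_t i = 0; i < n; ++i) buf[i] = tmp[n - 1 - i];
    return n;
}

// Writes "signal <n>\n" to stderr using only write().
void write_signal_number(int sig) {
    char digits[24];
    size_t len = format_int(sig, digits, sizeof(digits));
    const char prefix[] = "signal ";
    ssize_t ignored = write(STDERR_FILENO, prefix, sizeof(prefix) - 1);
    ignored = write(STDERR_FILENO, digits, len);
    ignored = write(STDERR_FILENO, "\n", 1);
    (void)ignored;
}
```

write_signal_number(sig) can then be called from the handler in place of any printf-style output.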

Tested on Ubuntu 22.04.

C++23 <stacktrace>

Like many other answers, this section ignores async signal safe aspects of the problem, which could lead your code to deadlock on crash, which could be serious. We can only hope one day the C++ standard will add a boost::stacktrace::safe_dump_to-like function to solve this once and for all.

This will be the generally superior C++ stacktrace option moving forward as mentioned at: print call stack in C or C++ as it shows line numbers and does demangling for us automatically.

stacktrace_on_signal.cpp

#include <stacktrace>
#include <iostream>

#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

void handler(int sig) {
    (void)sig;
    /* De-register this signal in the hope of avoiding infinite loops
     * if async-signal-unsafe things fail later on. But this can likely still deadlock. */
    signal(sig, SIG_DFL);
    // std::stacktrace::current
    std::cout << std::stacktrace::current();
    // C99 async signal safe version of exit().
    _Exit(1);
}

void my_func_2(void) {
    *((int*)0) = 1;
}

void my_func_1(double f) {
    (void)f;
    my_func_2();
}

void my_func_1(int i) {
    (void)i;
    my_func_2();
}

int main() {
    signal(SIGSEGV, handler);
    my_func_1(1);
}

Compile and run:

g++ -ggdb3 -O2 -std=c++23 -Wall -Wextra -pedantic -o stacktrace_on_signal.out stacktrace_on_signal.cpp -lstdc++_libbacktrace
./stacktrace_on_signal.out

Output on GCC 12.1 compiled from source, Ubuntu 22.04:

   0# handler(int) at /home/ciro/stacktrace_on_signal.cpp:11
   1#      at :0
   2# my_func_2() at /home/ciro/stacktrace_on_signal.cpp:16
   3#      at :0
   4#      at :0
   5#      at :0
   6#

I think it missed my_func_1 due to optimization being turned on, and there is in general nothing we can do about that AFAIK. With -O0 instead it is better:

   0# handler(int) at /home/ciro/stacktrace_on_signal.cpp:11
   1#      at :0
   2# my_func_2() at /home/ciro/stacktrace_on_signal.cpp:16
   3# my_func_1(int) at /home/ciro/stacktrace_on_signal.cpp:26
   4#      at /home/ciro/stacktrace_on_signal.cpp:31
   5#      at :0
   6#      at :0
   7#      at :0
   8#

but not sure why main didn't show up there.

backtrace_simple

https://github.com/gcc-mirror/gcc/blob/releases/gcc-12.1.0/libstdc%2B%2B-v3/src/libbacktrace/backtrace-supported.h.in#L45 mentions that backtrace_simple is safe:

/* BACKTRACE_USES_MALLOC will be #define'd as 1 if the backtrace
   library will call malloc as it works, 0 if it will call mmap
   instead.  This may be used to determine whether it is safe to call
   the backtrace functions from a signal handler.  In general this
   only applies to calls like backtrace and backtrace_pcinfo.  It does
   not apply to backtrace_simple, which never calls malloc.  It does
   not apply to backtrace_print, which always calls fprintf and
   therefore malloc.  */

but it does not appear very convenient for usage, mostly an internal tool.

std::basic_stacktrace

This is what std::stacktrace is based on according to: https://en.cppreference.com/w/cpp/utility/basic_stacktrace

It has an allocator parameter which cppreference describes as:

Support for custom allocators is provided for using basic_stacktrace on a hot path or in embedded environments. Users can allocate stacktrace_entry objects on the stack or in some other place, where appropriate.

so I wonder if basic_stacktrace is itself async signal safe, and if it wouldn't be possible to make a version of std::stacktrace that is also with a custom allocator, e.g. either something that:

  • writes to a file on disk like boost::stacktrace::safe_dump_to
  • or writes to some pre-allocated stack buffer with some maximum size

https://apolukhin.github.io/papers/stacktrace_r1.html might be the proposal that got in, mentions:

Note about signal safety: this proposal does not attempt to provide a signal-safe solution for capturing and decoding stacktraces. Such functionality currently is not implementable on some of the popular platforms. However, the paper attempts to provide extensible solution, that may be made signal safe some day by providing a signal safe allocator and changing the stacktrace implementation details.

Just getting the core dump instead?

The core dump allows you to inspect memory with GDB: How do I analyze a program's core dump file with GDB when it has command-line parameters? so it is more powerful than just having the trace.

Just make sure you enable it properly, notably on Ubuntu 22.04 you need:

echo 'core' | sudo tee /proc/sys/kernel/core_pattern

or to learn to use apport, see also: https://askubuntu.com/questions/1349047/where-do-i-find-core-dump-files-and-how-do-i-view-and-analyze-the-backtrace-st/1442665#1442665

Ciro Santilli OurBigBook.com
  • 347,512
  • 102
  • 1,199
  • 985
0

I forgot about "apport", Ubuntu's crash reporting tool, but I don't know much about using it. It is used to generate stack traces and other diagnostics for processing, and it can automatically file bugs. It's certainly worth looking into.