Linux: handling a segmentation fault and getting a core dump

Question

When my application crashes with a segmentation fault I'd like to get a core dump from the system. I do that by configuring beforehand:

ulimit -c unlimited

I would also like to have an indication in my application logs that a segmentation fault has occurred. I do that by using sigaction(). If I do that, however, the signal does not reach its default handling and a core dump is not saved.

How can I have both the system core dump and a log line from my own signal handler at the same time?

score 12 · Answer 1 · answered Oct 11 '13 at 00:48

Override the default handling of SIGSEGV with a handler that calls your custom logging function.
After logging, restore and trigger the default handler, which will create the core dump.

Here is a sample program using signal:

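#include <signal.h>  /* signal, kill, SIGSEGV, SIG_DFL */
#include <unistd.h>  /* getpid, write, STDERR_FILENO */

/* Placeholder for your own logging routine; keep it async-signal-safe. */
static void myLoggingFunction(void)
{
  static const char msg[] = "caught fatal signal\n";
  write(STDERR_FILENO, msg, sizeof(msg) - 1);
}

void sighandler(int signum)
{
  myLoggingFunction();

  // this is the trick: it will trigger the core dump
  signal(signum, SIG_DFL);
  kill(getpid(), signum);
}

int main(void)
{
   signal(SIGSEGV, sighandler);

   // ...
}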

The same idea should also work with sigaction.
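A minimal sketch of that sigaction variant, assuming Linux with glibc. The names log_and_redeliver, install_segv_logger and crash_in_child are made up for illustration; crash_in_child only exists so the effect can be observed from a surviving parent process.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>    /* sigaction, signal, raise, SIGSEGV, SIG_DFL */
#include <string.h>    /* memset */
#include <sys/wait.h>  /* waitpid, WIFSIGNALED, WTERMSIG */
#include <unistd.h>    /* write, fork, _exit */

/* Log with write(2) only: printf and friends are not async-signal-safe. */
static void log_and_redeliver(int signum)
{
    static const char msg[] = "fatal signal caught, re-raising\n";
    (void)write(STDERR_FILENO, msg, sizeof(msg) - 1);

    /* Put the default action back, then send the signal again. */
    signal(signum, SIG_DFL);
    raise(signum);
}

/* Hypothetical helper: install the logging handler with sigaction(). */
int install_segv_logger(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = log_and_redeliver;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Hypothetical demo helper: crash in a child so the caller survives.
 * Returns the signal that terminated the child, or -1 on error. */
int crash_in_child(void)
{
    int status;
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        int *volatile p = NULL;  /* volatile: keep the compiler from folding the fault away */
        install_segv_logger();
        *p = 1;                  /* deliberate null write */
        _exit(0);                /* not reached */
    }
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

In a real program only install_segv_logger() would be called, early in main(). Whether a core file actually appears still depends on the core size limit and on the system's core_pattern setting.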

Source: How to handle SIGSEGV, but also generate a core dump

What did work for me was to call signal(signum, SIG_DFL); and let the signal handler return. – Vincent, Nov 21 '16 at 22:35

score 3 · Accepted Answer · answered May 23 '13 at 15:19

The answer: set the sigaction with the SA_RESETHAND flag and just return from the handler. The same instruction executes again, causing a segmentation fault again and invoking the default handler.

shoosh

This doesn't work on the version of Redhat 6 I was testing on, and causes a recursive loop where the handler is not reset. It works if you store the old handler when calling sigaction and restore it explicitly in the SIGSEGV handler. – phenompbg Apr 11 '16 at 10:08
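The workaround described in this comment could look roughly like the following. This is a sketch under assumptions, not the commenter's actual code: old_segv_action, segv_logger, install_segv_logger and crash_in_child are hypothetical names, and the child-process helper is only there to observe the result.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>    /* sigaction, SIGSEGV */
#include <string.h>    /* memset */
#include <sys/wait.h>  /* waitpid, WIFSIGNALED, WTERMSIG */
#include <unistd.h>    /* write, fork, _exit */

/* Disposition that was in place before our handler was installed. */
static struct sigaction old_segv_action;

static void segv_logger(int signum)
{
    static const char msg[] = "SIGSEGV caught, restoring previous handler\n";
    (void)write(STDERR_FILENO, msg, sizeof(msg) - 1);

    /* Restore explicitly instead of relying on SA_RESETHAND, then return:
     * the faulting instruction runs again under the old disposition. */
    sigaction(signum, &old_segv_action, NULL);
}

/* Hypothetical helper: install the logger and remember the old action. */
int install_segv_logger(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = segv_logger;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGSEGV, &sa, &old_segv_action);
}

/* Hypothetical demo helper: returns the signal that killed the child, or -1. */
int crash_in_child(void)
{
    int status;
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        int *volatile p = NULL;
        install_segv_logger();
        *p = 1;      /* faults, handler logs and restores, then faults again */
        _exit(0);    /* not reached */
    }
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

If the previous disposition was SIG_DFL, the second fault terminates the process with the default action, which is what produces the core dump.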

score 0 · Answer 3 · answered Dec 02 '22 at 09:06

There's no need to do anything special in your signal handler

As explained at "Where does signal handler return back to?", by default the program returns to the very instruction that caused the SIGSEGV after the signal gets handled.

Furthermore, tested as of Ubuntu 22.04, the default behavior for signal is that it automatically de-registers the handler. man signal does suggest that this is not very portable, however, so using the more explicit sigaction syscall instead may be better.

Therefore, what happens by default on that system is:

1. the signal gets handled
2. the handler is automatically disabled
3. after the handler returns, execution goes back to the instruction that caused the signal
4. the signal happens again
5. there is no handler anymore, so the program crashes in basically the same way as if we hadn't handled the signal
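The sequence above can be checked with a small experiment: a child process installs a one-shot handler, faults, and the parent counts how often the handler ran before the child died. This is only a sketch; count_handler_runs, on_segv and report_fd are hypothetical names, and SA_RESETHAND is set explicitly through sigaction so the result does not depend on what signal() does on a given system.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>    /* sigaction, SA_RESETHAND, SIGSEGV */
#include <string.h>    /* memset */
#include <sys/wait.h>  /* waitpid, WIFSIGNALED, WTERMSIG */
#include <unistd.h>    /* pipe, fork, read, write, close, _exit */

static int report_fd = -1;  /* write end of a pipe back to the parent */

static void on_segv(int signum)
{
    (void)signum;
    (void)write(report_fd, "h", 1);  /* one byte per handler invocation */
}

/* Returns how many times the handler ran before the child died and stores
 * the terminating signal in *termsig; returns -1 on error. */
int count_handler_runs(int *termsig)
{
    int fds[2], status, runs = 0;
    char c;
    pid_t pid;

    if (pipe(fds) != 0) return -1;
    pid = fork();
    if (pid < 0) { close(fds[0]); close(fds[1]); return -1; }
    if (pid == 0) {
        struct sigaction sa;
        int *volatile p = NULL;
        close(fds[0]);
        report_fd = fds[1];
        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = on_segv;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = SA_RESETHAND;   /* handler is disabled after the first delivery */
        sigaction(SIGSEGV, &sa, NULL);
        *p = 1;      /* fault, handler runs, instruction re-executes, fault again */
        _exit(0);    /* not reached */
    }
    close(fds[1]);
    while (read(fds[0], &c, 1) == 1) runs++;  /* ends at EOF once the child is gone */
    close(fds[0]);
    if (waitpid(pid, &status, 0) != pid) return -1;
    *termsig = WIFSIGNALED(status) ? WTERMSIG(status) : 0;
    return runs;
}
```

A result of one handler run together with a terminating signal of SIGSEGV corresponds to the five steps listed above.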

The most important thing to check is whether you can generate core dumps at all, regardless of the signal handler. Notably, many newer systems such as Ubuntu 22.04 have a complex core dump handler which prevents the creation of core files: https://askubuntu.com/questions/1349047/where-do-i-find-core-dump-files-and-how-do-i-view-and-analyze-the-backtrace-st/1442665#1442665 You can deactivate it temporarily (the setting does not survive a reboot) with:

echo 'core' | sudo tee /proc/sys/kernel/core_pattern
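The same two preconditions can also be inspected from inside the program. The sketch below assumes Linux; raise_core_limit, core_pattern_is_piped and read_core_pattern are hypothetical helper names. Raising the soft limit to the hard limit only matches ulimit -c unlimited when the hard limit itself is unlimited.

```c
#include <stdio.h>         /* fopen, fgets, fclose */
#include <sys/resource.h>  /* getrlimit, setrlimit, RLIMIT_CORE */

/* Raise the soft core-size limit as far as the hard limit allows. */
int raise_core_limit(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_CORE, &rl) != 0) return -1;
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_CORE, &rl);
}

/* A pattern starting with '|' means the kernel pipes the dump to a program
 * instead of writing a core file itself (see core(5)). */
int core_pattern_is_piped(const char *pattern)
{
    return pattern != NULL && pattern[0] == '|';
}

/* Read the current pattern into buf; returns 0 on success, -1 otherwise. */
int read_core_pattern(char *buf, int size)
{
    FILE *f = fopen("/proc/sys/kernel/core_pattern", "r");
    if (f == NULL) return -1;
    if (fgets(buf, size, f) == NULL) { fclose(f); return -1; }
    fclose(f);
    return 0;
}
```

A program could call raise_core_limit() at startup and log a warning when read_core_pattern() yields a piped pattern, since in that case no core file will show up in the working directory.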

Minimal runnable example:

main.c

#include <signal.h> /* signal, SIGSEGV */

#include <unistd.h> /* write, STDOUT_FILENO */

void signal_handler(int sig) {
    (void)sig;
    const char msg[] = "signal received\n";
    write(STDOUT_FILENO, msg, sizeof(msg) - 1);
}

int myfunc(int i) {
    *(int *)0 = 1;
    return i + 1;
}

int main(int argc, char **argv) {
    (void)argv;
    signal(SIGSEGV, signal_handler);
    int ret = myfunc(argc);
    return ret;
}

compile and run:

gcc -ggdb3 -O0 -std=c99 -Wall -Wextra -pedantic -o main.out main.c
./main.out

Terminal output contains:

signal received
Segmentation fault (core dumped)

so we see that the signal was both handled, and we got a core file.

And inspecting the core file with:

gdb main.out core.243260

does put us at the correct line:

#0  myfunc (i=1) at main.c:12
12          *(int *)0 = 1;

so we did return to it as expected.

Making it more portable with sigaction

man signal portability section has a Bible of a text about how signal() varied across different OSes and versions:

The only portable use of signal() is to set a signal's disposition to SIG_DFL or SIG_IGN. The semantics when using signal() to establish a signal handler vary across systems (and POSIX.1 explicitly permits this variation); do not use it for this purpose.

POSIX.1 solved the portability mess by specifying sigaction(2), which provides explicit control of the semantics when a signal handler is invoked; use that interface instead of signal().

In the original UNIX systems, when a handler that was established using signal() was invoked by the delivery of a signal, the disposition of the signal would be reset to SIG_DFL, and the system did not block delivery of further instances of the signal. This is equivalent to calling sigaction(2) with the following flags:
sa.sa_flags = SA_RESETHAND | SA_NODEFER;
System V also provides these semantics for signal(). This was bad because the signal might be delivered again before the handler had a chance to reestablish itself. Furthermore, rapid deliveries of the same signal could result in recursive invocations of the handler.

BSD improved on this situation, but unfortunately also changed the semantics of the existing signal() interface while doing so. On BSD, when a signal handler is invoked, the signal disposition is not reset, and further instances of the signal are blocked from being delivered while the handler is executing. Furthermore, certain blocking system calls are automatically restarted if interrupted by a signal handler (see signal(7)). The BSD semantics are equivalent to calling sigaction(2) with the following flags:
sa.sa_flags = SA_RESTART;
The situation on Linux is as follows:

The kernel's signal() system call provides System V semantics.

By default, in glibc 2 and later, the signal() wrapper function does not invoke the kernel system call. Instead, it calls sigaction(2) using flags that supply BSD semantics. This default behavior is provided as long as a suitable feature test macro is defined: _BSD_SOURCE on glibc 2.19 and earlier or _DEFAULT_SOURCE in glibc 2.19 and later. (By default, these macros are defined; see feature_test_macros(7) for details.) If such a feature test macro is not defined, then signal() provides System V semantics.

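That seems to suggest that I should get BSD semantics by default, but I actually get System V semantics, possibly because compiling with -std=c99 leaves _DEFAULT_SOURCE undefined (see feature_test_macros(7)). The evidence is that: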

sudo strace -f -s999 -v ./main.out

contains:

rt_sigaction(SIGSEGV, {sa_handler=0x55b428604189, sa_mask=[], sa_flags=SA_RESTORER|SA_INTERRUPT|SA_NODEFER|SA_RESETHAND|0xffffffff00000000, sa_restorer=0x7fb173d0a520}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0

which has the SA_NODEFER|SA_RESETHAND. Notably, the flag we care the most about is SA_RESETHAND, which resets the handler to the default behavior.

But maybe I just misinterpreted one of the verses of the Holy Text.

So, just to be more portable, we could do the same as above with sigaction instead:

sigaction.c

#define _XOPEN_SOURCE 700
#include <signal.h> /* sigaction, SIGSEGV */

#include <unistd.h> /* write, STDOUT_FILENO */

void signal_handler(int sig) {
    (void)sig;
    const char msg[] = "signal received\n";
    write(STDOUT_FILENO, msg, sizeof(msg) - 1);
}

int myfunc(int i) {
    *(int *)0 = 1;
    return i + 1;
}

int main(int argc, char **argv) {
    (void)argv;
    /* Adapted from: https://www.gnu.org/software/libc/manual/html_node/Sigaction-Function-Example.html */
    struct sigaction new_action;
    new_action.sa_handler = signal_handler;
    sigemptyset(&new_action.sa_mask);
    new_action.sa_flags = SA_NODEFER|SA_RESETHAND;
    sigaction(SIGSEGV, &new_action, NULL);
    int ret = myfunc(argc);
    return ret;
}

which behaves just like main.c in Ubuntu 22.04.
