
I'm trying to restart the program when a segmentation fault occurs.

I have the following minimal reproducible code:

#include <csignal>
#include <unistd.h>
#include <iostream>

int app();

void ouch(int sig) {
    std::cout << "got signal " << sig << std::endl;
    exit(app());
}

struct L { int l; };
static int i = 0;

int app() {
    L *l= nullptr;
    while(1) {
        std::cout << ++i << std::endl;
        sleep(1);
        std::cout << l->l << std::endl; //crash
        std::cout << "Ok" << std::endl;
    }
}

int main() {
    struct sigaction act;
    act.sa_handler = ouch;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;
    sigaction(SIGKILL, &act, 0);
    sigaction(SIGSEGV, &act, 0);
    return app();
}

It successfully catches SIGSEGV the first time, but after it prints 2, the program dies with "segmentation fault (core dumped)":

1
got signal 11
2
zsh: segmentation fault (core dumped)  ./a.out

Tested with clang 12.0.1 and gcc 11.1.0 on Arch Linux.

Is this operating-system-specific behavior, or is something wrong in my code?

LightSith
    It seems you don't understand what signal handlers are used for. If a signal is raised from a signal handler, it's bad. – 273K Sep 21 '21 at 18:23
    Possible remedy in [Why can't I ignore SIGSEGV signal?](https://stackoverflow.com/a/8456519/7582247) – Ted Lyngmo Sep 21 '21 at 18:28
    Trying to recover from a segfault is almost always a very bad idea. It isn't designed to be a recoverable error, it usually indicates a logic error which is something that can't be fixed at runtime. – François Andrieux Sep 21 '21 at 18:40
    You should really think the other way around and be grateful for the segfault. It means your program has a bug in it that needs to be solved. Don't add new code until you've solved the root cause of this one. – Pepijn Kramer Sep 21 '21 at 19:05
    You can `siglongjmp` to try to get back to some known point in the program, but global state may be messed up, and you may also leak memory or other resources. Or you can `exec()` to start completely from scratch. – Nate Eldredge Sep 21 '21 at 19:17

1 Answer


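The problem is that when you restart the program by calling exit(app()) from inside ouch(), you are still technically inside the signal handler. While a handler runs, the signal that triggered it is blocked (unless SA_NODEFER is set in sa_flags) until the handler returns. Since you never return, SIGSEGV stays blocked, so the second fault cannot reach your handler; on Linux the kernel then kills the process with the default action.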
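To make the blocking visible, here is a minimal sketch that asks, from inside a handler, whether the handler's own signal is currently in the signal mask. The names `probe`, `g_blocked` and `blocked_in_own_handler` are made up for this illustration, and it assumes a POSIX system. It is meant to be tried with a harmless signal such as SIGUSR1, not with a real fault.

```cpp
#include <cassert>
#include <csignal>
#include <signal.h>   // POSIX: sigaction, sigprocmask, sigismember

namespace {
// Hypothetical flag written by the handler: 1 = blocked, 0 = not blocked.
volatile std::sig_atomic_t g_blocked = -1;

void probe(int sig) {
    sigset_t cur;
    sigemptyset(&cur);
    // With a null "set" argument, sigprocmask only reports the current mask.
    sigprocmask(SIG_BLOCK, nullptr, &cur);
    g_blocked = sigismember(&cur, sig);
}
}  // namespace

// Installs `probe` for `sig` with the given sa_flags, raises the signal once,
// and reports whether `sig` was blocked while its own handler was running.
bool blocked_in_own_handler(int sig, int flags) {
    // SIGKILL and SIGSTOP can never be caught, so sigaction() fails for them.
    assert(sig != SIGKILL && sig != SIGSTOP);

    struct sigaction act{}, old{};
    act.sa_handler = probe;
    sigemptyset(&act.sa_mask);
    act.sa_flags = flags;
    if (sigaction(sig, &act, &old) != 0) return false;

    g_blocked = -1;
    raise(sig);                      // the handler runs before raise() returns
    sigaction(sig, &old, nullptr);   // restore the previous disposition
    return g_blocked == 1;
}
```

Calling `blocked_in_own_handler(SIGUSR1, 0)` should report that the signal is blocked, while passing `SA_NODEFER` should report that it is not. Setting SA_NODEFER would let a second SIGSEGV reach ouch() in the question's code, but it does nothing about the broken program state, so it is not a real fix.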

If you got a SIGSEGV, then something really bad has happened, and there is no guarantee that you can just "restart" the process by calling app() again. The best way to handle this is to have another program start your program and restart it if it crashes. See this ServerFault question for some suggestions on how to do that.
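As a rough illustration of the "another program restarts it" idea, here is a minimal supervisor sketch using fork(), execvp() and waitpid(). The function name `run_supervised` and the `max_restarts` limit are assumptions made for this example. In practice a service manager would usually do this job.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <cstdio>

// Runs argv[0] as a child process and restarts it whenever it is killed by a
// signal (for example SIGSEGV). Returns the child's exit status once it exits
// normally, or -1 if fork/waitpid fails or the restart limit is exhausted.
int run_supervised(char* const argv[], int max_restarts) {
    assert(argv != nullptr && argv[0] != nullptr);

    for (int attempt = 0; attempt <= max_restarts; ++attempt) {
        pid_t pid = fork();
        if (pid < 0) return -1;          // fork failed
        if (pid == 0) {
            execvp(argv[0], argv);       // only returns on failure
            _exit(127);                  // conventional "could not exec" status
        }

        int status = 0;
        while (waitpid(pid, &status, 0) < 0) {
            if (errno != EINTR) return -1;   // retry only if interrupted
        }

        if (WIFEXITED(status)) return WEXITSTATUS(status);
        if (WIFSIGNALED(status)) {
            std::fprintf(stderr, "child killed by signal %d, restarting\n",
                         WTERMSIG(status));
        }
    }
    return -1;   // child kept crashing
}
```

A small wrapper `main` could build an argument vector such as `{"./a.out", nullptr}` and return the result of `run_supervised`. Because every restart is a fresh process, none of the corrupted state from the crashed run carries over.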

user4581301
G. Sliepen