Here is my code,

#include <signal.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    char *p = NULL;
    signal(SIGSEGV, SIG_IGN); // Ignoring the signal
    printf("%d", *p);
    printf("Stack Overflow"); // This has to be printed. Right?
    return 0;
}

While executing the code, I'm getting a segmentation fault. I ignored the signal using SIG_IGN, so I shouldn't get a segmentation fault, right? Then the printf() statement after the one printing the '*p' value must be executed too, right?

Dinesh
  • There will be a time in which writing code that swallows segfaults will be considered enough to put the programmer in jail. – 6502 Dec 10 '11 at 11:24
  • Related: [Why is a segmentation fault not recoverable?](https://stackoverflow.com/q/70258418) - unless you crash on purpose in a simple way, and know how your compiler generates that asm, you can't reliably catch SIGSEGV and continue. Buggy code might have corrupted other variables before getting to a load or store that actually touches unmapped memory. – Peter Cordes Nov 04 '22 at 20:20

4 Answers

Your code is ignoring SIGSEGV instead of catching it. Recall that the instruction that triggered the signal is restarted after handling the signal. In your case, handling the signal didn't change anything so the next time round the offending instruction is tried, it fails the same way.
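
To see what the kernel does with an ignored SIGSEGV without taking down your own process, the faulting load can be run in a forked child and the child's fate inspected from the parent. The following is only a sketch: `fault_with_disposition` is a made-up helper name, and the behaviour it reports is an assumption about Linux, where a fault raised by the CPU is expected to kill the process even under SIG_IGN. POSIX leaves this case undefined, so other systems may differ.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child, set the SIGSEGV disposition there,
 * dereference NULL, and report what happened to the child.
 * Returns the number of the signal that killed the child,
 * 0 if the child exited normally, or -1 on error. */
int fault_with_disposition(void (*disposition)(int))
{
    int status;
    pid_t pid;

    /* This sketch only compares the two built-in dispositions. */
    assert(disposition == SIG_IGN || disposition == SIG_DFL);

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* The pointer itself is volatile so the compiler cannot see
         * that it is NULL and optimise the load away. */
        int *volatile p = NULL;
        int v;

        signal(SIGSEGV, disposition);
        v = *(volatile int *)p;   /* the faulting load */
        (void)v;
        _exit(0);                 /* reached only if the load succeeded */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling `fault_with_disposition(SIG_IGN)` and `fault_with_disposition(SIG_DFL)` and comparing the two results shows whether SIG_IGN changed anything on your system.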

If you intend to catch the signal change this

signal(SIGSEGV, SIG_IGN);

to this

signal(SIGSEGV, sighandler);

You should probably also use sigaction() instead of signal(). See relevant man pages.
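
As a rough sketch of the sigaction() route, assuming a POSIX system: the names `install_segv_handler` and `exit_on_segv` are invented for illustration and are not part of any library. The handler here only reports and exits, which sidesteps the restart problem entirely.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical handler: report with write() (async-signal-safe, unlike
 * printf()) and terminate instead of returning to the faulting code. */
static void exit_on_segv(int signo, siginfo_t *info, void *context)
{
    static const char msg[] = "caught SIGSEGV, exiting\n";
    ssize_t written;

    (void)signo;
    (void)info;
    (void)context;
    written = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)written;
    _exit(1);
}

/* Hypothetical helper: install a three-argument SIGSEGV handler.
 * Returns 0 on success and -1 on failure, like sigaction() itself. */
int install_segv_handler(void (*handler)(int, siginfo_t *, void *))
{
    struct sigaction sa;

    assert(handler != NULL);
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = handler;
    sa.sa_flags = SA_SIGINFO;   /* request the (signo, info, context) form */
    sigemptyset(&sa.sa_mask);   /* block no extra signals while handling */
    return sigaction(SIGSEGV, &sa, NULL);
}
```

A program would call `install_segv_handler(exit_on_segv)` once, early in main().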

In your case the offending instruction is the one which tries to dereference the NULL pointer.

printf("%d", *p);

What follows is entirely dependent on your platform.

You can use gdb to establish what particular assembly instruction triggers the signal. If your platform is anything like mine, you'll find the instruction is

movl    (%rax), %esi

with the rax register holding the value 0, i.e. NULL. One (non-portable!) way to fix this in your signal handler is to use the third argument your signal handler gets, i.e. the user context. Here is an example:

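#define _GNU_SOURCE   /* exposes REG_RAX in <ucontext.h>; must precede all includes */
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <ucontext.h>

int *p = NULL;
int n = 100;

/* Three-argument handlers receive the user context as a void pointer. */
void sighandler(int signo, siginfo_t *si, void *vcontext)
{
  ucontext_t *context = vcontext;
  (void)si;
  /* printf() is not async-signal-safe; tolerable only in a demo like this. */
  printf("Handler executed for signal %d\n", signo);
  /* x86-64 Linux only: point rax at n so the retried load succeeds. */
  context->uc_mcontext.gregs[REG_RAX] = (greg_t)&n;
}

int main(int argc, char **argv)
{
  struct sigaction sa;
  memset(&sa, 0, sizeof sa);
  sa.sa_sigaction = sighandler;
  sa.sa_flags = SA_SIGINFO;   /* required for the three-argument handler form */
  sigemptyset(&sa.sa_mask);
  sigaction(SIGSEGV, &sa, NULL);
  printf("%d\n", *p); // ... movl (%rax), %esi ...
  return 0;
}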

This program displays:

Handler executed for signal 11
100

It first causes the handler to be executed by attempting to dereference a NULL address. Then the handler fixes the issue by setting rax to the address of variable n. Once the handler returns the system retries the offending instruction and this time succeeds. printf() receives 100 as its second argument.

I strongly recommend against using such non-portable solutions in your programs, though.

Adam Zalcman

You can ignore the signal but you have to do something about it. I believe what you are doing in the code posted (ignoring SIGSEGV via SIG_IGN) won't work at all for reasons which will become obvious after reading the bold bullet.

When you do something that causes the kernel to send you a SIGSEGV:

  • If you don't have a signal handler, the kernel kills the process and that's that
  • If you do have a signal handler
    • Your handler gets called
    • **The kernel restarts the offending operation**

So if you don't do anything about it, it will just loop continuously. If you do catch SIGSEGV and you don't exit, thereby interfering with the normal flow, you must:

  • fix things such that the offending operation doesn't restart or
  • fix the memory layout such that what was offending will be ok on the next run
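
The restart behaviour described above can be observed with a handler that does nothing but count its own invocations. This is a sketch under the assumption of Linux-like behaviour (POSIX does not define what happens when a handler returns from a hardware-generated SIGSEGV); `count_segv_retries` and `count_and_return` are invented names. The fault is confined to a forked child, which exits once the handler has run `limit` times.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t handler_calls;
static volatile sig_atomic_t call_limit;

/* Fixes nothing: each return lets the same faulting load run again. */
static void count_and_return(int signo)
{
    (void)signo;
    if (++handler_calls >= call_limit)
        _exit((int)handler_calls);   /* break the loop after enough retries */
}

/* Hypothetical helper: returns how many times the handler ran in the
 * child before it gave up, or -1 on error. */
int count_segv_retries(int limit)
{
    int status;
    pid_t pid;

    assert(limit > 0 && limit < 255);   /* must fit in an exit status */

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct sigaction sa;
        int *volatile p = NULL;   /* volatile: keep the compiler from folding the NULL */
        int v;

        memset(&sa, 0, sizeof sa);
        sa.sa_handler = count_and_return;
        sigemptyset(&sa.sa_mask);
        call_limit = limit;
        if (sigaction(SIGSEGV, &sa, NULL) != 0)
            _exit(255);
        v = *(volatile int *)p;   /* faults, handler returns, faults again ... */
        (void)v;
        _exit(0);                 /* not expected to be reached */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

If the kernel retries the instruction as described, `count_segv_retries(5)` should come back as 5: one handler call per retry of the same load.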
cnicutar
  • Ok, so what do I have to do exactly? – Dinesh Dec 10 '11 at 11:14
  • @Dinesh, what is it that you are trying to accomplish? – ibid Dec 10 '11 at 11:15
  • @ibid, In the code, I'm trying to access null memory, so it leads to the creation of a SIGSEGV signal. But I made a handler for that which will just print "Catching the signal". Afterwards the statement "return 0" which resides in main() must execute. Right? – Dinesh Dec 10 '11 at 11:18
  • @Dinesh Currently the code you're posting simply ignores the signal, it doesn't establish a signal handler (`sighandler`) for it. Even if that were the case, I suspect it would continuously print "catching signal". – cnicutar Dec 10 '11 at 11:22
  • @cnicutar, Oh, then can you tell me how to add sighandler for that? Can you give me any link where I can get to know about that? – Dinesh Dec 10 '11 at 11:26
  • @Dinesh Just set `sighandler` as the handler. But for the fixing part, there's no general way to fix a segfault. Just a few `mmap` tricks. – cnicutar Dec 10 '11 at 11:30
  • My question is why doesn't the program just go into an infinite loop? Is SIGSEGV special, in that ignoring it is the same as setting the default action (kill process and do a core dump)? – Bogatyr Jan 23 '21 at 09:18
  • @Bogatyr: Yeah, seems to be. Single-stepping in GDB, the very first single-step of the load from address `0` raises SIGSEGV (using `stepi` to step by asm instructions). Whether that involved two page-faults or not IDK (e.g. return to user-space to retry even after an invalid page fault, then raise SIGSEGV after the second #PF hardware exception), but I'd guess not. `strace` says `rt_sigaction` to ignore the signal returned `0`, which means success. The man page says it can return EINVAL for signals that can't be ignored. – Peter Cordes Nov 04 '22 at 20:09
  • Anyway, this answer is right about what would hypothetically happen if you really could ignore SIGSEGV, but wrong about what happens if you try to do that in practice. – Peter Cordes Nov 04 '22 at 20:10

Another option is to bracket the risky operation with setjmp/longjmp, i.e.

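#include <setjmp.h>
#include <signal.h>
#include <stdio.h>

static jmp_buf jbuf;

/* Signal handlers installed with signal() take the signal number. */
static void catch_segv(int signo)
{
    (void)signo;
    longjmp(jbuf, 1);   /* abandon the faulting code, resume at setjmp() */
}

int main(void)
{
    int *p = NULL;

    signal(SIGSEGV, catch_segv);
    if (setjmp(jbuf) == 0) {
        printf("%d\n", *p);
    } else {
        printf("Ouch! I crashed!\n");
    }
    return 0;
}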

The setjmp/longjmp pattern here is similar to a try/catch block. It's very risky though, and won't save you if your risky function overruns the stack, or allocates resources but crashes before they're freed. Better to check your pointers and not indirect through bad ones.
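
A variant of the same pattern using sigsetjmp()/siglongjmp(), which also restore the signal mask so that a second fault is still caught, might look like the sketch below. It assumes a POSIX system where address 0 is unmapped; `try_read_int` and `on_segv` are hypothetical names, and the single static jump buffer means it is not thread-safe.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

static sigjmp_buf probe_env;

static void on_segv(int signo)
{
    (void)signo;
    siglongjmp(probe_env, 1);   /* also restores the saved signal mask */
}

/* Hypothetical helper: try to read *addr into *out.
 * Returns 0 on success, -1 if the read faulted or setup failed. */
int try_read_int(const int *addr, int *out)
{
    struct sigaction sa, old;
    int rc = -1;

    assert(out != NULL);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    /* Second argument 1: save the signal mask, so SIGSEGV is unblocked
     * again after the jump out of the handler. */
    if (sigsetjmp(probe_env, 1) == 0) {
        *out = *(const volatile int *)addr;   /* may fault */
        rc = 0;
    }

    sigaction(SIGSEGV, &old, NULL);   /* put the previous disposition back */
    return rc;
}
```

With this, probing a bad pointer twice in a row, e.g. two consecutive `try_read_int(NULL, &v)` calls, should report failure both times rather than crashing on the second. The caveats above about leaked resources and corrupted state still apply.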

  • As far as I can tell this doesn't work if you run into the segfault multiple times (the second time the process still segfaults). AFAIU `longjmp`/`setjmp` doesn't handle the signal context properly, and `sigsetjmp`/`siglongjmp` should be used instead. Cf "Notes" in https://linux.die.net/man/2/setcontext – mortenpi Apr 20 '17 at 02:55

Trying to ignore or handle a SIGSEGV is the wrong approach. A SIGSEGV triggered by your program always indicates a bug, either in your code or in code you delegate to. Once a bug has been triggered, anything could happen. There is no reasonable "clean-up" or fix action the signal handler can perform, because it cannot know where the signal was triggered or what action to perform. The best you can do is to let the program fail fast, so a programmer will have a chance to debug it while it is still in the immediate failure state, rather than have it (probably) fail later when the cause of the failure has been obscured. And you can cause the program to fail fast by not trying to ignore or handle the signal.

Raedwald
  • It's not *always* a bug. Some JVMs or Javascript engines will sometimes put the end of an array at the end of a page that's followed by an unmapped page, offloading bounds checking to the hardware. SIGSEGV means the guest Java or Javascript code took an array out-of-bounds exception, not that the JVM itself is buggy. But yes, outside of planned cases like this, it's a bug and should not be ignored. – Peter Cordes May 20 '19 at 20:06
  • Programs can also specifically mprotect() a range of memory and use a SIGSEGV handler to know that an address within the range was accessed. – Bogatyr Jan 22 '21 at 18:43
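
The mprotect() technique from the last comment can be sketched roughly as follows. This assumes Linux (other systems may raise SIGBUS instead), and all names (`touch_guarded_page`, `on_guard_fault`, `guarded_page`) are made up for the example. mprotect() is not on POSIX's list of async-signal-safe functions, although calling it from a handler is a common practice on Linux.

```c
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static char *guarded_page;
static size_t page_size;
static volatile sig_atomic_t fault_count;

/* Planned fault: note the access, then open the page so the retried
 * instruction succeeds. Anything outside our page is a real bug. */
static void on_guard_fault(int signo, siginfo_t *info, void *context)
{
    char *addr = (char *)info->si_addr;

    (void)signo;
    (void)context;
    if (addr < guarded_page || addr >= guarded_page + page_size)
        _exit(1);
    fault_count++;
    mprotect(guarded_page, page_size, PROT_READ | PROT_WRITE);
}

/* Hypothetical demo: returns how many faults two writes to a protected
 * page caused, or -1 on setup failure. */
int touch_guarded_page(void)
{
    struct sigaction sa, old;
    int faults;

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    guarded_page = mmap(NULL, page_size, PROT_NONE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (guarded_page == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_guard_fault;
    sa.sa_flags = SA_SIGINFO;   /* needed to receive si_addr */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0) {
        munmap(guarded_page, page_size);
        return -1;
    }

    fault_count = 0;
    ((volatile char *)guarded_page)[0] = 42;   /* first touch: faults */
    ((volatile char *)guarded_page)[1] = 43;   /* page now writable */
    assert(guarded_page[0] == 42 && guarded_page[1] == 43);

    faults = (int)fault_count;
    sigaction(SIGSEGV, &old, NULL);
    munmap(guarded_page, page_size);
    return faults;
}
```

Only the first write should reach the handler; the second hits a page that is already writable. This is the "fix the memory layout" option in miniature, and it is safe only because the fault is planned and the handler checks that the address is one it owns.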