Coming back to life after Segmentation Violation

Question

Is it possible to restore the normal execution flow of a C program after a segmentation fault?

struct A {
    int x;
};
struct A *a = 0;

a->x = 123; // this is where segmentation violation occurs

// after handling the error I want to get back here:
printf("normal execution");
// the rest of my source code....

I want a mechanism similar to NullPointerException that is present in Java, C# etc.

Note: Please don't tell me that there is an exception handling mechanism in C++, because I know that, and don't tell me I should check every pointer before assignment etc.

What I really want to achieve is to get back to the normal execution flow, as in the example above. I know some actions can be taken using POSIX signals. What should that look like? Any other ideas?

Where do you want to return control to after you got a "nullpointer exception" ? — nos, Jul 20 '10 at 15:27
it would be best to return control to the first instruction that follows the one that caused the sigsegv, but any other "safe" place is appreciated. — Marc Andreson, Jul 20 '10 at 15:39
There is **no** "normal execution flow" after SIGSEGV. The fault might have happened in something like `if (*p > 12)` and since *p is undefined, your program will continue to run as if you had written `if (random() & 1 == 0)` — zvrba, Jul 20 '10 at 15:57
For those of you who say my idea is illogical: the situation is as follows. I've got a loop in which I execute many instructions nested in functions that I cannot rewrite. After each loop iteration my program must print the result. If the error occurs, I write the default result rather than the one computed by the functions. For me it makes sense. — Marc Andreson, Jul 20 '10 at 16:05
Why would you make such an error happen? Error checking, API return value etc can help avoid these things. — Praveen S, Jul 20 '10 at 16:17
Make sure you're passing valid input to those functions and then they won't write to null pointers unless they have serious bugs... — R.. GitHub STOP HELPING ICE, Jul 20 '10 at 16:17
It really does not make sense at all trying to accomplish this in C. — nos, Jul 20 '10 at 16:19
they do have serious bugs and are compiled (no source code) ;] and thus are error-prone — Marc Andreson, Jul 20 '10 at 16:19
You might look into structured exception handling (SEH) on Windows, it allows you to catch hardware faults. POSIX has no equivalent facility. (signal handlers are valid for the whole program, whereas SEH is restricted to program blocks where you use it.) — zvrba, Jul 20 '10 at 16:22
@Marc, the reason you shouldn't do this is that you have literally *no idea* what the state of your program is after a segfault. Your stack may be corrupted; your heap may be corrupted; your global variables may be squashed; your instruction pointer may be in neverland; your secure data may be flying over the internets; your CPU might be on fire. You will regret trying to continue after a segfault as if nothing has happened. The right way to handle calling buggy third-party code is to spawn a new process to call it--a lot more work, but the only thing remotely reliable. — JSBձոգչ, Jul 20 '10 at 18:49
I guess you are hoping the error inside the "functions you can't rewrite" is of little or no consequence to your own code. The problem with that reasoning is that, as you said, you don't have the sources. For all you know, they could be writing inside your own allocated memory, erasing your own data or even your stack. If this happens, even using your own handlers won't get your process memory back to its correct state. Thus, ignoring the segmentation fault produced by **their** code will just delay things until the next one is raised from **your** code... I'm so happy I'm not in your situation... — paercebal, Jul 20 '10 at 20:10

nos · Accepted Answer · 2010-07-20T15:50:31.947

#include <unistd.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <signal.h>
#include <stdlib.h>
#include <ucontext.h>

void safe_func(void)
{
    puts("Safe now ?");
    exit(0); //can't return to main, it's where the segfault occurred.
}

void
handler (int cause, siginfo_t * info, void *uap)
{
  /* For testing only. Never call stdio functions in a signal handler otherwise. */
  printf ("SIGSEGV raised at address %p\n", info->si_addr);
  ucontext_t *context = uap;
  /* On my particular system, compiled with gcc -O2, the offending instruction
  generated for "*f = 16;" is 6 bytes. Let's try to set the instruction
  pointer to the next instruction (general register 14 is EIP on Linux x86). */
  context->uc_mcontext.gregs[14] += 6;
  //alternatively, try to jump to a "safe place"
  //context->uc_mcontext.gregs[14] = (unsigned int)safe_func;
}

int
main (int argc, char *argv[])
{
  struct sigaction sa;
  sa.sa_sigaction = handler;
  int *f = NULL;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = SA_SIGINFO;
  if (sigaction (SIGSEGV, &sa, 0)) {
      perror ("sigaction");
      exit(1);
  }
  //cause a segfault
  *f = 16; 
  puts("Still Alive");
  return 0;
}

$ ./a.out
SIGSEGV raised at address (nil)
Still Alive

I would beat someone with a bat if I saw something like this in production code, though; it's an ugly, for-fun hack. You'll have no idea whether the segfault has corrupted some of your data, you'll have no sane way of recovering and knowing that everything is OK now, and there's no portable way of doing this. The only mildly sane thing you could do is try to log an error (use write() directly, not any of the stdio functions, which are not signal safe) and perhaps restart the program. For those cases you're much better off writing a supervisor process that monitors a child process's exit, logs it and starts a new child process.

You should modify either the EIP (x86 32-bit) or the RIP (x86_64) register. They are located at different offsets in uc_mcontext.gregs[]. So instead of gregs[14], use gregs[REG_RIP] under #if __WORDSIZE == 64, otherwise use gregs[REG_EIP] — Alexey Polonsky, May 11 '15 at 13:09
At least my GCC compiles the code `*f = 16; puts("Still Alive");` into the following assembly: `movl $0, 0` followed by `ud2`. So the compiler assumes that the null pointer write will go bad and never return. It even optimized out the `puts` entirely! That can be fixed by replacing `int *f = NULL;` with a function call that (unpredictably for the compiler) returns a null pointer. For example, use `int *f = (int *) fopen("/nonexisting/path", "bad-mode");` — Kai Petzke, Jan 04 '18 at 13:49

score 7 · Answer 2 · answered Jul 20 '10 at 15:09

You can catch segmentation faults using a signal handler, and decide to continue the execution of the program (at your own risk).

The signal name is SIGSEGV.

You will have to use the sigaction() function, from the signal.h header.

Basically, it works the following way:

struct sigaction sa1;
struct sigaction sa2;

sa1.sa_handler = your_handler_func;
sa1.sa_flags   = 0;
sigemptyset( &sa1.sa_mask );

sigaction( SIGSEGV, &sa1, &sa2 );

Here's the prototype of the handler function:

void your_handler_func( int id );

As you can see, you don't need to return. The program's execution will continue, unless you decide to stop it by yourself from the handler.

answered Jul 20 '10 at 15:09

Macmade

I know, but how should it look like? simply: void my_handler() { return; } ?? – Marc Andreson Jul 20 '10 at 15:10
Simply returning will not help; the same instruction will get executed again and crash again. A more reliable way to do things (but still ugly and not recommended!) is to `longjmp` (or better yet `siglongjmp`) out of the signal handler to a known-safe location. – R.. GitHub STOP HELPING ICE Jul 20 '10 at 15:19
@R. I thought whether the same insn or the next insn will get executed was CPU+OS dependant? At least, it's that way for SIGFPE IIRC. – ninjalj Jul 20 '10 at 15:24
@ninjalj, It's waaaaay out there in the realm of undefined behavior, but I was speaking of how it's done in practice. – R.. GitHub STOP HELPING ICE Jul 20 '10 at 16:01

score 3 · Answer 3 · answered Jul 20 '10 at 15:13

"All things are permissible, but not all are beneficial" - typically a segfault is game over for a good reason... A better idea than picking up where it was would be to keep your data persisted (database, or at least a file system) and enable it to pick up where it left off that way. This will give you much better data reliability all around.

answered Jul 20 '10 at 15:13

corsiKa

While this is true in general, there are some special cases where hacks for handling SIGSEGV are arguably worthwhile. One example I can think of is when an extremely tight, performance-critical loop is using numbers pulled from a potentially-untrusted source to be used as write indices and can't afford to do bounds-checking. As long as you can bound the range of potential out-of-bounds writes, you could `mmap` the data and map read-only pages adjacent to it, and let the cpu catch and report out-of-bound writes for you. Yes it's ugly but I know MPlayer's libmpeg2 variant once did this. :-) – R.. GitHub STOP HELPING ICE Jul 20 '10 at 16:05
I'll certainly concede that point, but you're right, it is ugly! And if there's a bug it will be painful to fix. – corsiKa Jul 20 '10 at 16:23

ninjalj · Answer 4 · 2010-07-20T16:13:53.793

See R.'s comment on Macmade's answer.

Expanding on what he said (after handling SIGSEGV or, for that matter, SIGFPE, the CPU+OS can return you to the offending instruction), here is a test I have for division-by-zero handling:

#include <stdio.h>
#include <limits.h>
#include <string.h>
#include <signal.h>
#include <setjmp.h>

static jmp_buf  context;

static void sig_handler(int signo)
{
    /* XXX: don't do this, not reentrant */
    printf("Got SIGFPE\n");

    /* avoid infinite loop */
    longjmp(context, 1);
}

int main()
{
    int a;
    struct sigaction sa;

    memset(&sa, 0, sizeof(struct sigaction));
    sa.sa_handler = sig_handler;
    sa.sa_flags = SA_RESTART;
    sigaction(SIGFPE, &sa, NULL);

    if (setjmp(context)) {
            /* If this one was on setjmp's block,
             * it would need to be volatile, to
             * make sure the compiler reloads it.
             */
            sigset_t ss;

            /* Make sure to unblock SIGFPE, according to POSIX it
             * gets blocked when calling its signal handler.
             * sigsetjmp()/siglongjmp would make this unnecessary.
             */
            sigemptyset(&ss);
            sigaddset(&ss, SIGFPE);
            sigprocmask(SIG_UNBLOCK, &ss, NULL);

            goto skip;
    }

    a = 10 / 0;
skip:
    printf("Exiting\n");

    return 0;
}

`sigemptyset`, `sigprocmask`, etc. are not ISO C either. As soon as you get into fancy signal tricks you're way outside the realm of plain C and into POSIX so you might as well go ahead and use `sigsetjmp` and save yourself the trouble. — R.. GitHub STOP HELPING ICE, Jul 20 '10 at 16:08

JeremyP · Answer 5 · 2010-07-20T16:23:53.180

No, it's not possible, in any logical sense, to restore normal execution following a segmentation fault. Your program just tried to dereference a null pointer. How are you going to carry on as normal if something your program expects to be there isn't? It's a programming bug, the only safe thing to do is to exit.

Consider some of the possible causes of a segmentation fault:

you forgot to assign a legitimate value to a pointer
a pointer has been overwritten possibly because you are accessing heap memory you have freed
a bug has corrupted the heap
a bug has corrupted the stack
a malicious third party is attempting a buffer overflow exploit
malloc returned null because you have run out of memory

Only in the first case is there any kind of reasonable expectation that you might be able to carry on.

If you have a pointer that you want to dereference but it might legitimately be null, you must test it before attempting the dereference. I know you don't want me to tell you that, but it's the right answer, so tough.

Edit: here's an example to show why you definitely do not want to carry on with the next instruction after dereferencing a null pointer:

void foobarMyProcess(struct SomeStruct* structPtr)
{
    char* aBuffer = structPtr->aBigBufferWithLotsOfSpace; // if structPtr is NULL, will SIGSEGV
    //
    // if you SIGSEGV and come back to here, at this point aBuffer contains whatever garbage was in memory at the point
    // where the stack frame was created
    //
    strcpy(aBuffer, "Some longish string");  // You've just written the string to some random location in your address space
                                             // good luck with that!

}

edited Jul 20 '10 at 16:23

answered Jul 20 '10 at 15:53

JeremyP

+1 for calling the question out on wanting to do the wrong thing. :-) – R.. GitHub STOP HELPING ICE Jul 20 '10 at 16:09
thanks for the example, but will this copy operation be performed? I claim the OS will not allow this and send SIGSEGV instead. what is more, assignment is not the only operation that violates memory. reading memory does not cause such threat. – Marc Andreson Jul 20 '10 at 16:38
@MarcAnderson: that is not true. Read operations can also cause SIGSEGV. – Michael Foukarakis Jul 20 '10 at 19:02
did I say they don't cause SIGSEGV? The threat I mentioned applies to overwriting invalid memory location. – Marc Andreson Jul 20 '10 at 19:07
@MarcAnderson: the copy operation may or may not be performed. aBuffer will contain a random value after recovering from the first SIGSEGV. If that random value points to protected memory e.g. is 0 or points into a read only segment, it will raise another SIGSEGV. If, however it happens to point into the stack, you'll overwrite a stack frame or two or if it points into the heap, you'll corrupt the heap. – JeremyP Jul 21 '10 at 07:33
In fact, the worst case scenario is if it points into the middle of some data e.g. the text of a document, in which case it will just alter that data without necessarily flagging any kind of error. – JeremyP Jul 21 '10 at 07:35

score 1 · Answer 6 · answered Jul 20 '10 at 15:11

Call this, and when a segfault occurs, your code will execute segv_handler and then continue back to where it was.

void segv_handler(int signo)
{
  (void)signo; // unused
  // Do what you want here
}

signal(SIGSEGV, segv_handler);

answered Jul 20 '10 at 15:11

Scharron

can i leave the body empty? will it return back to the place where error occured? – Marc Andreson Jul 20 '10 at 15:12
I just edited to add comments. Yes it can be empty, and will return to where the exception occurred. – Scharron Jul 20 '10 at 15:13
It will continue back where it was, which is the instruction causing the segfault, not *after* the cause of the segfault. So the segfault will occur again, and your handler will be called again, and so on. – nos Jul 20 '10 at 15:16
oh yes ... Thus, don't segfault and it should be ok ;-) – Scharron Jul 20 '10 at 15:29

score 1 · Answer 7 · answered Jul 20 '10 at 19:09

There is no meaningful way to recover from a SIGSEGV unless you know EXACTLY what caused it, and there's no way to do that in standard C. It may be possible (conceivably) in an instrumented environment, like a C-VM (?). The same is true for all program error signals; if you try to block/ignore them, or establish handlers that return normally, your program will probably break horribly when they happen unless perhaps they're generated by raise or kill.

Just do yourself a favour and take error cases into account.

score 0 · Answer 8 · answered Jul 16 '13 at 17:30

Unfortunately, you can't in this case. The buggy function has undefined behavior and could have corrupted your program's state.

What you CAN do is run the functions in a new process. If this process dies with a return code that indicates SIGSEGV, you know it has failed.
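One way the separate-process idea might look, assuming a POSIX system and a function whose result fits in an `int`. The names `call_isolated` and `flaky_compute` are hypothetical. The child runs the function and sends the result back through a pipe; if the child is killed by a signal or exits abnormally, the parent uses the fallback value instead.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs fn(arg) in a child process. Returns its result, or fallback if the
 * child crashed (for example with SIGSEGV) or could not report a result. */
int call_isolated(int (*fn)(int), int arg, int fallback)
{
    int fds[2];
    if (pipe(fds) != 0)
        return fallback;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return fallback;
    }

    if (pid == 0) {                       /* child: run the risky code */
        close(fds[0]);
        int value = fn(arg);
        ssize_t n = write(fds[1], &value, sizeof value);
        _exit(n == (ssize_t)sizeof value ? 0 : 1);
    }

    close(fds[1]);                        /* parent: collect the result */
    int value = fallback;
    ssize_t n = read(fds[0], &value, sizeof value);
    close(fds[0]);

    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return fallback;
    }

    /* A crash shows up as WIFSIGNALED(status), with WTERMSIG == SIGSEGV. */
    if (n != (ssize_t)sizeof value || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0)
        return fallback;
    return value;
}

/* Hypothetical stand-in for the buggy function: faults when x == 0. */
static int *volatile no_target = NULL;

int flaky_compute(int x)
{
    if (x == 0) {
        int *p = no_target;
        *p = 1;                           /* SIGSEGV in the child only */
    }
    return 2 * x;
}
```

Because the fault happens in the child's address space, the parent's memory is untouched, which is the property the in-process signal tricks cannot offer. The cost is one `fork` per call.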

You could also rewrite the functions yourself.

newlogic · Answer 9 · 2014-07-21T19:51:20.037

I can see a case for recovering from a segmentation violation: if you're handling events in a loop and one of these events causes a segmentation violation, then you would only want to skip over this event and continue processing the remaining events. In my eyes segmentation violations are much the same as NullPointerExceptions in Java. Yes, the state will be inconsistent and unknown after either of these; however, in some cases you would like to handle the situation and carry on. For instance, in algo trading you would pause the execution of an order and allow a trader to manually take over, without crashing the entire system and ruining all other orders.

score 0 · Answer 10 · answered Jul 20 '10 at 15:11

In POSIX, your process will get sent SIGSEGV when you do that. The default handler just crashes your program. You can add your own handler using the signal() call. You can implement whatever behaviour you like by handling the signal yourself.

answered Jul 20 '10 at 15:11

Carl Norum

score 0 · Answer 11 · answered Jul 20 '10 at 18:52

You can use the SetUnhandledExceptionFilter() function (on Windows), but even to be able to skip the "illegal" instruction you will need to be able to decode some assembler opcodes. And, as glowcoder said, even if it would "comment out" at runtime the instructions that generate segfaults, what will be left of the original program logic (if it may be called so)? Everything is possible, but that doesn't mean it has to be done.

score 0 · Answer 12 · answered Apr 09 '16 at 11:52

The best solution is to wrap each unsafe access this way:

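#include <iostream>
#include <cstdlib>
#include <signal.h>
#include <setjmp.h>
static jmp_buf buf;
static volatile int counter = 0;
void signal_handler(int)
{
     longjmp(buf, 1); // jump back to the setjmp() call in main()
}
int main()
{
    signal(SIGSEGV, signal_handler);
    setjmp(buf); // execution resumes here after the fault
    if (counter++ == 0) { // only attempt the bad write once
        *(volatile int*)(0x1215) = 10;  // write to an (almost certainly) unmapped address
    }
    std::cout << "i am alive !!" << std::endl; // reached whether or not the write faulted
    system("pause"); // Windows-only; remove on other systems
    return 0;
}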

With this, your program will not crash on almost any OS.

Praveen S · Answer 13 · 2010-07-20T15:35:57.390

This glibc manual gives you a clear picture of how to write signal handlers.

A signal handler is just a function that you compile together with the rest
of the program. Instead of directly invoking the function, you use signal 
or sigaction to tell the operating system to call it when a signal arrives.
This is known as establishing the handler.

In your case you will have to wait for the SIGSEGV indicating a segmentation fault. The list of other signals can be found here.

Signal handlers are broadly classified into two categories:

You can have the handler function note that the signal arrived by tweaking some global data structures, and then return normally.
You can have the handler function terminate the program or transfer control to a point where it can recover from the situation that caused the signal.

SIGSEGV falls under the program error signals.
