12

Let's say I have code that causes a segmentation fault.

char * ptr = NULL;
*ptr = "hello"; /* this will cause a segmentation fault */

How do I print, at runtime, the memory address at which the segmentation fault occurred and the reason for it (access to a forbidden memory region, or something else)?

I read about core dump files, but I'm not sure if it's the correct solution.

How can I do this?

P.S. I'm aware of the fact that I can achieve this with gdb or another debugger, but the goal is to do this using code, and only code.

Suvarna Pattayil
Lior
  • You could use the [`backtrace`](http://linux.die.net/man/3/backtrace) function. But I really recommend running your program in a debugger instead: it lets you not only see the backtrace, but also walk up the call stack and examine variables. – Some programmer dude Apr 15 '13 at 12:37
  • "read about core dump files" - I'd strongly recommend them. They dump everything in memory, and you can then open them with `gdb` and the matching executable. This gives you the chance to see exactly what happened (unless the memory is messed up, but that's a pretty rare case): any variables' values, the backtrace, threads, etc. (Of course, it helps to build with the maximum debug level and no optimizations for this type of investigation.) – Kiril Kirov Apr 15 '13 at 12:41
  • hmm.. the type of `*ptr` is `char`, but `"hello"`'s type is `char*`. you should probably assign a character (`*ptr = 'h';`) or use a `memmove()` or similar for the example to be correct. as it is, it takes the address of the string constant, casts it to integer, shaves it down to 1 byte, and then segfaults assigning it to `*ptr` – SingleNegationElimination Apr 15 '13 at 12:41
  • C/C++ is not like Java so you'll have to use gdb and analyze the dumped core. –  Apr 15 '13 at 12:47
  • Maybe we can do something in the handler for the `SIGSEGV` signal... – Sam Apr 15 '13 at 12:51
  • For the Love of God, please don't. What added value do you expect? Whatever you do will be much more limited compared to a coredump and a debugger. Besides, trying to "handle" such errors (even if only for diagnostic output) is a good way to get your process into even deeper trouble, likely even hiding the real error (seen that in production apps a couple of times). – Christian.K Apr 15 '13 at 12:52
  • @SAM Catching `SIGSEGV` is really tricky. When a segmentation fault occurs, your process's state is _very_ likely already garbled (for example heap manager internal data structures, etc.) and whatever you attempt to do is likely to make the situation worse. – Christian.K Apr 15 '13 at 12:55
  • While you might not be able to do this inside the code you can usually install a wrapper outside the program to check for SEGV. – Antti Rytsölä Apr 15 '13 at 12:57

2 Answers

5

If you want to know the cause you can register a signal handler, something like:

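#include <signal.h>
#include <stdio.h>
#include <string.h>

void handler(int signum, siginfo_t *info, void *context)
{
  struct sigaction action;

  (void)signum;
  (void)context;

  /* fprintf is not async-signal-safe; it is tolerable here only because
     the process is about to terminate anyway */
  fprintf(stderr, "Fault address: %p\n", info->si_addr);
  switch (info->si_code) {
  case SEGV_MAPERR:
    fprintf(stderr, "Address not mapped.\n");
    break;

  case SEGV_ACCERR:
    fprintf(stderr, "Access to this address is not allowed.\n");
    break;

  default:
    fprintf(stderr, "Unknown reason.\n");
    break;
  }

  /* unregister and let the default action occur; sa_mask is a sigset_t,
     so it is cleared with sigemptyset() rather than assigned 0 */
  memset(&action, 0, sizeof action);
  action.sa_handler = SIG_DFL;
  sigemptyset(&action.sa_mask);
  sigaction(SIGSEGV, &action, NULL);
}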

And then somewhere you need to register it:

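  struct sigaction action;

  memset(&action, 0, sizeof action);
  action.sa_sigaction = handler;  /* three-argument handler */
  action.sa_flags = SA_SIGINFO;   /* request the siginfo_t argument */
  sigemptyset(&action.sa_mask);   /* sa_mask is a sigset_t, not an integer */

  if (sigaction(SIGSEGV, &action, NULL) < 0) {
    perror("sigaction");
  }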

Basically you register a handler that runs when SIGSEGV is delivered, and you get some additional info. To quote the man page:

   The following values can be placed in si_code for a SIGSEGV signal:

       SEGV_MAPERR    address not mapped to object

       SEGV_ACCERR    invalid permissions for mapped object

These map to the two basic reasons for getting a seg fault: either the page you accessed wasn't mapped at all, or you weren't allowed to perform the operation you attempted on that page.
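To see the two codes side by side, here is a minimal sketch, assuming Linux or a similar POSIX system. The helper name `probe_write_fault` is made up for this illustration. It attempts a one-byte write and, instead of letting the process die, jumps out of the handler with `siglongjmp` so it can return the `si_code`. Recovering from SIGSEGV this way is only reasonable in a controlled experiment like this one, not as general error handling.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf probe_env;              /* where to resume after the fault */
static volatile sig_atomic_t probe_code;  /* si_code captured by the handler */

static void probe_handler(int signum, siginfo_t *info, void *context)
{
  (void)signum;
  (void)context;
  probe_code = info->si_code;
  siglongjmp(probe_env, 1);  /* skip the faulting store instead of retrying it */
}

/* Hypothetical helper: tries to write one byte to addr.
 * Returns the SIGSEGV si_code that the write triggered,
 * 0 if the write succeeded, or -1 if the handler could not be installed. */
int probe_write_fault(void *addr)
{
  struct sigaction action, previous;
  int result = 0;

  memset(&action, 0, sizeof action);
  action.sa_sigaction = probe_handler;
  action.sa_flags = SA_SIGINFO;
  sigemptyset(&action.sa_mask);
  if (sigaction(SIGSEGV, &action, &previous) < 0)
    return -1;

  if (sigsetjmp(probe_env, 1) == 0)
    *(volatile char *)addr = 'h';  /* may fault */
  else
    result = probe_code;           /* reached via siglongjmp from the handler */

  sigaction(SIGSEGV, &previous, NULL);  /* put the old disposition back */
  return result;
}
```

On a typical Linux system, `probe_write_fault(NULL)` should report `SEGV_MAPERR` (nothing is mapped at address 0), while `probe_write_fault((void *)"hello")` should report `SEGV_ACCERR` (string literals usually live in a read-only mapping).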

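In the `handler` function above, after the signal handler fires it unregisters itself and restores the default action. When the handler returns, the instruction that faulted is executed again, faults again, and this time takes the normal route (the default action, which terminates the process and writes a core dump if enabled). Re-running the faulting instruction is the normal behavior after a page fault (the precursor to getting a seg fault), which is what makes things like demand paging work.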
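Since `fprintf` is not on the list of async-signal-safe functions, a more cautious variant formats the address by hand and emits it with `write(2)`. The following is a sketch under that assumption, not a drop-in replacement: `format_hex`, `segv_reason` and `install_segv_reporter` are names invented here. It also uses `SA_RESETHAND`, which resets the disposition to the default when the handler is entered, so the handler does not need to call `sigaction` itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Formats value as "0x..." into buf without using stdio.
 * Returns the number of characters written (no NUL), or 0 if buf is too small. */
size_t format_hex(uintptr_t value, char *buf, size_t len)
{
  static const char digits[] = "0123456789abcdef";
  char tmp[2 * sizeof value];
  size_t n = 0, out = 0;

  do {                          /* collect digits, least significant first */
    tmp[n++] = digits[value & 0xf];
    value >>= 4;
  } while (value != 0);

  if (len < n + 2)
    return 0;
  buf[out++] = '0';
  buf[out++] = 'x';
  while (n > 0)                 /* emit digits in the right order */
    buf[out++] = tmp[--n];
  return out;
}

/* Maps a SIGSEGV si_code to a short description. */
const char *segv_reason(int code)
{
  switch (code) {
  case SEGV_MAPERR: return "address not mapped";
  case SEGV_ACCERR: return "invalid permissions for mapped object";
  default:          return "unknown reason";
  }
}

static void write_bytes(const char *s, size_t n)
{
  ssize_t ignored = write(STDERR_FILENO, s, n);  /* async-signal-safe */
  (void)ignored;
}

static void write_str(const char *s)
{
  size_t n = 0;
  while (s[n] != '\0')
    n++;
  write_bytes(s, n);
}

static void segv_reporter(int signum, siginfo_t *info, void *context)
{
  char addr[2 + 2 * sizeof(uintptr_t)];
  size_t n = format_hex((uintptr_t)info->si_addr, addr, sizeof addr);

  (void)signum;
  (void)context;
  write_str("Fault address: ");
  write_bytes(addr, n);
  write_str("\nReason: ");
  write_str(segv_reason(info->si_code));
  write_str("\n");
  /* SA_RESETHAND has already restored SIG_DFL: returning retries the
     faulting instruction, which then terminates the process as usual. */
}

/* Returns 0 on success, -1 on failure (as sigaction does). */
int install_segv_reporter(void)
{
  struct sigaction action;

  memset(&action, 0, sizeof action);
  action.sa_sigaction = segv_reporter;
  action.sa_flags = SA_SIGINFO | SA_RESETHAND;
  sigemptyset(&action.sa_mask);
  return sigaction(SIGSEGV, &action, NULL);
}
```

Calling `install_segv_reporter()` early in `main` is enough; the crash itself still goes through the default action afterwards, so a core dump is not lost.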

FatalError
2

As already answered here: How to generate a stacktrace when my gcc C++ app crashes

You can (in the case of GCC with Linux/BSD at least) do this fairly easily:

Example code:

#include <stdio.h>
#include <execinfo.h>
#include <signal.h>
#include <stdlib.h>


void handler(int sig) {
  void *array[10];
  size_t size;

  // get void*'s for all entries on the stack
  size = backtrace(array, 10);

  // print out all the frames to stderr
  fprintf(stderr, "Error: signal %d:\n", sig);
  backtrace_symbols_fd(array, size, 2);
  exit(1);
}

int main(int argc, char **argv) {
  signal(SIGSEGV, handler);   // install our handler

  char * ptr = NULL;
  *ptr = 'h'; /* this will cause a segmentation fault */
}

Example output:

# gcc -g -rdynamic -o test test.c
# ./test
Error: signal 11:
0   test                                0x000000010e99dcfa handler + 42
1   libsystem_c.dylib                   0x00007fff95c1194a _sigtramp + 26
2   ???                                 0x0000000000000000 0x0 + 0
3   libdyld.dylib                       0x00007fff8fa177e1 start + 0
4   ???                                 0x0000000000000001 0x0 + 1
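A variant that prints the backtrace but still lets the process die from the original fault might look like the sketch below. It assumes glibc's `<execinfo.h>`; the names `dump_backtrace` and `install_crash_handler` are invented for this example. Instead of calling `exit`, the handler simply returns: because it was installed with `SA_RESETHAND`, the default action is already back in place, so the retried instruction should crash the usual way and can still produce a core dump.

```c
#define _POSIX_C_SOURCE 200809L
#include <execinfo.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

#define MAX_FRAMES 32

/* Writes the current call stack to fd and returns the number of frames.
 * backtrace_symbols_fd() writes directly to the descriptor without malloc. */
int dump_backtrace(int fd)
{
  void *frames[MAX_FRAMES];
  int count = backtrace(frames, MAX_FRAMES);

  backtrace_symbols_fd(frames, count, fd);
  return count;
}

static void crash_handler(int sig)
{
  static const char msg[] = "Fatal signal caught, backtrace:\n";
  ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);

  (void)ignored;
  (void)sig;
  dump_backtrace(STDERR_FILENO);
  /* No exit() here: SA_RESETHAND restored the default action on entry,
     so returning lets the original fault terminate the process. */
}

/* Returns 0 on success, -1 on failure (as sigaction does). */
int install_crash_handler(void)
{
  struct sigaction action;
  void *warmup[1];

  /* The first backtrace() call may load a support library and allocate
     memory, so trigger that now rather than inside the signal handler. */
  backtrace(warmup, 1);

  memset(&action, 0, sizeof action);
  action.sa_handler = crash_handler;
  action.sa_flags = SA_RESETHAND;
  sigemptyset(&action.sa_mask);
  return sigaction(SIGSEGV, &action, NULL);
}
```

As in the example above, building with `-g -rdynamic` helps `backtrace_symbols_fd` resolve function names.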
Community
Wolph
  • This will not produce a core dump anymore, at least not one with the original cause of the error. Fine, you have some sort of stacktrace, but with much less value. No offence, you were only answering the OP's question, but warnings do apply anyway ;-). Also, consider using a combination of `sprintf` and `write(2, buf, ...)` to avoid `fprintf`, which could be tricky in a signal handler (well, at least for async signals). Also, I wouldn't call `exit()` but only `_exit()` or `abort()`. But YMMV. – Christian.K Apr 15 '13 at 12:59
  • I agree with you Christian, arguably you could enable this for certain builds if you _really_ don't want to send a core dump. But personally I would just wrap the executable in a script that lets gdb (if available) show the stacktrace from the generated core dump... Seems a much more sensible solution. – Wolph Apr 15 '13 at 13:16
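If you go the core dump route suggested in the comments, the program can at least make sure core files are not disabled for itself. This is a small sketch, assuming a POSIX system; `enable_core_dumps` is a name made up here. It raises the soft `RLIMIT_CORE` limit to the hard limit. Whether and where a core file is actually written still depends on the system configuration (on Linux, for example, on `/proc/sys/kernel/core_pattern`), and a hard limit of 0 cannot be raised by an unprivileged process.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/resource.h>

/* Raises the soft core-file size limit to the hard limit.
 * Returns 0 on success, -1 on failure. */
int enable_core_dumps(void)
{
  struct rlimit limit;

  if (getrlimit(RLIMIT_CORE, &limit) < 0)
    return -1;
  limit.rlim_cur = limit.rlim_max;  /* the soft limit may go up to the hard limit */
  return setrlimit(RLIMIT_CORE, &limit);
}
```

Calling this once at startup has the same effect for the process as running `ulimit -c` with the hard limit in the shell before launching it; the resulting core file can then be opened with `gdb` together with the executable.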