I'm learning about signals. I've written a simple program to handle a segmentation fault:

#include <signal.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

volatile char *p = NULL;

void segv_handler(int signum) {
    char text[100];
    sprintf(text, "Segv handler! p: %p\n", p);
    if (p == NULL)
        p = malloc(sizeof p);
    write(STDOUT_FILENO, text, strlen(text));
}

int main() {
    signal(SIGSEGV, segv_handler);

    *p = 1;
}

I imagined that after the failed dereference of the pointer p, my handler would allocate memory for p and the re-execution of *p = 1 would then succeed. Instead, the segfault repeats indefinitely, even after memory has been allocated for p.

I looked at the assembly generated by the compiler for the main function and it looks like this:

main:
.LFB7:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    leaq    segv_handler(%rip), %rax
    movq    %rax, %rsi
    movl    $11, %edi
    call    signal@PLT
    movq    p(%rip), %rax
    movb    $1, (%rax)
    movl    $0, %eax
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc

As you can see, the assembly instruction that causes the segfault is movb $1, (%rax). Although allocating memory for p worked as planned, the instruction does not use the new value of p: the old (null) pointer is still in %rax, so the segfault happens again.

This raises my question: can we somehow correct the code so it works as planned?

  • Once memory is corrupted, the system becomes unstable. You can't gracefully recover from SIGSEGV – Eugene Sh. Oct 19 '21 at 17:06
  • Also note that there is a limited set of [POSIX calls](https://man7.org/linux/man-pages/man7/signal-safety.7.html) that can be made from within a signal handler -- `malloc` isn't one of them. – G.M. Oct 19 '21 at 17:12
  • The program triggers a segfault only after having read the value of `p`. Even if you afterward successfully allocate memory for `p` to point to, it's too late. Resuming at the point of the segfault should do exactly what you observe -- attempt the failed store operation again, *not* retry the whole statement from the beginning. – John Bollinger Oct 19 '21 at 17:17
  • If the intent is to find memory errors in your code, that's what valgrind is for. – dbush Oct 19 '21 at 17:23
  • At the C-language level, attempting to dereference an invalid pointer produces undefined behavior. This UB is a characteristic of a whole execution of the program, not merely of a single operation, so once you have UB, all bets are off. – John Bollinger Oct 19 '21 at 17:24
  • Did you try to make the **pointer** `volatile`? As it is currently, just the referenced character is volatile. – the busybee Oct 19 '21 at 17:50
  • @thebusybee Even using `volatile` would not change a thing here. – fuz Oct 19 '21 at 17:59
  • You can't allocate memory in a signal handler by calling `malloc`, but if you're [Steve Bourne](https://en.wikipedia.org/wiki/Stephen_R._Bourne), you can [do so by calling brk](https://www.in-ulm.de/~mascheck/bourne/segv.html). :-) – Steve Summit Oct 19 '21 at 18:13
  • @fuz Since we don't know anything about the target system and its compiler, we cannot presume how it would behave. UB can be consistent in that system, including repeating the failed instruction. And a little `volatile` might help to re-fetch the pointer. But generally you're right, as the others. – the busybee Oct 19 '21 at 19:08
  • @SteveSummit *you can do so by calling `brk`.* In case anyone seriously thinks they can do that: not any more. [Neither `brk()` nor `sbrk()` are async-signal-safe](https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03). Never mind what that would do to any `malloc()` implementation that uses `brk()` under the covers - that would be like treating a bleeding finger with a tourniquet. Around your neck. – Andrew Henle Oct 19 '21 at 19:17
  • @thebusybee We know a lot about the target system, like that it is based on the x86 architecture. On x86, no double-indirect addressing modes exist. So to load a pointer from a static variable and then dereference it, two instructions are needed. The second instruction is the one to fault in this case and when it re-runs due to a return from the `SIGSEGV` handler, it'll still have the old value of the pointer, causing another segmentation fault. No compiler can change this. – fuz Oct 19 '21 at 19:39
  • For a signal handler to fix this, it would need to inspect the code that led up to the SIGSEGV and figure out which register to change. Or in a debug build (where every statement compiles to a single block of asm, not mixed with asm from other statements), parse DWARF debug info to find the code address for the start of the block for the current statement, and reset the signal-resume address to that. (And hope that there hadn't already been any side-effects stored back to memory as part of this block, or updates to `register` variables, which yes are a thing for `-O0` debug mode.) – Peter Cordes Oct 20 '21 at 03:16
  • @fuz But we don't know anything more than x86. The underlying system and/or the compiler can modify the return address or do anything else like inspecting the failed instruction and repeating it. I have seen this already in my 40+ years of embedded programming. – the busybee Oct 20 '21 at 05:45
  • @thebusybee Given that OP is (obviously) programming against the UNIX API and that there is no code to modify return addresses or similar in there, this seems very unlikely. – fuz Oct 20 '21 at 08:27
  • @SteveSummit: The smart thing to do for more memory would be `mmap(MAP_ANONYMOUS)` to get a fresh page from the kernel, totally unrelated to any other state of anything. – Peter Cordes Jul 21 '22 at 15:25
  • Of course in general, SIGSEGV is usually not recoverable: see [Why is a segmentation fault not recoverable?](https://stackoverflow.com/a/70270762) (and other answers on that question). In a toy example where you didn't expect the pointer to be pointing to some existing data structure (but dereffed anyway), yeah you could allocate. But in general, that will just corrupt your program state more, if a pointer was supposed to be pointing to something other than a fresh page of zeroed memory, or whatever malloc gives you. – Peter Cordes Jul 21 '22 at 15:26
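
The point made in the comments above (the faulting store is retried with whatever is still in the register, so the handler would have to change that register) can be illustrated with a minimal sketch. It assumes Linux on x86-64 with glibc or musl, and it is not a general recovery technique. The names `segv_patch_rax`, `store_one_via_rax`, `fresh_page`, `RAX_INDEX` and `patched_store_demo` are hypothetical. The store is forced through `%rax` with inline assembly so that the handler knows which register to patch; for compiler-generated code it would first have to work that out. `mmap` is not on the POSIX async-signal-safe list, so calling it in the handler relies on it being a thin system-call wrapper on Linux.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* asks glibc for REG_RAX */
#endif
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <ucontext.h>
#include <unistd.h>

#if !defined(__linux__) || !defined(__x86_64__)
#error "This sketch assumes Linux on x86-64"
#endif

#ifdef REG_RAX
#define RAX_INDEX REG_RAX
#else
#define RAX_INDEX 13           /* assumption: glibc/musl x86-64 gregs layout */
#endif

/* Hypothetical global: the page mapped by the handler, NULL until then. */
static volatile char *volatile fresh_page = NULL;

/* SA_SIGINFO handler: map a zeroed page and point the saved RAX at it. */
static void segv_patch_rax(int signum, siginfo_t *info, void *context) {
    (void)signum;
    (void)info;
    ucontext_t *uc = context;
    void *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        _exit(1);              /* returning would just fault again */
    fresh_page = page;
    /* The kernel reloads the registers from this context on return. */
    uc->uc_mcontext.gregs[RAX_INDEX] = (greg_t)(uintptr_t)page;
}

/* Store 1 through RAX, so the handler knows which register to patch. */
static void store_one_via_rax(volatile char *ptr) {
    __asm__ volatile("movb $1, (%%rax)" : : "a"(ptr) : "memory");
}

/* Faults once on a null pointer, then returns the byte that ended up
 * in the fresh page (or -1 if the handler could not be installed). */
int patched_store_demo(void) {
    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = segv_patch_rax;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;
    store_one_via_rax(NULL);   /* faults, handler patches RAX, store retried */
    sigaction(SIGSEGV, &old, NULL);
    return fresh_page != NULL ? fresh_page[0] : -1;
}
```

Calling `patched_store_demo()` from `main` exercises the whole path. Note that only the saved register is changed: a pointer variable in memory, such as `p` in the question, would still hold its old value unless the handler updated it as well.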

1 Answer

To do this, you would have to disassemble the code around the fault and move the instruction pointer back to the start of the instructions that implement the C statement, rather than merely re-executing the faulting assembly instruction.

MaxFragg
  • C doesn't have *instructions*. You mean back up to the start of the block of asm instructions that implement that C *statement*. (i.e. to load the pointer from static storage (`movq p(%rip), %rax`), not just the instruction that tries to deref it once it's already in a register (`movb $1, (%rax)`).) Yes, that's true, but x86-64 machine code doesn't unambiguously disassemble backwards ([Is it possible to decode x86-64 instructions in reverse?](https://stackoverflow.com/q/52415761)). – Peter Cordes Jul 21 '22 at 15:32
  • As discussed in comments under the question, you'd need debug info (and a debug build so you know that all asm instructions for a C statement are part of one contiguous block, not mixed with other stuff), and would have to know that the C statement doesn't have other side-effects. Without full debug info (line number to machine-code address mappings), you wouldn't know which instruction was the start of a statement or not, unless you bake in more knowledge about the example into your signal handler. (e.g. disassemble from the previous symbol, the start of the function, until a `mov` load.) – Peter Cordes Jul 21 '22 at 15:33
  • And this would only work for toy examples like this where you intentionally segfault, not where you expected a pointer to be pointing at some existing data-structure consistent with other pointers and data. ([Why is a segmentation fault not recoverable?](https://stackoverflow.com/a/70270762)) - in general you'd just be corrupting things more. – Peter Cordes Jul 21 '22 at 15:35