Segmentation fault when calling a function located in the heap

Question

I'm trying to tweak the rules a little bit here, and malloc a buffer, then copy a function to the buffer.

Calling the buffered function works, but the function throws a Segmentation fault when i'm trying to call another function within.

Any thoughts why?

#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>
#include <stdlib.h>

int foo(int x)
{
    printf("%d\n", x);
}

int bar(int x)
{
}

int main()
{
    int foo_size = bar - foo;

    void* buf_ptr;

    buf_ptr = malloc(1024);

    memcpy(buf_ptr, foo, foo_size);

    mprotect((void*)(((int)buf_ptr) & ~(sysconf(_SC_PAGE_SIZE) - 1)),
             sysconf(_SC_PAGE_SIZE),
             PROT_READ|PROT_WRITE|PROT_EXEC);

    int (*ptr)(int) = buf_ptr;

    printf("%d\n", ptr(3));

    return 0;
}

This code will throw a segfault, unless i'll change the foo function to:

int foo(int x)
{
    //Anything but calling another function.
    x = 4;
    return x;
}

NOTE:

The code successfully copies foo into the buffer, i know i made some assumptions, but on my platform they're ok.

Too many assumptions and not enough error checking - e.g. you don't seem to be checking whether `foo_size` is less than or equal to 1024 (or even positive) ? — Paul R, May 11 '16 at 07:28
@LPs Assumptions are made because i've checked them. bar > foo on my platform, and i don't know why i should check the equality to page size. — Delights, May 11 '16 at 07:32
I think you can use your code with Position Independent Code due to the function relocation that your code is performing. This due to relative jump to `printf`. — LPs, May 11 '16 at 07:32
It's probably because your code is not position independent. Try to step through the assembly code with your debugger and you'll see what happens. BTW what you try to do is probably undefined behaviour. — Jabberwocky, May 11 '16 at 07:39
@MichaelWalz How come it's undefined? the only assumption i made here is the location of 'bar'. Also, i did try to read the assembly output of this code, it seems that the call to printf is made relative to 'foo'. But it's still an address, so i don't see why this is a problem. 'call foo + X', 'foo' shall turn into a constant address, and X is a constant. — Delights, May 11 '16 at 07:42
For what it's worth, neither version of the code runs on Clang on OS X. And by the way you are missing return statements in `foo()` and `bar()`. — John Zwinck, May 11 '16 at 07:52
@Medals maybe on your platform it's not UB but generally it is. If the call is relative to foo, then the code of foo is not position independant. The call address is relative to the original foo function, but as you simply copy the code of the original foo to the buffer, the relative address of printf will be wrong in the copied foo function. — Jabberwocky, May 11 '16 at 07:56
@Medals Subtraction two pointers that don't point to the same array object is not defined. This is undefined behavior: `int foo_size = bar - foo;` — 2501, May 11 '16 at 09:31
@2501 You're right. my bad. should've been: `(char*)bar - (char*)foo` — Delights, May 11 '16 at 09:55
@Medals Casts don't change the fact the pointers don't point to the same object. — 2501, May 11 '16 at 09:58
@2501 I'm casting these pointers to 'point' to the same objects. Since subtracting them gives me the number of objects between them, and `sizeof(char) == 1`, this is exactly what i need. I think it's quite defined. — Delights, May 11 '16 at 10:09
@Medals *I'm casting these pointers to 'point' to the same objects.* That makes no sense whatsoever. — 2501, May 11 '16 at 10:11
Why do you expect that once you've moved `foo` that the resultant address that the compiler assumes to be `printf` the same? Let's use some number shall we? Assume `printf` is at 100, and `foo` is at 200 the compiled code calls a function at -100 (200-100 = 100) so it calls printf. Now you move `foo` to 800. Since you haven't recompiled the binary (haven't change the instruction to call printf at -100) then it will still call -100 which is now 700. The best case scenario is you get a segfault. The worst case scenario is if you actually have a function defined at 700 and that gets called — slebetman, May 11 '16 at 10:12
@slebetman That's the case when it's a relative call. learned my lesson. — Delights, May 11 '16 at 10:15
@Medals As I said, casts don't and cannot change the fact the pointers don't point to the same object. Do some research instead of asserting it is defined, try to prove yourself wrong, etc... — 2501, May 11 '16 at 10:28
If you want to manipulate assembly code, you should write in assembly language, not in C. — Marc van Leeuwen, May 11 '16 at 15:37

Art · Answer 1 · 2016-05-11T09:46:37.020

Your code is not position independent and even if it were, you don't have the correct relocations to move it to an arbitrary position. Your call to printf (or any other function) will be done with pc-relative addressing (through the PLT, but that's besides the point here). This means that the instruction generated to call printf isn't a call to a static address but rather "call the function X bytes from the current instruction pointer". Since you moved the code the call is done to a bad address. (I'm assuming i386 or amd64 here, but generally it's a safe assumption, people who are on weird platforms usually mention that).

More specifically, x86 has two different instructions for function calls. One is a call relative to the instruction pointer which determines the destination of the function call by adding a value to the current instruction pointer. This is the most commonly used function call. The second instruction is a call to a pointer inside a register or memory location. This is much less commonly used by compilers because it requires more memory indirections and stalls the pipeline. The way shared libraries are implemented (your call to printf will actually go to a shared library) is that for every function call you make outside of your own code the compiler will insert fake functions near your code (this is the PLT I mentioned above). Your code does a normal pc-relative call to this fake function and the fake function will find the real address to printf and call that. It doesn't really matter though. Almost any normal function call you make will be pc-relative and will fail. Your only hope in code like this are function pointers.

You might also run into some restrictions on executable mprotect. Check the return value of mprotect, on my system your code doesn't work for one more reason: mprotect doesn't allow me to do this. Probably because the backend memory allocator of malloc has additional restrictions that prevents executable protections of its memory. Which leads me to the next point:

You will break things by calling mprotect on memory that isn't managed by you. That includes memory you got from malloc. You should only mprotect things you've gotten from the kernel yourself through mmap.

Here's a version that demonstrates how to make this work (on my system):

#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>
#include <string.h>
#include <err.h>

int
foo(int x, int (*fn)(const char *, ...))
{
        fn("%d\n", x);
        return 42;
}

int
bar(int x)
{
        return 0;
}

int
main(int argc, char **argv)
{
        size_t foo_size = (char *)bar - (char *)foo;
        int ps = getpagesize();

        void *buf_ptr = mmap(NULL, ps, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_ANON|MAP_PRIVATE, -1, 0);

        if (buf_ptr == MAP_FAILED)
                err(1, "mmap");

        memcpy(buf_ptr, foo, foo_size);

        int (*ptr)(int, int (*)(const char *, ...)) = buf_ptr;

        printf("%d\n", ptr(3, printf));

        return 0;
}

Here, I abuse the knowledge of how the compiler will generate the code for the function call. By using a function pointer I force it to generate a call instruction that isn't pc-relative. Also, I manage the memory allocation myself so that we get the right permissions from start and not run into any restrictions that brk might have. As a bonus we do error handling that actually helped me find a bug in the first version of this experiment and I also corrected other minor bugs (like missing includes) which allowed me to enable warnings in the compiler and catch another potential problem.

If you want to dig deeper into this you can do something like this. I added two versions of the function:

int
oldfoo(int x)
{
        printf("%d\n", x);
        return 42;
}

int
foo(int x, int (*fn)(const char *, ...))
{
        fn("%d\n", x);
        return 42;
}

Compile the whole thing and disassemble it:

$ cc -Wall -o foo foo.c
$ objdump -S foo | less

We can now look at the two generated functions:

0000000000400680 <oldfoo>:
  400680:       55                      push   %rbp
  400681:       48 89 e5                mov    %rsp,%rbp
  400684:       48 83 ec 10             sub    $0x10,%rsp
  400688:       89 7d fc                mov    %edi,-0x4(%rbp)
  40068b:       8b 45 fc                mov    -0x4(%rbp),%eax
  40068e:       89 c6                   mov    %eax,%esi
  400690:       bf 30 08 40 00          mov    $0x400830,%edi
  400695:       b8 00 00 00 00          mov    $0x0,%eax
  40069a:       e8 91 fe ff ff          callq  400530 <printf@plt>
  40069f:       b8 2a 00 00 00          mov    $0x2a,%eax
  4006a4:       c9                      leaveq
  4006a5:       c3                      retq

00000000004006a6 <foo>:
  4006a6:       55                      push   %rbp
  4006a7:       48 89 e5                mov    %rsp,%rbp
  4006aa:       48 83 ec 10             sub    $0x10,%rsp
  4006ae:       89 7d fc                mov    %edi,-0x4(%rbp)
  4006b1:       48 89 75 f0             mov    %rsi,-0x10(%rbp)
  4006b5:       8b 45 fc                mov    -0x4(%rbp),%eax
  4006b8:       48 8b 55 f0             mov    -0x10(%rbp),%rdx
  4006bc:       89 c6                   mov    %eax,%esi
  4006be:       bf 30 08 40 00          mov    $0x400830,%edi
  4006c3:       b8 00 00 00 00          mov    $0x0,%eax
  4006c8:       ff d2                   callq  *%rdx
  4006ca:       b8 2a 00 00 00          mov    $0x2a,%eax
  4006cf:       c9                      leaveq
  4006d0:       c3                      retq

The instruction for the function call in the printf case is "e8 91 fe ff ff". This is a pc-relative function call. 0xfffffe91 bytes in front of our instruction pointer. It's treated as a signed 32 bit value, and the instruction pointer used in the calculation is the address of the next instruction. So 0x40069f (next instruction) - 0x16f (0xfffffe91 in front is 0x16f bytes behind with signed math) gives us the address 0x400530, and looking at the disassembled code I find this at the address:

0000000000400530 <printf@plt>:
  400530:       ff 25 ea 0a 20 00       jmpq   *0x200aea(%rip)        # 601020 <_GLOBAL_OFFSET_TABLE_+0x20>
  400536:       68 01 00 00 00          pushq  $0x1
  40053b:       e9 d0 ff ff ff          jmpq   400510 <_init+0x28>

This is the magic "fake function" I mentioned earlier. Let's not get into how this works. It's necessary for shared libraries to work and that's all we need to know for now.

The second function generates the function call instruction "ff d2". This means "call the function at the address stored inside the rdx register". No pc-relative addressing and that's why it works.

Well this code is made for fun and only fun, so checking function return values wasn't the first thing on my mind, since it worked on my system. But i got it, thanks. — Delights, May 11 '16 at 08:07
It's possible that `mprotect()` doesn't work on your system because of W^X: https://en.wikipedia.org/wiki/W%5EX — Dietrich Epp, May 11 '16 at 08:12
@Medals IMO checking for errors makes things less frustrating to debug and more fun. — Art, May 11 '16 at 08:19
@DietrichEpp As Theo (the guy who invented the term) has said many times, even quite recently, w^x can't be made mandatory yet. A lot of prominent programs (like firefox for example) will fail if we actually start enforcing it strictly. Currently w^x is just a policy of "don't write code that does this", rather than "the kernel will stop you from doing it". I suspect Linux has just started enforcing maximum protections on the heap we get from `brk` (which should be safe to do). — Art, May 11 '16 at 08:26
On Linux, you should be able to `mprotect()` the result of `brk()` just fine. I'm not aware of any restrictions. What is the error code? Note that the code in the OP shows incorrect usage of `mprotect()`. — Dietrich Epp, May 11 '16 at 09:00
@DietrichEpp Are you sure? I get `EACCESS` on a correct attempt to `mprotect(PROT_EXEC)` memory from `brk` while it works on stuff I get from `mmap`. `mprotect(PROT_READ)` on memory from `brk` works. — Art, May 11 '16 at 09:18
@DietrichEpp see: https://gist.github.com/art4711/181dc7683c757df1874f0dbe7ad20c4f — Art, May 11 '16 at 09:20
I think the x86 is a weird platform. Common, perhaps, but nonetheless weird in instructions and addressing. — mpez0, May 11 '16 at 13:59
Any way I look at it, the computation of foo_size is totally unreliable. What prevents the compiler from e.g. putting bar before foo? — Emil Jeřábek, May 11 '16 at 14:24
@mpez0 Maybe, but not in this case. Pc-relative addressing is done by almost everyone, it's much better than loading constants from tables and stalling on that. And much much better than text modification that would be necessary for PIC relocation with call instructions to constants. If you look at the instruction set and ABI changes on amd64 pc-relative addressing is used for much more now than on i386. This is a good thing. — Art, May 11 '16 at 14:24
@EmilJeřábek Nothing at all. This is not about making this reliable (to do that we'd need linker scripts at least), it's about exploring what's going on. As OP explained he did this for fun to see what happens. I haven't even tried counting the standard violations and undefined behaviors we have in here, but they are numerous. — Art, May 11 '16 at 14:29

score 3 · Answer 2 · answered May 11 '16 at 08:06

The compiler is free to generate the code the way it wants provided the observable results are correct (as if rule). So what you do is just an undefined behaviour invocation.

Visual Studio sometimes uses relays. That means that the address of a function just points to a relative jump. That's perfectly allowed per standard because of the as is rule but it would definitely break that kind of construction. Another possibility is to have local internal functions called with relative jumps but outside of the function itself. In that case, your code would not copy them, and the relative calls will just point to random memory. That means that with different compilers (or even different compilation options on same compiler) it could give expected result, crash, or directly end the program without error which is exactly UB.

score 1 · Answer 3 · answered May 11 '16 at 08:00

I think I can explain a bit. First of all, if both your functions have no return statement within, an undefined behaviour is invoked as per standard §6.9.1/12. Secondly, which is most common on a lot of platforms, and yours apparently as well, is the following: relative addresses of functions are hardcoded into binary code of functions. That means, that if you have a call of "printf" within "foo" and then you move (e.g. execute) from another location, that address, from which "printf" should be called, turns bad.

Segmentation fault when calling a function located in the heap

3 Answers3