Linux: executing code that is loaded to memory manually

Question

I'm experimenting with function pointers on Linux and trying to execute this C program:

#include <stdio.h>
#include <string.h>

int myfun() 
{
    return 42;
}

int main()
{
    char data[500];
    memcpy(data, myfun, sizeof(data));
    int (*fun_pointer)() = (void*)data;
    printf("%d\n", fun_pointer());

    return 0;
}

Unfortunately, it segfaults on the fun_pointer() call. I suspect this is connected with some memory flags, but I haven't found any information about it.

Could you explain why this code segfaults? Ignore the fixed size of the data array; it is fine, and copying without calling the function succeeds.

UPD: I eventually found that the memory segment has to be marked as executable using the mprotect system call with the PROT_EXEC flag. Moreover, according to the POSIX specification, the memory should be obtained from the mmap function. Here is the same code using memory allocated by mmap with the PROT_EXEC flag (it works):

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int myfun() 
{
    return 42;
}

int main()
{
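    /* Assumes main is placed right after myfun in the binary (not guaranteed). */
    size_t size = (char*)main - (char*)myfun;
    /* Portable code passes fd = -1 with MAP_ANONYMOUS. */
    char *data = mmap(NULL, size, PROT_EXEC | PROT_READ | PROT_WRITE,
        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (data == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    memcpy(data, myfun, size);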

    int (*fun_pointer)() = (void*)data;
    printf("%d\n", fun_pointer());

    munmap(data, size);
    return 0;
}

This example should be compiled with the -fPIC gcc option to ensure that the code in the functions is position-independent.

Almost certainly the same error as: http://stackoverflow.com/a/7432634/168175 – Flexo, Dec 28 '15 at 08:11
Could you perhaps explain why you would expect it not to segfault? It seems obvious to me that if you try to execute a random pointer that doesn't point into code, it will segfault. – David Schwartz, Dec 28 '15 at 08:15
The pointer points to a copy of the code of `myfun` because I copied this code to the `data` array. – Alexander Rodin, Dec 28 '15 at 08:20
When compiling, always enable all the warnings (for gcc, at a minimum use: `-Wall -Wextra -pedantic`), then fix those warnings. The posted code emits a whole string of warnings. – user3629249, Dec 29 '15 at 14:43
I have used similar code and that has made my program work, almost. I ported it to C++; I can edit variables and return them in the function, but I cannot call other functions from the "myfun" function. Those calls result in a segfault. Any way around this? – Nuclear_Man_D, Jan 04 '17 at 08:16

score 5 · Answer 1 · answered Dec 28 '15 at 08:11

Several problems there:

Your data array lives in data memory (here, the stack), which is normally not executable, rather than in the code segment.
Address relocation is not handled.
The code size is not known, just guessed.
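
To see the first point on a running process, one option is to look up the permissions of the mapping that contains a given address in /proc/self/maps. The following is a minimal sketch, assuming a Linux system with /proc mounted; `mapping_perms` is a hypothetical helper name, not part of any library.

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies the permission string (for example "rw-p")
 * of the mapping containing addr into perms, which must hold 5 bytes.
 * Returns 0 on success, -1 if the address is not found or
 * /proc/self/maps cannot be opened. */
int mapping_perms(const void *addr, char perms[5])
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (!maps)
        return -1;

    uintptr_t target = (uintptr_t)addr;
    char line[1024];
    int found = -1;

    while (fgets(line, sizeof line, maps)) {
        unsigned long start, end;
        char p[5];
        /* Each line starts with "start-end perms", addresses in hex. */
        if (sscanf(line, "%lx-%lx %4s", &start, &end, p) == 3 &&
            target >= start && target < end) {
            memcpy(perms, p, sizeof p);
            found = 0;
            break;
        }
    }
    fclose(maps);
    return found;
}
```

Calling it with the address of the stack array and with the address of myfun (converted through uintptr_t) would typically report something like rw-p for the former and r-xp for the latter, although the exact strings depend on how the program was built and on the kernel configuration.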

dlask

I thought about the second and third problems, and I think that in this toy example they should not matter. The difference between the data and code segments is what I would like to understand. If I replace the stack array with one allocated by `malloc`, the result is the same. So I want to find out whether there is a way in Linux to mark a memory segment as "code" rather than "data" at run time. – Alexander Rodin Dec 28 '15 at 08:23
@a-rodin; There is a way to mark memory segment as code at run time, but surely not the way you are trying. – haccks Dec 28 '15 at 08:36
I think I solved all three of these problems in the question update. The only problem is that I'm not sure the functions in the executable are in the same order as in the source file, and I haven't found any documentation that guarantees it. – Alexander Rodin Dec 28 '15 at 11:57

score 2 · Answer 2 · edited May 23 '17 at 12:17

In addition to dlask's answer, you probably want to use some JIT compilation techniques (to generate executable code in memory), and you should make sure that the memory zone containing the code is executable (see mprotect(2) and the NX bit; often the call stack is not executable for security reasons). You could use GNU lightning (quickly emitting slow machine code), asmjit, libjit, LLVM, GCCJIT (able to slowly emit fast optimized machine code). You could also emit some C code into a temporary file /tmp/emittedcode.c, fork a compilation command gcc -Wall -O -fPIC -shared /tmp/emittedcode.c -o /tmp/emittedcode.so, then dlopen(3) that shared object /tmp/emittedcode.so and use dlsym(3) to find function pointers by name there.

See also this, this, this, this and that answers. Read about trampoline code, closures, and continuations & CPS.

Of course, copying code from one zone to another usually doesn't work (the code has to be position-independent for that to work, or you need your own relocation machinery, a bit like what a linker does).

Pierre Emmanuel Lallemant · Answer 3 · 2015-12-28T08:09:15.127

0

It's because this line is wrong:

memcpy(data, myfun, sizeof(data));

You are copying the (compiled) code of the function instead of the address of the function.

myfun and &myfun have the same address, so to do your memcpy operation you have to use a function pointer and then copy from its address.

Example:

int (*p)(); 
p = myfun; 
memcpy(data, &p, sizeof(p)); /* copy only the pointer value; sizeof(data) would read past p */

edited Dec 28 '15 at 08:09

answered Dec 28 '15 at 08:04

Yes, it has fixed address, and this address should point to copy of `myfun` code. It is not a real life example, I'm just trying to figure out why can't I execute arbitrary code that the program put to the memory by itself. – Alexander Rodin Dec 28 '15 at 08:07
