5

Is there any way to put processor instructions into an array, make its memory segment executable, and run it as a simple function?

int main()
{
    char myarr[13] = {0x90, 0xc3};
    void (*myfunc)() = (void (*)()) myarr;
    myfunc();
    return 0;
}
cadaniluk
  • 4
    This looks like an [XY problem](http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem). Tell us what problem you're trying to solve before you ask for assistance solving it some particular way. – David Schwartz May 09 '16 at 17:51
  • I would like to generate x86 code in my program, then run it. – Asking Brother May 09 '16 at 17:53
  • Do you mean something like [VirtualProtect](https://msdn.microsoft.com/en-us/library/windows/desktop/aa366898(v=vs.85).aspx) in Windows? – Tomer May 09 '16 at 17:53
  • 1
    @user6292850: or someone writing a dynamic recompiler/JIT – ninjalj May 09 '16 at 17:55
  • 1
    @user6292850 Not necessarily; OP could be trying to write a [just-in-time compiler](https://en.wikipedia.org/wiki/Just-in-time_compilation). – zwol May 09 '16 at 17:56
  • @AskingBrother Right, but why would you like to do that? What problem do you hope to solve with that method? (For example, some problems are better solved by invoking a linker to make a shared library which the program dynamically links. Some aren't. We can't tell if we don't know the problem.) – David Schwartz May 09 '16 at 17:59
  • 1
    @MartinJames Not true. – zwol May 09 '16 at 18:04
  • There might be a better duplicate than the one I just found, but it's pretty darn close and the answers look sound. – zwol May 09 '16 at 18:06
  • @zwol: the answers in the dup will fail under W^X protections. Your answer here seems better. – ninjalj May 09 '16 at 18:09
  • 4
    To all the naysayers: it's a fine question, and it doesn't indicate he's an evil no good sploiter or anything like that. Me, I had to solve exactly this problem when writing a C interpreter that wanted to interoperate with compiled code. – Steve Summit May 09 '16 at 18:14
  • 1
    I've considered asking a question like this for the simple reason of knowing if it's possible and if I can tinker with it myself just to learn. It's not that OP is evil, or that they're trying to do something super complex. Could just be natural curiosity... – DeadChex May 09 '16 at 18:16
  • Man, these comments are what's wrong with Stack Overflow. What's wrong with just wanting to know something out of pure intellectual curiosity? This sort of experimentation is exactly how I got good at computers. Haven't you guys ever poked around at something without necessarily a good reason to do so? – Brennan Vincent Aug 18 '16 at 19:34

3 Answers

11

On Unix (these days, that means "everything except Windows and some embedded and mainframe stuff you've probably never heard of") you do this by allocating a whole number of pages with mmap, writing the code into them, and then making them executable with mprotect.

void execute_generated_machine_code(const uint8_t *code, size_t codelen)
{
    // in order to manipulate memory protection, we must work with
    // whole pages allocated directly from the operating system.
    static size_t pagesize;
    if (!pagesize) {
        pagesize = sysconf(_SC_PAGESIZE);
        if (pagesize == (size_t)-1) fatal_perror("sysconf");
    }

    // allocate at least enough space for the code + 1 byte
    // (so that there will be at least one INT3 - see below),
    // rounded up to a multiple of the system page size.
    size_t rounded_codesize = ((codelen + 1 + pagesize - 1)
                               / pagesize) * pagesize;

    void *executable_area = mmap(0, rounded_codesize,
                                 PROT_READ|PROT_WRITE,
                                 MAP_PRIVATE|MAP_ANONYMOUS,
                                 -1, 0);
    // mmap reports failure with MAP_FAILED, not a null pointer
    if (executable_area == MAP_FAILED) fatal_perror("mmap");

    // at this point, executable_area points to memory that is writable but
    // *not* executable.  load the code into it.
    memcpy(executable_area, code, codelen);

    // fill the space at the end with INT3 instructions, to guarantee
    // a prompt crash if the generated code runs off the end.
    // must change this if generating code for non-x86.
    // (cast first: arithmetic on void * is a GNU extension, not ISO C)
    memset((uint8_t *)executable_area + codelen, 0xCC,
           rounded_codesize - codelen);

    // make executable_area actually executable (and unwritable)
    if (mprotect(executable_area, rounded_codesize, PROT_READ|PROT_EXEC))
        fatal_perror("mprotect");

    // now we can call it. passing arguments / receiving return values
    // is left as an exercise (consult libffi source code for clues).
    ((void (*)(void)) executable_area)();

    munmap(executable_area, rounded_codesize);
}

You can probably see that this code is very nearly the same as the Windows code shown in CherryDT's answer. Only the names and arguments of the system calls are different.

When working with code like this, it is important to know that many modern operating systems will not allow you to have a page of RAM that is simultaneously writable and executable. If I'd written PROT_READ|PROT_WRITE|PROT_EXEC in the call to mmap or mprotect, it would fail on such a system. This is called the W^X policy; the acronym stands for Write XOR eXecute. It originates with OpenBSD, and the idea is to make it harder for a buffer-overflow exploit to write code into RAM and then execute it. (It's still possible; the exploit just has to find a way to make an appropriate call to mprotect first.)

zwol
  • 1
    In GNU C, it's a good idea to use `__builtin___clear_cache(executable_area, executable_area+rounded_codesize)` sometime after storing bytes as data, before calling them as code. This will prevent dead-store elimination (and do any necessary sync of non-coherent I-cache on some non-x86 ISAs). On x86 using `mmap` (not malloc) will stop GCC from knowing the stores aren't read *as data* so you can get away without it. But it compiles to zero instructions and is a good idea. [see this](//stackoverflow.com/q/18476002) and [my answer here](//stackoverflow.com/a/55893781) – Peter Cordes Jan 24 '20 at 22:21
6

Depends on the platform.

For Windows, you can use this code:

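// Needs <windows.h>, <stdio.h> and <string.h>; meant to run inside
// main() from the question, where myarr is defined.

// Allocate some memory as readable+writable
LPVOID memPtr = VirtualAlloc(NULL, sizeof(myarr), MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
if (memPtr == NULL) {
    fprintf(stderr, "VirtualAlloc failed: %lu\n", GetLastError());
    return 1;
}

// Copy data
memcpy(memPtr, myarr, sizeof(myarr));

// Change memory protection to readable+executable
DWORD oldProtection; // Not used but required for the function
if (!VirtualProtect(memPtr, sizeof(myarr), PAGE_EXECUTE_READ, &oldProtection)) {
    fprintf(stderr, "VirtualProtect failed: %lu\n", GetLastError());
    VirtualFree(memPtr, 0, MEM_RELEASE);
    return 1;
}

// Assign and call the function
void (*myfunc)(void) = (void (*)(void)) memPtr;
myfunc();

// Free the memory
VirtualFree(memPtr, 0, MEM_RELEASE);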

This code assumes a myarr array as in your question's code, and it assumes that sizeof will work on it, i.e. that it has a directly defined size and is not just a pointer passed from elsewhere. If the latter is the case, you would have to specify the size in another way.

Note that here there are two "simplifications" possible, in case you wonder, but I would advise against them:

  1. You could call VirtualAlloc with PAGE_EXECUTE_READWRITE, but this is in general bad practice because it would open an attack vector for unwanted code execution.

  2. You could call VirtualProtect on &myarr directly, but this would make whichever page of memory happens to contain your array executable, which is even worse than #1 because any other data in that page would suddenly become executable as well.
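
To make point 2 concrete, here is a small sketch of the page arithmetic behind it. Memory protection applies to whole pages, so a protection change on a 13-byte array affects every page that contains at least one of its bytes. The helper names `page_base` and `affected_bytes` are invented for this illustration, and the page size is passed in as a parameter rather than queried from the system.

```c
#include <stddef.h>
#include <stdint.h>

/* Start of the page containing addr.
 * Assumes pagesize is a power of two. */
static uintptr_t page_base(uintptr_t addr, size_t pagesize)
{
    return addr & ~((uintptr_t)pagesize - 1);
}

/* Number of bytes whose protection changes when the range
 * [addr, addr + len) is re-protected. Assumes len > 0. */
static size_t affected_bytes(uintptr_t addr, size_t len, size_t pagesize)
{
    uintptr_t first = page_base(addr, pagesize);
    uintptr_t last  = page_base(addr + len - 1, pagesize);
    return (size_t)(last - first) + pagesize;
}
```

With 4096-byte pages, a 13-byte array at address 0x1234 drags all 4096 bytes of its page along, and the same array at 0x1FFC straddles a page boundary and affects 8192 bytes. Everything else living in those pages gets the new protection too, which is why a dedicated allocation is the safer route.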

For Linux, I found this on Google but I don't know much about it.

CherryDT
  • For Linux: http://web.archive.org/web/20090203055327/http://people.redhat.com/drepper/selinux-mem.html – ninjalj May 09 '16 at 18:00
  • @ninjalj I hope you realize that's a page on how to *attack* SELinux. We don't need people thinking of SELinux as an obstacle. See [zwol's answer](http://stackoverflow.com/a/37122499/6292850) for the proper way of doing it. – uh oh somebody needs a pupper May 09 '16 at 18:11
  • @user6292850: incorrect, that's a page from the previous glibc maintainer, Ulrich Drepper, on how to execute dynamically generated code. It cannot be used as an attack, either you can already execute arbitrary code, and don't need to do that, or you don't, and you won't be able to execute that. – ninjalj May 09 '16 at 18:13
  • @ninjalj I think the title of the page "**protection tests**" makes it clear that its usefulness is minimal. I never claimed it was written by a blackhat. A programmer who doesn't want to shoot himself (and his users) in the foot wouldn't be trying to subvert SELinux at all. – uh oh somebody needs a pupper May 09 '16 at 18:17
  • @user6292850: it's not about subverting SELinux, it's about securely executing dynamically generated code (no mapping with Write and eXec permissions at the same time), and working also with things such as SELinux's execmem and PAX's W^X. – ninjalj May 09 '16 at 18:19
1

Very OS-dependent: not all OSes will deliberately (read: without a bug) allow you to execute code in the data segment. DOS will, because it runs in Real Mode. Linux can as well, provided the memory has been given execute permission (for example with mprotect). I don't know about Windows.

Casting an object pointer to a function pointer has its own caveats (the core C standard does not define the conversion), so some elaboration on that topic. From the C11 standard draft N1570, §J.5.7/1:

A pointer to an object or to void may be cast to a pointer to a function, allowing data to be invoked as a function (6.5.4).

(Formatting added.)

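So, where the implementation supports this extension, the cast itself is fine and should work as expected. Note that Annex J.5 only lists common extensions, so this is not guaranteed on every implementation, and the memory holding the code must still be executable. Of course, you would also need to adhere to the ABI's calling convention.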
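
As a minimal sketch of the cast on its own, without any generated code: the example below round-trips the address of an ordinary compiled function through `void *` and calls it. It assumes an implementation that provides the J.5.7 extension, such as GCC or Clang on a POSIX system (where, as far as I know, the conversion is required so that `dlsym` results can be called). All three function names are invented for this illustration.

```c
/* An ordinary compiled function to point at. */
static int forty_two(void)
{
    return 42;
}

/* Function pointer -> object pointer. This relies on the same
 * common extension; ISO C does not define the conversion. */
static void *as_object_pointer(int (*fn)(void))
{
    return (void *)fn;
}

/* Object pointer -> function pointer, then call through it.
 * The caller must guarantee that p really points at executable code
 * with the signature int (void) and the platform's calling convention. */
static int call_via_object_pointer(void *p)
{
    int (*fn)(void) = (int (*)(void))p;
    return fn();
}
```

Calling `call_via_object_pointer(as_object_pointer(forty_two))` should return 42 on such an implementation. The cast only reinterprets the address; whether the bytes behind it may be executed is a separate, OS-level question.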

cadaniluk
  • **"J.5 [Common extensions]" is an appendix in the standard** documenting things *some* implementations allow. It's not guaranteed or "perfectly fine", except on some implementations where it is that simple. On most modern implementations, W^X is the norm; writeable pages are not executable, and vice versa. You have to do implementation-specific things like `mprotect` or `mmap`, or `gcc -zexecstack` and use a local array, for that to work. – Peter Cordes Apr 22 '23 at 02:38
  • You also need `__builtin___clear_cache` in GNU C, at least to prevent dead-store elimination, and also to run special insns on machines without coherent I-cache. – Peter Cordes Apr 22 '23 at 02:39