While learning C, I have always viewed function pointers as being like object pointers (i.e. values representing memory addresses), except that a function pointer is callable like a function. To get a better understanding of function pointers, I tried to place machine instructions on the stack, point a function pointer at those instructions, and then call them as one would normally call a function. So I wrote a simple function:

int three(void) { return 3; }

and viewed its machine code representation using objdump -d:

00000001400017d0 <three>:
   1400017d0:   b8 03 00 00 00          mov    $0x3,%eax
   1400017d5:   c3                      ret
   1400017d6:   66 2e 0f 1f 84 00 00    cs nopw 0x0(%rax,%rax,1)
   1400017dd:   00 00 00 

I wrote this program to actually allocate the function on the stack and call it:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint8_t f[] = {
        0xb8u, 0x03u, 0x00u, 0x00u, 0x00u,
        0xc3u,
        0x66u, 0x2eu, 0x0fu, 0x1fu, 0x84u, 0x00u, 0x00u,
        0x00u, 0x00u, 0x00u
    };

    printf("%d\n", ((int (*)(void)) f)());

    return 0;
}

but my program seems to exit with a failure when I actually run the code, compiled with a recent version of clang.

I tried to verify the objdump with this program:

for (size_t i = 0; i < 1000; i++) {
    printf("%02X ", ((uint8_t *) three)[i]);

    if (i % 16 == 15)
        putchar('\n');
}

and it seemed to be correct.

What exactly is happening here? Is my understanding of function pointers incorrect, or is my implementation erroneous? I am using a Windows machine with MSVCRT as my runtime, so this may be causing some issues. Even so, it seems that the code should work as intended; otherwise, how are actual functions implemented?

P.S. It is my understanding that casting between an object pointer and a function pointer is nonstandard C, but clang allows the above code to compile even when using -Wpedantic -Werror even though gcc does not. This may relate to the problem; I'm not sure.

William Ryman
  • 1
    I'd expect this to work (if the optimizer doesn't fiddle with it). To figure out why it doesn't work I'd use a debugger in assembly language mode. By the way only the first two lines are the function. The last two lines are something unrelated (not sure what) – user253751 Mar 21 '23 at 02:57
  • 3
    On one hand you say "I am using a Windows machine and MSVCRT for my runtime", but on the other hand you talk about compiling with clang and gcc and using objdump which are Linux tools. So which is it? – dbush Mar 21 '23 at 02:58
  • 4
    I'd also expect that something has to be done to mark the stack, or at least the page where `f` is, as executable. You could hack it by calling mprotect. – user253751 Mar 21 '23 at 02:58
  • 1
    @dbush I installed `gcc` and `clang` from [here](https://winlibs.com/). – William Ryman Mar 21 '23 at 03:05
  • 1
    You may run into problems if the storage area for the function (which is probably the stack) is not marked as executable. To what extent that's (a part of) your problem, I'm not sure, but it is something to be aware of. – Jonathan Leffler Mar 21 '23 at 03:13
  • 5
    1. Most operating systems protect the memory to prevent homemade hackers from running code from the data or stack memory. It is called DEP. There are many tutorials on the internet on how to work around it. – 0___________ Mar 21 '23 at 03:17
  • 1
    @user253751 sure - windows and mprotect – 0___________ Mar 21 '23 at 03:18
  • 1
    Why not just make a pointer to your function three() and call that way: Trying to put machine code in data is just opening a bigger can of worms for you. – John3136 Mar 21 '23 at 03:27
  • 1
    @John3136 In this case, it was just for experimentation, but in the future I could possibly use it to make dynamic runtime functions (although this would not be portable). – William Ryman Mar 21 '23 at 03:29
  • 1
    In the objdump of the `three` function, what are those instructions immediately after the `ret` instruction? – printf Mar 21 '23 at 03:31
  • 1
    Does this answer your question? [How do function pointers in C work?](https://stackoverflow.com/questions/840501/how-do-function-pointers-in-c-work) – Ken White Mar 21 '23 at 03:33
  • @printf I'm [not sure](https://stackoverflow.com/questions/4798356/amd64-nopw-assembly-instruction). – William Ryman Mar 21 '23 at 03:33
  • @KenWhite Not quite. The replies to [this answer](https://stackoverflow.com/a/5602143/18032524) do seem useful, though. – William Ryman Mar 21 '23 at 03:35
  • 3
    In addition to enabling execution for the memory, you may also have to call SetProcessValidCallTargets to register the address as a valid function. Function calls are complicated nowadays thanks to the need to defend against malware. – Raymond Chen Mar 21 '23 at 03:58
  • @RaymondChen Thanks! I started looking into [`VirtualAlloc`](https://learn.microsoft.com/en-us/windows/win32/memory/data-execution-prevention) and was wondering why it still failed. I'll have to check out `SetProcessValidCallTargets`. – William Ryman Mar 21 '23 at 04:02
  • 1
    All these *exploit mitigations* are optional and for this demonstration, it is probably easier to find the compiler settings to turn them off, than to set them up to make them work with the demonstration. – user253751 Mar 21 '23 at 04:21
  • 2
    To briefly explain the "defend against malware", a lot of malicious attacks on programs involve getting code into a place that holds data, then getting the program to call an invalid pointer (or one you overwrote) that points to the code you wrote. By telling the CPU it can't run code from the places that are supposed to contain data, it becomes a lot harder for the bad guys to do this sort of thing. (This sort of thing is only possible if the program has bugs in it, so don't write bugs :) ) – user253751 Mar 21 '23 at 04:27
  • 1
    Perhaps in order to understand how actual functions are implemented, it would be interesting to look at an actual function being called via an actual function pointer, using a debugger for example. – n. m. could be an AI Mar 21 '23 at 05:44
  • 3
    In order to get this to work, you have to be dead certain about the calling convention used. Who stacks what, which kind of parameters go in which registers and so on. There are multiple different calling conventions for the x86 ISA depending on system. From a general point of view, you also need to be able to execute code out of data memory. This could be blocked by software (the OS) or by hardware (the MMU) or both. If you want to play around with things like this, you are probably better off buying a simple Von Neumann microcontroller board. These allow you to do pretty much anything. – Lundin Mar 21 '23 at 08:01
  • 1
    See also this (old, but AFAIK this has not changed): https://stackoverflow.com/questions/13696918/c-cast-void-pointer-to-function-pointer – nielsen Mar 21 '23 at 10:24
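
Several comments above point at the same likely cause: the stack page holding `f` is not marked executable (DEP). What follows is only a minimal sketch of the workaround they suggest, not a confirmed fix for the original program. It assumes an x86-64 CPU, since the bytes are the `mov $0x3,%eax; ret` from the objdump above. The helper name `run_machine_code` is hypothetical and does not come from the question. It copies the bytes into a freshly allocated page, switches that page from writable to executable (`VirtualAlloc`/`VirtualProtect` on Windows, `mmap`/`mprotect` elsewhere), and only then calls it. If Control Flow Guard is enabled, registering the address with `SetProcessValidCallTargets` may also be needed, as noted above; that step is not shown here.

```c
/* Sketch only: assumes the bytes passed in are valid code for the host CPU. */
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc in strict modes */
#endif

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <sys/mman.h>
#endif

/* Hypothetical helper (not from the original post): copies `len` bytes of
 * machine code into a new page, makes the page executable, calls it as
 * int (*)(void), and stores the return value in *result.
 * Returns 0 on success and -1 if allocation or protection fails. */
static int run_machine_code(const uint8_t *code, size_t len, int *result)
{
    int (*fn)(void);
    void *mem;

#ifdef _WIN32
    DWORD old_protect;

    mem = VirtualAlloc(NULL, len, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    if (mem == NULL)
        return -1;
    memcpy(mem, code, len);
    /* Drop write access and add execute access. */
    if (!VirtualProtect(mem, len, PAGE_EXECUTE_READ, &old_protect)) {
        VirtualFree(mem, 0, MEM_RELEASE);
        return -1;
    }
    FlushInstructionCache(GetCurrentProcess(), mem, len);
#else
    mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;
    memcpy(mem, code, len);
    /* Drop write access and add execute access. */
    if (mprotect(mem, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, len);
        return -1;
    }
#endif

    /* Converting an object pointer to a function pointer is not standard C,
     * so copy the pointer representation instead of casting. This assumes
     * both pointer kinds have the same size, which the assert checks. */
    assert(sizeof fn == sizeof mem);
    memcpy(&fn, &mem, sizeof fn);

    *result = fn();

#ifdef _WIN32
    VirtualFree(mem, 0, MEM_RELEASE);
#else
    munmap(mem, len);
#endif
    return 0;
}
```

Calling `run_machine_code(f, 6, &r)` with the first six bytes of the array `f` from the question (everything up to and including `0xc3`) should leave 3 in `r` on an x86-64 machine. The trailing `nopw` bytes are only padding after the `ret` and are not needed. Keeping the page writable only while copying, and executable only afterwards, is a deliberate choice: some systems refuse pages that are writable and executable at the same time.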

0 Answers