I have a rather strange and exotic question. My goal is to write object-oriented code in C. A common approach is to store function pointers inside a struct and to pass an explicit reference to the calling struct as the first argument:

struct Point {
    int x;
    int y;
    
    int (*sum)(struct Point* p);
};

int sum(struct Point* p) {
    return p->x + p->y;
}

int main() {
    struct Point p;
    p.sum = &sum;
    p.sum(&p);
}

However, I was wondering if it is possible to do this without the additional struct Point* argument.

For this I need to manipulate the stack (probably via inline assembly) to have a reference to p which then can be accessed inside sum.

My current idea is to declare an additional local variable right before the function call, which holds a reference to the struct

void* ptr = &p;

and then push this value onto the stack with

__asm__("push %rax\n");

But I couldn't figure out how I can access my pushed value in the sum function. I use GCC on x86.

  • When you call `sum()`, a new stack frame is pushed. So you'll need assembly within `sum()` to find the base of the current stack frame, then get the pointer before that. – Barmar Oct 21 '22 at 21:00
  • It seems likely that it could be made to work, for your particular C implementation and target architecture, with enough time and effort. It also seems likely that the result would be brittle and ugly. If you want to program in C, as opposed to C++, then I recommend sticking to C semantics. – John Bollinger Oct 21 '22 at 21:05
  • `struct Point p;` has not been fully initialised, so adding two members in the function is undefined behaviour. – Weather Vane Oct 21 '22 at 21:45
  • Seems like an XY problem -- why would you want to do this? Your method functions require a `this` that points at the object, and it needs to be provided by the caller, so why not use an argument? That's what arguments are for. – Chris Dodd Oct 21 '22 at 23:00

1 Answer

Thanks for your comments. After a bit of gdb debugging, I was able to figure it out:

#include <stdio.h>

#define GET_SELF_REFERENCE \
    __asm__("movq 16(%rbp), %r8\n" \
            "movq %r8, -8(%rbp)\n");

#define CALL_RESULT(object, method, result) \
    void* __##object = &object; \
    __asm__("push %rax\n"); \
    result = object.method(); \
    __asm__("pop %rax\n");


struct Point {
    int a;
    int b;
    int (*sum)();
};

void function() {
    struct Point* st = NULL;
    __asm__("movq 16(%rbp), %r8\n"
            "movq %r8, -8(%rbp)\n");

    printf("%d\n", st->a);
}

int sum() {
    struct Point* self = NULL;
    GET_SELF_REFERENCE

    return self->a + self->b;
}

int main() {
    struct Point p;
    p.a = 12;
    p.b = 49;
    p.sum = &sum;

    int result;

    CALL_RESULT(p, sum, result)
    printf("%d\n", result);

    return 0;
}

I needed to use an offset of 16 from `rbp` to reach the pushed address, i.e. the value stored in my pointer variable.

For those of you wondering why I want to do this: it's just out of curiosity and part of a university project. I'd never do this in production code, as it is really fragile and essentially useless. I just want to implement a "real" object-orientation mechanism like in Python, where the self parameter is implicitly handed to the function.

  • This will fail if you enable any kind of optimizations, or even just use a different version of the compiler. The fact that it appears to work (for now) is completely accidental. – Chris Dodd Oct 21 '22 at 23:21
  • You are totally right. If I add `-O3` it fails. Probably because the artificially introduced variable will get optimized away. I'm wondering if there's any way to determine the absolute address of my "object" `p` without using the relative offset from `rbp`? If there'd be a way then my code would work. – kl_divergence Oct 21 '22 at 23:27
  • There's no way to determine where in the stack frame any given object will be allocated -- it depends on all the local vars and how they are used, and changing one can change the allocation of others. You also have problems that if the compiler decides to put anything else into %rax or %r8, you'll clobber it (breaking other code) or it will clobber what you are trying to do. The only safe thing is to actually take the address (`&object`) and pass that as a parameter. You can use macros to hide the param if you want, but why? – Chris Dodd Oct 21 '22 at 23:33
  • Hard-coding offsets into a stack frame can't work reliably or portably to different compiler versions or options. e.g. `-fstack-protector-strong` could use extra stack space. It also breaks when this function might get optimized into a caller. Plus, you aren't telling the compiler which memory you're reading / writing, so it's just never going to work. Also, `push` modifies RSP, which is never safe in inline asm. – Peter Cordes Oct 22 '22 at 02:13
  • See [Would this function be portable in any way (across compilers, platforms, libc implementations etc.)?](https://stackoverflow.com/a/74094048) for details: your use of `push` in inline asm is totally unusable for many of the same reasons as that. Using inline asm to fake a different calling convention for calls made separately with C code is not supported by GCC; there's no safe or good way to do it. – Peter Cordes Oct 22 '22 at 02:14
  • Oh, this answer does already say it's so fragile it's useless, so maybe you already realized that. It might happen to work with GCC `-O0`, since it won't use the red-zone in functions where it can see there's a function call. But you're destroying the R8 register without telling the compiler about it via a clobber, so that's not safe. You could at least use extended asm with an `"=r"` output to load the stack arg the compiler doesn't know about. Also, `(char*)__builtin_frame_address(0)` might be a useful reference point for finding stack args. – Peter Cordes Oct 22 '22 at 02:22