I'm working on a custom setjmp/longjmp implementation for x86-64 systems that saves the whole CPU context (all xmm registers, the FPU stack, etc., not only the callee-saved registers). It is written directly in assembly.

The code works fine in minimal examples (when calling it directly from an assembly source). The problem arises when using it from C code, because of the way parameters are passed to the homebrew setjmp/longjmp functions. The SysV ABI for x86_64 systems dictates that arguments are passed in registers (if there are at most 6). The signatures of my functions are:

long long set_jmp(exec_context_t *env);
__attribute__ ((__noreturn__)) void long_jmp(exec_context_t *env, long long val);

Of course, this cannot work as is. By the time I enter set_jmp, rdi has already been clobbered to hold the pointer to env. The same applies to long_jmp with respect to rdi and rsi, which hold env and val.

Is there any way, e.g. by relying on some attribute, to force GCC to pass arguments on the stack? This would be much more elegant than wrapping set_jmp and long_jmp in a define that manually pushes the clobbered registers on the stack so they can be retrieved later.

ilpelle
  • This would break the PCS/ABI. You are approaching this from the wrong side. Your assembler code has to follow the ABI. The best option is to use C functions with inline assembler, either directly or as wrappers for your actual code. That way you can just specify which registers/memory your code clobbers and leave saving/restoring to gcc. – too honest for this site Dec 11 '15 at 15:44
  • `setjmp` doesn't need to save `rdi` and `rsi`; as you say, they are clobbered inside `setjmp`, so why should they be saved? In fact, only the registers marked as “callee-saved” (i.e. rbp, rbx, r12, r13, r14, r15, and of course rsp) must be preserved. – fuz Dec 11 '15 at 15:44
  • I do agree that I'm not respecting the ABI, but there's a reason for that. Everything works fine with `setjmp` and `longjmp` because whatever `setjmp` does not save is saved by the caller when needed, as those are caller-save registers. In my application, I have an interaction with the Linux Kernel which, upon a given interrupt, returns control to a different part of the user-level code, which, in turn, calls `setjmp`. By this construction, the originally-executing code does not know that `setjmp` is being called, and thus caller-save registers should be saved as well. – ilpelle Dec 11 '15 at 15:52
  • The C standard defines `setjmp` to be a macro, so if you use that you are not bound to any calling convention. Since you seem to be using gcc anyhow, you can do some checking of the argument by using something like `({ exec_context_t *env = (ENV); ... })`. – Jens Gustedt Dec 11 '15 at 15:56
  • "I have an interaction with the Linux Kernel which, upon a given interrupt, returns the control to a different part of the user-level code" Without more information, this sounds like an invitation for disaster, or at least security holes. – too honest for this site Dec 11 '15 at 16:32
  • @Olaf sure it could be a security hole, if it's not properly done :) If you are interested in the applications of this, you might be interested in giving a glance to these papers: http://www.dis.uniroma1.it/~pellegrini/publications/pads15t.pdf http://www.dis.uniroma1.it/~pellegrini/publications/pads14.pdf – ilpelle Dec 11 '15 at 16:44
  • You can use GNU C function attributes to specify which ABI a function uses. I don't think there's a stack-call ABI for x86-64, though, so there probably isn't any attribute that will do what you want. I used the feature to call a hand-written asm function from C++ from Windows or Linux (by declaring the function as using the SysV calling convention, even if that wasn't the default calling convention.) This uncovered a gcc bug, though, so you'll get incorrect code without up-to-date gcc: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66275 – Peter Cordes Dec 11 '15 at 17:12
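
The macro approach mentioned in the comments can be sketched as follows. This is only an illustration under stated assumptions: the layout of `exec_context_t` and the name `set_jmp_impl` are hypothetical (the real type and entry point are not shown in the question), and the C body stands in for the assembly routine. The GNU C statement expression assigns the argument to a typed temporary, so passing a pointer of the wrong type is diagnosed by the compiler before the entry point is reached.

```c
/* Hypothetical context layout; the real exec_context_t is not shown in the question. */
typedef struct exec_context {
    unsigned long long saved[16];
} exec_context_t;

/* Hypothetical C stand-in for the hand-written assembly entry point. */
static long long set_jmp_impl(exec_context_t *env)
{
    env->saved[0] = 1;  /* pretend to record some state */
    return 0;           /* a direct call reports 0, as setjmp does */
}

/*
 * GNU C statement expression: the typed temporary env_ makes the compiler
 * check the argument type, and the value of the last expression becomes
 * the value of the whole macro invocation.
 */
#define set_jmp(ENV) ({ exec_context_t *env_ = (ENV); set_jmp_impl(env_); })
```

Because the macro body is ordinary code expanded in the caller, the call into the real routine could be replaced by an inline asm statement that uses whatever convention the assembly expects.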

1 Answer

You can avoid overwriting the registers by calling the function using inline assembly.

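#include <stdio.h>

/* Placeholder callee: reads its argument from the stack slot pushed by the
   caller. Assumes foo sets up %rbp as a frame pointer. */
static void foo(void)
{
        int i;
        /* saved %rbp at 0(%rbp), return address at 8(%rbp), argument at 16(%rbp) */
        asm volatile ("mov 16(%%rbp), %0" : "=r" (i));
        printf("%d\n", i);
}

/* _i is a long so that push always gets a 64-bit operand */
#define foo(x) ({ long _i = (x); \
        asm ("push %0\ncall %P1\nadd $8, %%rsp\n" : : "g"(_i), "i"(foo)); })

int main(int argc, char *argv[])
{
        foo(argc-1);
        return 0;
}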

Here, an integer is pushed on the stack, and the function foo is called. foo makes the value available in its local variable i. After returning, the stack pointer is adjusted back to its original value.

dan4thewin
  • This will of course break horribly if the calling function `main()` is not using `%rbp` as a frame pointer, but instead using it as an extra integer register, or not using it at all to avoid the push/pop overhead. The latter is what actually happens for this code if you compile with `-O2`. – Nate Eldredge Jan 30 '16 at 21:53
  • @NateEldredge: it's only `foo` that's badly designed, but it's just a placeholder for his example. The macro to generate a custom-ABI `call` looks good. The OP would use a custom `setjmp` implementation hand-written in ASM which wouldn't make any of these assumptions. It would start by pushing some registers to get some scratch regs, then save all the caller's state. IDK if `xsave` works well from user-space. – Peter Cordes Jan 30 '16 at 22:13
  • That asm statement isn't safe: [it clobbers the red-zone](https://stackoverflow.com/questions/34520013/using-base-pointer-register-in-c-inline-asm). You need `add $-128, %rsp` before the push, then `add $128 + 8, %rsp` after. You would normally also need to declare clobbers on *all* registers that aren't call-preserved under whatever custom calling convention your asm function uses, e.g. RAX, RCX, RDX, RDI, RSI, R8-R11, xmm0-15 (or 0-31 with AVX-512), and x87 st0..7. – Peter Cordes Mar 25 '21 at 17:31
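
Putting the comments above together, a more defensive version of the stack-argument call might look like the sketch below. It assumes GCC or Clang on x86-64 Linux (ELF symbol naming). `stack_arg_callee` and `call_with_stack_arg` are hypothetical names; the tiny assembly routine only stands in for a real hand-written function, and it reads its argument relative to `%rsp` rather than `%rbp`, so it does not depend on a frame pointer. The clobber list covers the integer call-clobbered registers only; a real routine that touches xmm or x87 state would need those listed as well, and this sketch does not realign the stack to 16 bytes before the call.

```c
/*
 * Hypothetical stand-in for a hand-written routine with a stack-based
 * calling convention. On entry, (%rsp) holds the return address and
 * 8(%rsp) holds the single argument; the result is returned in %rax.
 */
__asm__(
    ".text\n"
    ".type stack_arg_callee, @function\n"
    "stack_arg_callee:\n"
    "    mov 8(%rsp), %rax\n"
    "    ret\n"
    ".size stack_arg_callee, .-stack_arg_callee\n");

/* Pass arg on the stack to stack_arg_callee and return what it leaves in %rax. */
static long call_with_stack_arg(long arg)
{
    long ret;
    __asm__ volatile(
        "add $-128, %%rsp\n\t"       /* step over the caller's red zone */
        "push %[arg]\n\t"            /* the argument goes on the stack */
        "call stack_arg_callee\n\t"
        "add $136, %%rsp\n\t"        /* drop the argument (8) and the red-zone skip (128) */
        : "=a"(ret)
        : [arg] "r"(arg)
        : "rcx", "rdx", "rsi", "rdi", "r8", "r9", "r10", "r11", "cc", "memory");
    return ret;
}
```

The `"r"` constraint keeps the pushed operand in a 64-bit register, which avoids both the invalid 32-bit `push` and any `%rsp`-relative memory operand that would be computed after the stack pointer has moved.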