Let's take the following basic C function:

int main() {
    int i = 4;
    return 3;
}

It produces the following (unoptimized) assembly, all of which makes sense to me:

main:
        pushq   %rbp
        movq    %rsp, %rbp
        movl    $4, -4(%rbp)
        movl    $3, %eax
        popq    %rbp
        ret

However, as soon as I add in a function call, there are two instructions that I don't quite understand:

void call() {}
int main() {
    int i = 4;
    call();
    return 3;
}

main:
        pushq   %rbp
        movq    %rsp, %rbp
        subq    $16, %rsp     <-- why is it subtracting 16?
        movl    $4, -4(%rbp)
        movl    $0, %eax      <-- why is it first clearing the %eax register?
        call    call
        movl    $3, %eax
        leave
        ret

If the stack frame needs to be 16-byte aligned, how does subq $16, %rsp help with that? Hasn't the initial pushq %rbp already moved the stack pointer by 8, so that it is now at 24? Otherwise, what is the purpose of the two marked instructions above?

Peter Cordes
carl.hiass
  • It reserves space in stack. It’s not about alignment, it just needs space there. Right after it puts the variable `i` there, for example. – Sami Kuhmonen Sep 13 '20 at 07:00
  • @SamiKuhmonen ok, but it's an `int` so why doesn't it just reserve 4 bytes or so? – carl.hiass Sep 13 '20 at 07:40
  • To keep the 16-byte alignment of the stack. – ecm Sep 13 '20 at 08:57
  • Regarding the stack alignment, keep in mind that a further 8 bytes were pushed by the `call` instruction that transferred control to `main`. So relative to its value *before* the `call`, the stack pointer has moved down by 8+8+16=32 bytes and the 16-byte alignment is maintained. – Nate Eldredge Sep 13 '20 at 17:35
  • @NateEldredge oh, that's helpful, I forgot that `main` is called by a wrapper program and goes through `_start` -- so that now makes sense. Thank you – carl.hiass Sep 13 '20 at 18:33
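
The arithmetic in the alignment comment above can be checked with a small model. This is only a sketch: `rsp_before_inner_call` is a hypothetical helper that treats %rsp as a plain integer, and it assumes the stack pointer is 16-byte aligned just before `main` is called, as the x86-64 System V ABI requires.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: %rsp is just a number here, not the real register. */
uint64_t rsp_before_inner_call(uint64_t rsp_before_call_main) {
    /* Assumption: the caller of main had %rsp 16-byte aligned. */
    assert(rsp_before_call_main % 16 == 0);

    uint64_t rsp = rsp_before_call_main;
    rsp -= 8;   /* `call main` pushes the 8-byte return address */
    rsp -= 8;   /* pushq %rbp */
    rsp -= 16;  /* subq $16, %rsp */
    return rsp; /* value of %rsp when `call call` executes */
}
```

For any aligned starting value, the result is 32 bytes lower and still a multiple of 16. Reserving only 4 bytes for the `int` would leave the stack pointer misaligned at the inner `call`.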

1 Answer

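The first variant stores the local variable in the red zone, a 128-byte area below the stack pointer that signal handlers do not modify. The second variant cannot use the red zone, because the call instruction pushes the return address into the (original) red zone, clobbering the local variable stored there. (The called function could write to the original red zone as well, of course.) The compiler therefore moves %rsp below the local variable first, and it reserves 16 bytes rather than 4 because the x86-64 System V ABI requires %rsp to be 16-byte aligned at the point of a call.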
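
To make the clobbering concrete, here is a rough model in C. It is a sketch under stated assumptions: a small byte array stands in for stack memory, `local_after_call` is a hypothetical helper, and the "return address" is an arbitrary made-up value. It mimics storing `i` at -4(%rbp) and then pushing 8 bytes the way a call instruction would.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { STACK_SIZE = 64 };  /* size of the toy stack, chosen arbitrarily */

/* Returns the value of `i` read back after a simulated call.
   `reserve` is the number of bytes subtracted from the stack pointer first. */
int32_t local_after_call(unsigned reserve) {
    assert(reserve <= 16);           /* keep all indices inside the toy stack */

    uint8_t stack[STACK_SIZE] = {0};
    unsigned rbp = 32;               /* frame pointer as an index into stack[] */
    unsigned rsp = rbp - reserve;    /* models: subq $reserve, %rsp */

    int32_t i = 4;
    memcpy(&stack[rbp - 4], &i, sizeof i);  /* models: movl $4, -4(%rbp) */

    /* Models the push done by a call: move rsp down by 8, store 8 bytes. */
    uint64_t return_address = UINT64_C(0x1122334455667788);  /* made-up value */
    rsp -= 8;
    memcpy(&stack[rsp], &return_address, sizeof return_address);

    int32_t readback;
    memcpy(&readback, &stack[rbp - 4], sizeof readback);
    return readback;
}
```

With `reserve` set to 0 the pushed bytes overlap the slot at -4(%rbp), so the value read back is no longer 4. With `reserve` set to 16 the push lands below the local variable and it survives.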

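%eax is set to zero because the function definition declares no prototype, so the compiler has to treat the call as if it might be to a variadic function. In the x86-64 System V calling convention, %al (the low byte of %eax) gives a variadic callee an upper bound on the number of vector registers used to pass arguments, which lets it skip saving those registers when none are used.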
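
As an illustration of the prototype point, the sketch below defines one function each way. The names `call_unspecified`, `call_none`, and `run_both` are made up for this example. It assumes the code is compiled as pre-C23 C (for example with `-std=c17`), since C23 gives an empty parameter list the same meaning as `(void)`.

```c
#include <assert.h>

static int calls_made = 0;

/* No prototype (pre-C23): the parameter list is unspecified. */
void call_unspecified() { calls_made++; }

/* Prototype: the function takes no arguments. */
void call_none(void) { calls_made++; }

/* Calls both functions and returns how many calls were made. */
int run_both(void) {
    int before = calls_made;
    call_unspecified();  /* compiler may zero %eax before this call */
    call_none();         /* no need to set %al for a prototyped call */
    assert(calls_made == before + 2);
    return calls_made - before;
}
```

Both calls behave identically at run time. The difference should only be visible in unoptimized assembly output (for instance from `gcc -O0 -S`), where the extra `movl $0, %eax` is expected before the unprototyped call only.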

Florian Weimer
  • Indeed: C is not C++ and `void call()` isn't a declaration of a function with *no* arguments, it's a declaration of a function with *unspecified* arguments. OP: You wanted to write `void call(void)` and then you'll see that instruction go away. – Nate Eldredge Sep 13 '20 at 16:41
  • @Florian -- could you please explain what a "signal handler" is? Or perhaps link to where I might learn more about it. – carl.hiass Sep 13 '20 at 18:33
  • It's essentially a userspace interrupt. The kernel pushes some state onto the thread's stack and invokes a user-defined routine. That's why the red zone would be overwritten if the kernel didn't preserve it in this context. – Florian Weimer Sep 13 '20 at 18:36
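
For readers who have not met signal handlers, a minimal sketch using only standard C follows. The names `on_signal` and `handler_runs_on_raise` are hypothetical, and `SIGINT` is chosen only because it is available in standard C. Here the signal is raised synchronously by the program itself, whereas in practice signals usually arrive asynchronously.

```c
#include <assert.h>
#include <signal.h>

/* Flag type that is safe to write from a signal handler. */
static volatile sig_atomic_t handler_ran = 0;

/* The user-defined routine that runs when the signal is delivered. */
static void on_signal(int signo) {
    (void)signo;
    handler_ran = 1;
}

/* Installs the handler, raises SIGINT, restores the old handler,
   and reports whether the handler ran. */
int handler_runs_on_raise(void) {
    handler_ran = 0;
    void (*previous)(int) = signal(SIGINT, on_signal);
    assert(previous != SIG_ERR);
    raise(SIGINT);              /* returns after on_signal has returned */
    signal(SIGINT, previous);
    return handler_ran;
}
```

The handler runs on the same stack as the interrupted code, which is why the kernel has to leave the red zone of the interrupted function untouched when it sets up the handler's frame.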