Assembly x86 - How does the stack pointer keep track of the correct location of local variables after the return address is pushed onto the stack?

Question

I am currently reading CSAPP and I came across this figure, but there is something I just cannot figure out

In line 1, the stack pointer is decremented by 16, and two 8-byte numbers are stored on the stack at offsets 0 and 8 from the stack pointer. But in line 7, the return address for the call to swap_add is pushed onto the stack, so the stack should look like this now:

And my question is: why, in lines 8 and 9, is the stack pointer still able to retrieve the correct values from offsets 0 and 8? From what I understand, the stack pointer now points to the return address, so to get the values back it should be (%rsp), %rsi and 16(%rsp), %rdx. Or is the return address not pushed onto the stack? Please explain why it works this way. Thank you.

If you notice, in line 2 the stack pointer is adjusted to make room for 16 bytes. Then the arguments are stored on the stack, relative to the new stack pointer location. Lines 8 and 9 are referencing the new stack pointer location (not the base pointer, which is constant for the duration of the stack frame / function call). Also, your diagram is misleading. The return address is going to be stored 8 bytes above `rbp` not `rsp`. A few tips: use Intel syntax as it is generally more clear, and draw your stack diagrams from high address to low address. — h0r53, Aug 03 '21 at 17:16
"in line 7, the return address of swap_add is pushed onto the stack." It ultimately comes down to the calling convention used, which we cannot know without more details. However, looking at this code it seems safe to assume that the stack frame is restored after the `swap_add` function call. In other words, don't worry about the `call` instruction messing up the stack frame, because it is restored after the function returns. — h0r53, Aug 03 '21 at 17:18
It's true that `call swap_add` pushes a return address on the stack. But then it branches to the code of `swap_add`, which isn't shown but is presumed to end by executing a `ret` instruction, which pops that address right back off. So, in this calling convention, from the caller's point of view, you can think of a `call` instruction as having no net effect on the stack. — Nate Eldredge, Aug 03 '21 at 17:57

h0r53 · Accepted Answer · 2021-08-04T20:05:49.563

3

"Why, in lines 8 and 9, is the stack pointer still able to retrieve the correct values from offsets 0 and 8?"

Because the stack pointer is only ever modified here:

subq $16, %rsp

and here:

addq $16, %rsp

Although you make a function call to swap_add, and that call does move the stack pointer (call pushes an 8-byte return address, and swap_add may adjust %rsp further), %rsp is back at its pre-call value by the time swap_add returns, because ret pops the return address off again. So within this function you only need to think about a single stack frame, unless you want to dive into the swap_add routine (its assembly has been omitted, so it is out of scope).
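
As a rough illustration of the point above, here is a toy Python model of a downward-growing stack. It is only a sketch, not real machine behavior: the `Machine` class, the starting address, the return address 0x401234 and the stored values 111 and 222 are all made up for the example. It shows that the locals sit at offsets 8 and 16 only while the callee is running, and are back at offsets 0 and 8 once the return address has been popped.

```python
# Toy model of a downward-growing x86-64 stack; memory is a dict keyed by address.
# Every concrete number here (addresses, stored values) is invented for illustration.

class Machine:
    def __init__(self, rsp):
        self.rsp = rsp
        self.mem = {}

    def push(self, value):
        # Like pushq: move %rsp down 8 bytes, then store there.
        self.rsp -= 8
        self.mem[self.rsp] = value

    def pop(self):
        # Like popq: read at %rsp, then move %rsp up 8 bytes.
        value = self.mem[self.rsp]
        self.rsp += 8
        return value

    def load(self, offset):
        return self.mem[self.rsp + offset]

    def store(self, offset, value):
        self.mem[self.rsp + offset] = value


m = Machine(rsp=0x7FFF_0000)   # hypothetical starting %rsp
m.rsp -= 16                    # subq $16, %rsp
m.store(0, 111)                # first local at 0(%rsp)
m.store(8, 222)                # second local at 8(%rsp)
rsp_before_call = m.rsp

m.push(0x401234)               # call: pushes a (made-up) return address
inside_callee = (m.load(0), m.load(8), m.load(16))  # view from inside swap_add

return_address = m.pop()       # ret: pops the return address again
after_return = (m.load(0), m.load(8))               # view back in the caller
rsp_after_return = m.rsp
```

In this model `rsp_after_return` equals `rsp_before_call`, which is why the offsets used before the call are still valid after it.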

"From what I understand, the stack pointer now points to the return address"

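That is only true inside swap_add, immediately after the call. By the time lines 8 and 9 execute, swap_add's ret has already popped that return address, so %rsp once again points at the 16 bytes reserved by subq $16, %rsp. The only return address this function has to care about is its own: at the beginning and end of the assembly listed, it is at 0(%rsp). When the ret instruction is reached, it behaves like pop %rip, setting the instruction pointer to that return address.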
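
To make the offsets concrete, the following Python sketch tracks where this function's own return address sits relative to %rsp at each step. The entry address is an arbitrary, made-up value, and the push and pop done by call and ret are modeled as plain 8-byte adjustments. Only the differences matter.

```python
# Track the offset of this function's own return address relative to %rsp.
# The entry address is hypothetical; only the differences are meaningful.

entry_rsp = 0x7FFF_0010          # made-up %rsp on entry; the return address lives here
ret_addr_location = entry_rsp

rsp = entry_rsp
offsets = {}
offsets["on entry"] = ret_addr_location - rsp

rsp -= 16                        # subq $16, %rsp
offsets["after subq"] = ret_addr_location - rsp

rsp -= 8                         # call swap_add pushes 8 bytes
rsp += 8                         # ret inside swap_add pops them again
offsets["after call returns"] = ret_addr_location - rsp

rsp += 16                        # addq $16, %rsp
offsets["before ret"] = ret_addr_location - rsp
```

Between the subq and the addq the return address stays at offset 16, above the two locals at offsets 0 and 8, and it is back at offset 0 exactly when ret needs it.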

A few extra notes:

The call instruction automatically pushes the address of the next instruction (the return address for the new function) onto the stack. The ret instruction effectively undoes this.
The calling convention used is what determines how stack frames are adjusted before and after function calls, and whose responsibility it is to perform the adjustment.

edited Aug 04 '21 at 20:05

answered Aug 03 '21 at 17:29

h0r53

Thank you so much. I didn't notice that the called procedure restores the stack pointer when it returns, and I didn't know what %rbp is because I have not reached that part yet. – Mattmmmmm Aug 03 '21 at 17:37
@Mattmmmmm it's possible to access the return address relative to `rsp`, but it is less common to see in practice. Although, in this example it appears that `rbp` is omitted altogether and the stack frames are always relative to `rsp`. Thus, in this calling convention the function itself is responsible for growing and shrinking the stack as needed, and as long as the `rsp` is restored to its previous value the return address will always be at `8(%rsp)` – h0r53 Aug 03 '21 at 17:39
@Mattmmmmm: The code in your function doesn't set up RBP as a traditional "frame pointer" with `push %rbp` / `mov %rsp, %rbp`. This answer's comments about RBP only apply to debug-build compiler output that does that, or hand-written asm that chooses to do that. And BTW, you normally never access the return address explicitly, only by having `call` push it and `ret` pop it. That implicit stack access is relative to RSP, of course (https://www.felixcloutier.com/x86/ret). `ret` is how we spell `pop %rip`. RBP is only used implicitly by `leave` (and the slow `enter` instruction). – Peter Cordes Aug 03 '21 at 23:26
@PeterCordes I wasn't implying the return address needs to be accessed directly at `8(%rsp)`. Rather, I was hoping to get the point across that based on the calling convention used, the function itself is responsible for growing and shrinking the stack frame, and so long as the return address is at `8(%rsp)` at the end of the function, `ret` / `pop %rip` will appropriately transfer control back to the function caller. Also, you suggested the `rbp` comments only apply to debug builds. For what compiler? x86_64 gcc 11.1 produces assembly with stack frames using `rbp` and `rsp`. – h0r53 Aug 04 '21 at 15:23
The return address needs to be at `0(%rsp)` when `ret` executes, not `8(%rsp)`. For GCC and clang, `-fomit-frame-pointer` is on at `-O1` or higher. For example https://godbolt.org/z/hdT35zfbc. Only the default `-O0` ([consistent debugging, i.e. "debug mode"](https://stackoverflow.com/questions/53366394/why-does-clang-produce-inefficient-asm-with-o0-for-this-simple-floating-point)) forces use of RBP as a frame pointer. (Even with `-fomit-frame-pointer`, compilers will choose to use RBP as a frame pointer in functions with VLAs / alloca or stack over-alignment). – Peter Cordes Aug 04 '21 at 19:42
@PeterCordes thank you for the clarification. I've updated this answer accordingly. Although I'm curious how stack canaries would change this implementation, that is out of scope for this question. – h0r53 Aug 04 '21 at 20:06
@h0r53: A stack canary from `-fstack-protector-strong` would be just like if you had an extra `int64_t` local variable that the compiler allocated space for right below the saved return address (https://godbolt.org/z/1ohd148ez), or below the call-preserved register save slots if any. Or at least above any locals; if there's unused stack space for alignment, it might be above the canary. (https://godbolt.org/z/aK7548e1K shows `-O1 -fno-omit-frame-pointer -fstack-protector-strong` referencing the canary relative to RBP). You can play with the code or options on Godbolt yourself. – Peter Cordes Aug 04 '21 at 20:17

score 3 · Answer 2 · answered Aug 03 '21 at 20:30

It seems that the other answers and comments missed the point of the question and gave many details that obscure the simple answer: the call pushes the return address on the stack, but when the call returns, the return pops the return address, so after the call returns, the stack pointer has the same value that it had before the call.
