
Why do we push ebp as the first action in the Callee of an Assembly function?

I understand that we then use mov edi, [ebp+8] to get the passed-in variables, but esp is already pointing to the return address of the caller. We can easily access the passed-in variables with mov edi, [esp+4] or, if we have pushed three callee-saved registers, with mov edi, [esp+16].

So, why have that extra register in the cpu (the ebp) which you later have to manage in functions? i.e.

push ebp
mov ebp, esp

...

mov esp, ebp
pop ebp
Artur Grigio
  • You don't have to. Compilers will often omit the frame pointer nowadays, if the function doesn't use variable length arrays or `alloca()`. – EOF Mar 16 '16 at 20:29
  • Possible duplicate of [What is exactly the base pointer and stack pointer? To what do they point?](http://stackoverflow.com/questions/1395591/what-is-exactly-the-base-pointer-and-stack-pointer-to-what-do-they-point) – Remy Lebeau Mar 16 '16 at 20:58
  • Why did you put "CALLEE" in all-caps in the title? Are you wondering why the caller doesn't make stack frames as part of the calling convention? It doesn't sound that way, based on the text other than the title. – Peter Cordes Mar 17 '16 at 02:29

2 Answers


It is establishing a new stack frame within the callee, while preserving the stack frame of the caller. A stack frame allows consistent access to passed parameters and local variables using fixed offsets relative to EBP anywhere in the function, while ESP is free to continue being modified as needed while the function is running. ESP is a moving target, so accessing parameters and variables using dynamic offsets relative to ESP can be tricky, if not impossible, depending on how the function uses the stack. Creating a stack frame is generally safer, at the cost of using a few bytes of stack space to preserve the pointer to the caller's stack frame.
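To make the "moving target" point concrete, here is a minimal sketch of a toy 32-bit stack in Python. It is only a model, not real machine behaviour: the `Stack32` class, its methods, and the sample values (42, 111, 222, the addresses) are all made up for illustration.

```python
# Toy model of a 32-bit x86 stack. Stack32 and every value below are
# hypothetical and exist only to illustrate the offsets.

class Stack32:
    def __init__(self, top=0x1000):
        self.mem = {}          # address -> 32-bit value
        self.esp = top
        self.ebp = 0

    def push(self, value):
        self.esp -= 4          # the stack grows toward lower addresses
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value

    def load(self, address):
        return self.mem[address]


s = Stack32()
s.ebp = 0xBEEF                 # pretend this is the caller's frame pointer
s.push(42)                     # caller pushes the argument
s.push(0x401000)               # call pushes the return address

# On entry to the callee, the argument is at [esp+4].
arg_via_esp_on_entry = s.load(s.esp + 4)

s.push(s.ebp)                  # push ebp
s.ebp = s.esp                  # mov ebp, esp
s.push(111)                    # the function body pushes two temporaries
s.push(222)

arg_via_ebp = s.load(s.ebp + 8)        # still [ebp+8], whatever esp does
arg_via_esp_now = s.load(s.esp + 16)   # the esp offset has drifted from 4 to 16
stale = s.load(s.esp + 4)              # [esp+4] now reads the temporary 111

s.esp = s.ebp                  # mov esp, ebp
s.ebp = s.pop()                # pop ebp
```

After the epilogue, `esp` points at the return address again and `ebp` holds the caller's value, which is what the `mov esp, ebp` / `pop ebp` pair in the question is for.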

Remy Lebeau
  • The cost of keeping a frame pointer is not in the single `push [r/e]bp`, it's giving up free use of a caller-saved register on a register-starved (or *very* register starved, for 32-bit) architecture. – EOF Mar 16 '16 at 20:40
  • Having a dynamic offset to the variables might be confusing to humans, but is it really necessary for a Compiler? Is the overhead really worth it? – Artur Grigio Mar 16 '16 at 20:45
  • @EOF `[R/E]BP` is reserved specifically for use as a stack frame pointer. It is not a general-purpose register, like `[R/E]AX`, for instance. – Remy Lebeau Mar 16 '16 at 20:45
  • @ArturGrigio the compiler can't always know at compile-time how much data has been pushed on the stack at run-time (for example, allocating stack space dynamically, which is non-standard but is supported by some compilers, such as for variable-length stack-based arrays and `alloca()`), so using offsets relative to `ESP` is not always feasible. – Remy Lebeau Mar 16 '16 at 20:46
  • @RemyLebeau: Have you ever looked at the assembly generated by a modern compiler (like `gcc` or `clang`) on high optimization settings (including `-fomit-frame-pointer`)? – EOF Mar 16 '16 at 20:46
  • @RemyLebeau The point is, `[r/e]bp` **is** a perfectly normal general-purpose register. – EOF Mar 16 '16 at 20:53
  • @EOF only when stack frames are not being used. – Remy Lebeau Mar 16 '16 at 20:57
  • @RemyLebeau: No. `[r/e]bp` **is** a general-purpose register like any other. It is *used* as the frame pointer *by convention*. – EOF Mar 16 '16 at 20:58
  • @EOF: the `enter` and `leave` instructions provide ISA support for that convention, but they're not worth using unless optimizing for code-size instead of speed. (Using stack frames can save code size, even though it costs insns to set up and tear down, because addressing modes using `[e/rsp (+disp8/disp32)]` have to use a SIB byte even though there's no index register. The encoding that would mean `[rsp]` instead means "there's a SIB byte".) So this convention has permeated its way into the ISA to some degree. Fortunately there's nothing really enforcing it, though. – Peter Cordes Mar 16 '16 at 22:35
  • @PeterCordes: If that precluded `rbp` from being a general-purpose register, then `r13` and `r12` would also not be general purpose, since they too require special casing for memory access through them (due to having the same `ModR/M` byte as `rsp` and `rbp`, respectively). – EOF Mar 17 '16 at 01:20
  • @EOF: agreed. I usually say that x86-64 has 15 general purpose registers. Being the best/only choice for some special purposes doesn't necessarily mean they're not general purpose. Almost all of the low 8 regs have some instruction that treats them specially, other than rbx and rbp (if you only count useful instructions. i.e. excluding `xlat` (rbx) and `enter` / `leave`). My point was just that the choice of `rbp` as the base pointer isn't arbitrary; there are ISA reasons for it. (Moreso in 32bit code, where you wouldn't want to choose one that was sometimes needed for a special purpose) – Peter Cordes Mar 17 '16 at 01:30
  • @PeterCordes: And I agree in turn, but my point is that using a frame pointer at all is a colossal waste for 99% of functions. – EOF Mar 17 '16 at 01:34
  • @EOF: agreed on that, too! I always /facepalm at all the newbie questions that make stack frames for no reason, esp. in 64bit mode where the stack pointer doesn't tend to change in the first place. In a function with a lot of instructions that reference the stack, though, it can be a code-size win, but usually still not a performance win. – Peter Cordes Mar 17 '16 at 01:36
  • @RemyLebeau: you said in a comment that variable-length arrays are non-standard. That's incorrect: they're specified by C99. You're right that `alloca` is non-standard. The Linux man page says it's not specified by POSIX, and cites its origins as BSD. – Peter Cordes Mar 17 '16 at 01:38
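The `alloca()` / variable-length array case discussed in these comments can be sketched with a small calculation. This is a toy model in Python, not compiler output; the function name `arg_offsets` and the starting address are made up for illustration.

```python
def arg_offsets(n_bytes):
    """Return (argument offset from ebp, argument offset from esp) after
    the function reserves n_bytes of stack space at run time.

    Hypothetical helper: it only tracks addresses, not stack contents.
    """
    esp = 0x1000
    esp -= 4                   # caller pushes one argument
    arg_addr = esp
    esp -= 4                   # call pushes the return address
    esp -= 4                   # push ebp
    ebp = esp                  # mov ebp, esp
    esp -= n_bytes             # sub esp, <run-time value>, as for alloca or a VLA
    return arg_addr - ebp, arg_addr - esp


fixed_small, moving_small = arg_offsets(32)    # (8, 40)
fixed_large, moving_large = arg_offsets(100)   # (8, 108)
```

The offset from `ebp` is 8 for any size, while the offset from `esp` depends on a value that is only known at run time. When every stack adjustment is a compile-time constant, the compiler can track the `esp` offset itself, which is why omitting the frame pointer works for most functions.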

Remy's answer covers the main point; here is one small addition. Right after

mov ebp, esp

you will often see an instruction such as one of these:

sub esp, 20h    ; reserve 20h (32) bytes for local variables
sub esp, 0CCh   ; or, in a larger function, reserve 0CCh (204) bytes

sometimes alongside an AND instruction (like and esp, 0FFFFFFF0h). This is also part of managing the stack: it rounds esp down to a multiple of 16 so that the stack is 16-byte aligned. Of course, all of this depends on the calling convention in use (cdecl, fastcall, stdcall, etc.).
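As a rough illustration of what that mask does, here is the same arithmetic in Python. The function name `align_down_16` and the sample `esp` values are made up for the example.

```python
MASK = 0xFFFFFFF0              # same constant as in: and esp, 0FFFFFFF0h

def align_down_16(esp):
    """Clear the low four bits, rounding esp down to a multiple of 16."""
    return esp & MASK


unaligned = 0x0012FF7C
aligned = align_down_16(unaligned)   # 0x0012FF70
adjustment = unaligned - aligned     # 12 bytes here, anywhere from 0 to 15 in general
```

Because the adjustment depends on the value `esp` happens to have at run time, the distance from the aligned `esp` back to the arguments is not a fixed constant, which is one more situation where addressing them through `ebp` is the simpler choice.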

Belial