It was hard to figure out exactly what you were missing, but I think it's this: the caller has to fix the stack after the called function returns. The caller knows how much it pushed before the call, so it can run `add esp, some_constant` after the `call` instruction to clear the args from the stack, putting ESP back to where it was before the first push.
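ESP is call-preserved in all the normal calling conventions: called functions aren't allowed to return with ESP different from what it was before the `call`. If they return with a plain `ret`, that could only happen if they had copied the return address somewhere else on the stack before running `ret`! So it's a pretty obvious restriction that some calling-convention descriptions fail to mention. (The one wrinkle is callee-pops conventions like stdcall, where the callee returns with `ret N` to pop the args as well; the caller still knows exactly where ESP ends up.)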
Anyway, this means that the caller can assume ESP wasn't modified, so it can save/restore anything else with PUSH/POP.
EBP is also call-preserved in all calling conventions I'm aware of. See https://stackoverflow.com/tags/x86/info (the x86 tag wiki) for calling convention/ABI docs, and the x86 calling conventions article on Wikipedia for short summaries.
Also, your pseudo-code for a function call was really weird and confusing (before I edited the question). It didn't clearly show the boundary between the caller's code and the callee's code. In a previous version of this answer, I thought you were saying the caller's code was pushing EBP, because that came before the "working in the function" line.
EIP isn't directly accessible, and can only be modified by jump instructions (which includes CALL and RET). CALL pushes a return address and then jumps. Note that it pushes the address of the next instruction, so the CALL itself doesn't run again on return. (EIP during the execution of an instruction could be said to point at the next instruction, since relative jumps are encoded with a displacement from the end of the instruction. The same goes for x86-64 RIP-relative addresses.)
RET pops into EIP. For it to return to the right place, the code has to restore ESP to pointing at the return address pushed by the caller.
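If it helps to see that mechanism spelled out, here's a rough sketch in C of a toy machine with just an instruction pointer and a stack. Everything in it (the `Insn` and `Cpu` types, the opcodes, `run`, `demo`) is made up for illustration, and "addresses" are just array indices, so it's a model of the idea rather than of real x86 encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy instruction set: just enough to show CALL and RET. */
enum { OP_CALL, OP_RET, OP_INC, OP_HALT };

typedef struct {
    int op;
    uint32_t target;   /* only used by OP_CALL: index of the function's first insn */
} Insn;

typedef struct {
    uint32_t eip;        /* index of the next instruction to run */
    uint32_t esp;        /* index of the top of stack; the stack grows down */
    uint32_t stack[16];
    uint32_t counter;    /* stand-in for "work done", bumped by OP_INC */
} Cpu;

/* Runs prog until OP_HALT and returns how many OP_INC instructions ran. */
uint32_t run(const Insn *prog) {
    Cpu c = { .eip = 0, .esp = 16, .counter = 0 };
    for (;;) {
        Insn in = prog[c.eip];
        c.eip++;                        /* while an insn executes, EIP already points past it */
        switch (in.op) {
        case OP_CALL:
            assert(c.esp > 0);
            c.stack[--c.esp] = c.eip;   /* push the address of the insn after the call */
            c.eip = in.target;          /* then jump */
            break;
        case OP_RET:
            assert(c.esp < 16);
            c.eip = c.stack[c.esp++];   /* pop whatever ESP points at into EIP */
            break;
        case OP_INC:
            c.counter++;
            break;
        case OP_HALT:
            assert(c.esp == 16);        /* stack is balanced again at the end */
            return c.counter;
        }
    }
}

/* A five-instruction program: main code at 0..2, a "function" at 3..4. */
uint32_t demo(void) {
    const Insn prog[] = {
        { OP_CALL, 3 },   /* 0: call the function at index 3 */
        { OP_INC,  0 },   /* 1: the return address points here */
        { OP_HALT, 0 },   /* 2 */
        { OP_INC,  0 },   /* 3: function body */
        { OP_RET,  0 },   /* 4: return to index 1 */
    };
    return run(prog);
}
```

In this model `demo()` returns 2: one increment inside the function and one after returning. If `OP_CALL` pushed its own index instead of the next one, the return would land back on the call and the program would never reach the halt.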
Assuming a 32-bit stack-args calling convention like System V i386, I'd write your pseudocode as:
    # caller's code
    (optional) push ecx or whatever call-clobbered registers you want to save
    push arguments on stack
    CALL function    (pushes a return address, i.e. the addr of the insn after the call)

        # code of the called function
        (optional) push ebp (and any other call-preserved regs the function wants to use)
        working in the function
        (optional) pop ebp (and any other regs, in reverse order of pushing)
        RET    (pops the return address into EIP)

    # back in the caller's code
    add esp, 8 (for example) to clear args from the stack
    (optional) pop ecx or whatever other volatile regs you want to restore
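The same sequence can be traced in a small C model of the stack, to check that ESP and ECX really do end up where they started. This is only a sketch: `Machine`, `push32`, `pop32`, `callee_add` and `caller` are hypothetical names, the return address is a made-up constant, and the C function call stands in for the actual jump.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a 32-bit stack that grows down. All names here are hypothetical. */
enum { STACK_BYTES = 256 };

typedef struct {
    uint32_t mem[STACK_BYTES / 4];  /* stack memory, one slot per 4 bytes */
    uint32_t esp;                   /* byte offset of the current top of stack */
    uint32_t ecx;                   /* a call-clobbered register */
} Machine;

void push32(Machine *m, uint32_t v) {
    assert(m->esp >= 4);
    m->esp -= 4;
    m->mem[m->esp / 4] = v;
}

uint32_t pop32(Machine *m) {
    assert(m->esp + 4 <= STACK_BYTES);
    uint32_t v = m->mem[m->esp / 4];
    m->esp += 4;
    return v;
}

/* Callee: on entry [esp] = return address, [esp+4] and [esp+8] = the two args. */
uint32_t callee_add(Machine *m) {
    uint32_t a = m->mem[(m->esp + 4) / 4];
    uint32_t b = m->mem[(m->esp + 8) / 4];
    m->ecx = 0;          /* the callee is free to clobber ECX */
    (void)pop32(m);      /* ret: pops only the return address; the args stay on the stack */
    return a + b;        /* result "in EAX" */
}

/* Caller: follows the steps of the pseudocode in order. */
uint32_t caller(Machine *m, uint32_t x, uint32_t y) {
    push32(m, m->ecx);               /* (optional) push ecx */
    push32(m, y);                    /* push args, last one first */
    push32(m, x);
    push32(m, 0x1000);               /* call: pushes a (made-up) return address ... */
    uint32_t eax = callee_add(m);    /* ... and jumps to the callee */
    m->esp += 8;                     /* add esp, 8 to clear the two args */
    m->ecx = pop32(m);               /* (optional) pop ecx */
    return eax;
}
```

Starting from `Machine m = { .esp = STACK_BYTES, .ecx = 42 };`, calling `caller(&m, 2, 5)` returns 7 and leaves `m.esp` and `m.ecx` at their starting values. Drop the `m->esp += 8` line and the final pop reloads ECX from the slot holding the first arg instead of the saved ECX, which is exactly the kind of bug you get from forgetting to clean up the args.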
Look at the compiler-generated asm for a real function sometime, like the one below. You can try different compiler options or change the source on the Godbolt compiler explorer:
    int extern_func(int a);
    int foo() {
        int a = extern_func(2);
        int b = extern_func(5);
        return a+b;
    }
Compiled with gcc6.2 `-m32 -O3 -fno-omit-frame-pointer` to make 32-bit code which uses EBP the way you're assuming, instead of the default omit-frame-pointer mode. I could have used `-O0`, but un-optimized asm is so bloated that it sucks to read, and there's nothing confusing that gcc can do here. Also used `-fverbose-asm` to get it to mark variable names on operands.
    foo:
        push    ebp
        mov     ebp, esp                    # standard prologue
        push    ebx                         # save ebx so we have a call-preserved register
        sub     esp, 16                     # reserve stack space (nothing gets stored in it here; with the pushes, it keeps ESP 16-byte aligned at each call)
        push    2                           # the arg for the first function call
        call    extern_func
        mov     ebx, eax                    # a,  # stash the return value where it won't be clobbered by the next call
        mov     DWORD PTR [esp], 5          # just write the new arg to the stack, instead of add esp, 4 and push 5
        call    extern_func
        add     eax, ebx                    # tmp90, a  # this is a+b as the return value
        mov     ebx, DWORD PTR [ebp-4]      # ESP isn't pointing at where we pushed EBX, so restore it with a normal MOV load
        leave                               # set esp=ebp, then pop ebp
        # at this point, ESP is back to its value on entry to the function
        ret
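To convince yourself that this really leaves ESP, EBP and EBX as they were on entry, you can step through the listing by hand, or with a small C model like the sketch below. `Regs`, `push32`, `pop32`, `fake_call` and `trace_foo` are hypothetical names, `fake_call` assumes `extern_func` behaves like any well-formed callee (returns with a plain `ret` and preserves EBX and EBP), and the value stashed in EBX is a placeholder.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the stack and the registers foo touches. All names are hypothetical. */
enum { STACK_BYTES = 256 };

typedef struct {
    uint32_t mem[STACK_BYTES / 4];  /* stack memory, one slot per 4 bytes */
    uint32_t esp, ebp, ebx;         /* byte offsets / register values */
} Regs;

void push32(Regs *r, uint32_t v) {
    assert(r->esp >= 4);
    r->esp -= 4;
    r->mem[r->esp / 4] = v;
}

uint32_t pop32(Regs *r) {
    assert(r->esp + 4 <= STACK_BYTES);
    uint32_t v = r->mem[r->esp / 4];
    r->esp += 4;
    return v;
}

/* Stand-in for `call extern_func`: the call pushes a return address and the
   callee's ret pops it again. Assumed to leave EBX and EBP untouched. */
void fake_call(Regs *r) {
    push32(r, 0xdeadbeef);
    (void)pop32(r);
}

/* Steps through gcc's code for foo and returns ESP just before the final ret. */
uint32_t trace_foo(Regs *r) {
    push32(r, r->ebp);                    /* push ebp */
    r->ebp = r->esp;                      /* mov ebp, esp */
    push32(r, r->ebx);                    /* push ebx, which lands at [ebp-4] */
    r->esp -= 16;                         /* sub esp, 16 */
    push32(r, 2);                         /* push 2 */
    fake_call(r);                         /* call extern_func */
    r->ebx = 111;                         /* mov ebx, eax (placeholder return value) */
    r->mem[r->esp / 4] = 5;               /* mov DWORD PTR [esp], 5 */
    fake_call(r);                         /* call extern_func */
    r->ebx = r->mem[(r->ebp - 4) / 4];    /* mov ebx, DWORD PTR [ebp-4] */
    r->esp = r->ebp;                      /* leave, part 1: mov esp, ebp */
    r->ebp = pop32(r);                    /* leave, part 2: pop ebp */
    return r->esp;
}
```

With `Regs r = { .esp = STACK_BYTES - 4, .ebp = 0xAAAA, .ebx = 0xBBBB };` (ESP pointing at the slot that would hold foo's return address), `trace_foo(&r)` returns the same ESP it started with, and `r.ebp` and `r.ebx` are back to `0xAAAA` and `0xBBBB`. The `leave` is what makes the `sub esp, 16` and the leftover arg slot disappear without any matching `add esp`.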
clang makes some different choices about how to do things (including using `esi` instead of `ebx`), and does the epilogue with:

        add     eax, esi
        add     esp, 4
        pop     esi
        pop     ebp
        ret

So it's a more "normal" sequence: restore ESP to pointing at the registers pushed in the prologue and pop them, again leaving ESP pointing at the return address ready for RET.