The CPU has no idea what a function is. The ret instruction simply fetches the value from the memory pointed to by esp and jumps there. For example, you can do things like this (to illustrate that the CPU is not interested in how you structurally organize your source code):
; slow alternative to "jmp continue_there_address"
push continue_there_address
ret
continue_there_address:
...
Also, you don't need to restore the registers from the stack (let alone restore them into their original registers). As long as esp points to the return address when ret is executed, that address will be used:
call SomeFunction
...
SomeFunction:
push eax
push ebx
push ecx
add esp,8 ; discard the last 2 pushes (ecx and ebx)
pop ecx ; ecx = original eax
ret ; returns back after call
If your function should be callable from other parts of the code, you may still want to save and restore registers as required by the calling convention of the platform you are programming for, so that from the caller's point of view you do not modify a register whose value should be preserved. But none of that concerns the CPU when it executes ret: the CPU just loads the value from the stack ([esp]) and jumps there.
Also, when the return address is stored on the stack, it does not differ in any way from the other values pushed there; all of them are just values written to memory. So ret has no chance of somehow finding the "return address" on the stack and skipping the other "values". To the CPU, the values in memory all look the same: each 32-bit value is just that, a 32-bit value. Whether it was stored by call, push, mov, or something else doesn't matter; that information (the origin of the value) is not stored, only the value.
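To make that concrete, here is a minimal sketch in which the "return address" is written with a plain mov and no call is ever executed. It assumes a 32-bit Linux target and NASM syntax; the label target, the file name and the build command in the comment are my own illustrative choices, not anything from the question.

```nasm
; ret_from_mov.asm (hypothetical file name), 32-bit Linux, NASM syntax
; assumed build: nasm -f elf32 ret_from_mov.asm && ld -m elf_i386 -o ret_from_mov ret_from_mov.o
        global  _start
        section .text

_start:
        sub     esp, 4                  ; reserve one 32-bit stack slot by hand
        mov     dword [esp], target     ; write an address into it with mov, not call/push
        ret                             ; ret loads [esp], adds 4 to esp, and jumps to target

target:                                 ; hypothetical label
        mov     eax, 1                  ; sys_exit
        xor     ebx, ebx                ; exit status 0
        int     0x80
```

The ret here behaves exactly as it would after a call, because the only thing it looks at is the 32-bit value at [esp].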
If that's the case, can't we just use push and pop instead of call and ret?
You can certainly push the preferred return address onto the stack (my first example). But you can't do pop eip; there's no such instruction. Actually, that is what ret does, so it is effectively a "pop eip", but no x86 assembly programmer uses such a mnemonic, and its opcode differs from those of the other pop instructions. You can of course pop the return address into a different register, like eax, and then do jmp eax, to get a slow ret alternative (one that also modifies eax).
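A minimal sketch of that pop/jmp replacement for ret, again assuming a 32-bit Linux target and NASM syntax; the label slow_return, the file name and the build command in the comment are hypothetical and only there to make the example complete.

```nasm
; slow_ret.asm (hypothetical file name), 32-bit Linux, NASM syntax
; assumed build: nasm -f elf32 slow_ret.asm && ld -m elf_i386 -o slow_ret slow_ret.o
        global  _start
        section .text

_start:
        call    slow_return     ; pushes the address of the next instruction
        mov     eax, 1          ; sys_exit (reached once slow_return has "returned")
        xor     ebx, ebx        ; exit status 0
        int     0x80

slow_return:                    ; hypothetical label
        pop     eax             ; eax = return address; esp is back to its pre-call value
        jmp     eax             ; same destination as ret, but eax has been clobbered
```

Functionally this ends up in the same place as a plain ret, at the cost of one register.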
That said, complex modern x86 CPUs do keep some track of call/ret pairings (to predict where the next ret will return, so they can prefetch the code ahead quickly). If you use one of those non-standard alternatives, at some point the CPU will realize that its return-address prediction is out of sync with the real state, and it will have to discard that speculatively fetched work and re-fetch everything from the real eip value, so you may pay a performance penalty for confusing it.