All that matters is what's on the stack when execution reaches the top of the function you want to call. It doesn't matter how or in what order it got there, just that ESP points at a return address, ESP+4 points at the first stack arg, and so on.
A function also doesn't know or care whether you reached it with call, or with a jmp tailcall, or even with a jae conditional tailcall.
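As a minimal sketch of that point (NASM syntax, 32-bit code, args on the stack as in cdecl; the names halve, wrapper, and halve_if_big are made up for illustration), the same callee can be entered by an unconditional or a conditional jump and still finds its return address and arg where it expects them:

```nasm
; Hypothetical callee: unsigned halve(unsigned x)
halve:
    mov   eax, [esp+4]      ; first stack arg
    shr   eax, 1
    ret                     ; pops whatever return address ESP points at

; unsigned wrapper(unsigned x) { return halve(x); }
wrapper:
    jmp   halve             ; tailcall: [esp] is still our caller's return
                            ; address and [esp+4] is still x

; unsigned halve_if_big(unsigned x) { return x >= 100 ? halve(x) : x; }
halve_if_big:
    mov   eax, [esp+4]
    cmp   eax, 100
    jae   halve             ; conditional tailcall (unsigned >=)
    ret                     ; branch not taken: return x unchanged in EAX
```

In both jump cases halve's ret returns straight to the original caller, because nothing was pushed after that caller's call instruction.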
You don't even have to use push at all: you could sub esp, 24 at the top of a function and just use mov to store your args. (This is what gcc -m32 -maccumulate-outgoing-args does, which used to be good on old CPUs without a stack engine, where push wasn't as efficient.) See: Why does gcc use movl instead of push to pass function args?
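A rough illustration of the two approaches (NASM syntax, 32-bit code, cdecl-style caller-pops convention assumed; add2, caller_push, and caller_mov are hypothetical names, and stack alignment is ignored to keep it short). Both callers produce identical state on entry to add2:

```nasm
; Hypothetical callee: int add2(int a, int b)
add2:
    mov   eax, [esp+4]          ; a
    add   eax, [esp+8]          ; b
    ret

caller_push:
    push  2                     ; second arg first (right to left)
    push  1                     ; first arg ends up at the lower address
    call  add2                  ; on entry: [esp+4] = 1, [esp+8] = 2
    add   esp, 8                ; caller pops its args
    ret

caller_mov:
    sub   esp, 8                ; reserve space for outgoing args
    mov   dword [esp], 1        ; first arg
    mov   dword [esp+4], 2      ; second arg
    call  add2                  ; on entry: [esp+4] = 1, [esp+8] = 2
    add   esp, 8                ; release the arg space
    ret
```

A real function using the mov style would typically do one larger sub esp at the top and reuse that space for every call it makes, rather than adjusting ESP around each call.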
(Of course more efficient calling conventions pass args in registers, only using the stack if there are more than 2 or 3 integer/pointer args. But same difference, the calling convention specifies required state on entry to a function, not how you make that happen.)
Since you even had to ask this question, remember that the CPU is basically a state machine. Every instruction has its documented effect on the architectural state (register and memory contents, including special regs like EFLAGS and the instruction pointer). Other than that, there is no context. It doesn't matter how you reached a state, only that you're in it.
Context matters for performance for things like partial-register stalls, store-forwarding stalls, branch prediction, and in general for overlapping execution of multiple instructions. But not for correctness.
(I'm ignoring Spectre / Meltdown exploits which create known microarchitectural state and then read that into architectural state.)