Is the instruction after a RET always the one after CALL?

Question

In a well-behaved C program, shall the return statement (RET) always return to the instruction following the CALL statement? I know this is the default, but I would like to check if anyone knows or remembers authentic examples of cases where this standard does not apply (common compiler optimization or other things...). Someone told me that it could happen with a function pointer (the function pointer would put the value on the stack, instead of the CALL... I searched for it but I did not see an explanation anywhere).

Let me try to explain my question better. I know that we can use other constructs to change the execution flow (including manipulating the stack), and I understand that if we change the return address stored on the stack, execution will continue at whatever address was written there. What I need to know is: is there any ordinary execution situation in which the instruction executed after the RET is not the one that follows the CALL? I mean, I would like to be sure that this doesn't happen unless something unexpected occurs (like a memory access violation that leads to a structured exception handler).

My concern is whether commercial application programs in general ALWAYS follow this pattern. Note that I am specifically interested in exceptions to it (it is important for me to know whether any exist, for a research project I'm developing as part of an M.Sc. course). I know, for example, that a compiler may sometimes replace a CALL immediately followed by a RET with a single JMP (tail-call optimization). I would like to know whether something like this can change which instruction is executed after the RET and, mainly, whether a CALL will always sit just before the instruction executed after the RET.

You talk about C, but then immediately switch to talking about assembler instructions (and specifically x86 assembler, I assume). At the very least, you should tag your question as "x86"! — Oliver Charlesworth, Mar 09 '12 at 20:54
You're looking at the question backwards. The `ret` instruction has no idea who put the return address on the stack. Maybe it was a `call`, maybe it was `push`, maybe it was stack corruption. It is the `call` instruction that puts the address of the "instruction after the call" onto the stack. — Raymond Chen, Mar 09 '12 at 22:39
Conceptually yes, you are returning to the guy who called you just after the call was made. In real life, exceptions, stack-overflows, compiler optimisations and non-linear code paths may mean that you're returning to just after a semantic call, rather than a real one. — SecurityMatt, Mar 11 '12 at 22:39

score 2 · Answer 1 · answered Mar 10 '12 at 00:25

CALL subroutine address is equivalent to
PUSH next instruction address + JMP subroutine address.

At the same time, PUSH address is nearly equivalent to
SUB xSP, pointer size + MOV [xSP], address.

SUB xSP, pointer size can be replaced by PUSH.

RET is nearly equivalent to
JMP [xSP] followed by ADD xSP, pointer size at the location where JMP leads.

And ADD xSP, pointer size can be replaced by POP.

So, you can see what kind of basic freedom the compiler has. Oh, btw, it can optimize your code such that your function is entirely inlined and there's neither a call to it, nor a return from it.
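
The equivalences above can be sketched with a toy model in C. This is not real x86: "addresses" are just indices into an array, and every name in it (`run`, `OP_MARK`, the three sample programs) is made up for illustration. It only shows that RET resumes at whatever value is on top of the stack, whether a CALL or a plain PUSH put it there.

```c
#include <assert.h>

/* Toy model only, not real x86: "addresses" are indices into a program array. */
enum op { OP_CALL, OP_PUSH, OP_JMP, OP_RET, OP_MARK, OP_HALT };
struct insn { enum op op; int arg; };

enum { STACK_SLOTS = 16 };

/* Runs prog from index 0 and returns the argument of the last OP_MARK executed. */
static int run(const struct insn *prog) {
    int stack[STACK_SLOTS];
    int sp = 0;            /* number of occupied stack slots */
    int pc = 0;            /* index of the instruction to execute next */
    int last_mark = -1;

    for (;;) {
        struct insn in = prog[pc];
        switch (in.op) {
        case OP_CALL:      /* PUSH address of next instruction + JMP target */
            assert(sp < STACK_SLOTS);
            stack[sp++] = pc + 1;
            pc = in.arg;
            break;
        case OP_PUSH:
            assert(sp < STACK_SLOTS);
            stack[sp++] = in.arg;
            pc++;
            break;
        case OP_JMP:
            pc = in.arg;
            break;
        case OP_RET:       /* pops whatever is on top; it cannot tell who pushed it */
            assert(sp > 0);
            pc = stack[--sp];
            break;
        case OP_MARK:
            last_mark = in.arg;
            pc++;
            break;
        case OP_HALT:
            return last_mark;
        }
    }
}

/* CALL, then RET: execution resumes right after the CALL (index 1). */
static const struct insn prog_call[] = {
    {OP_CALL, 3}, {OP_MARK, 1}, {OP_HALT, 0}, {OP_RET, 0}
};

/* PUSH 2 + JMP behaves like the CALL above: RET resumes at index 2. */
static const struct insn prog_push_jmp[] = {
    {OP_PUSH, 2}, {OP_JMP, 4}, {OP_MARK, 2}, {OP_HALT, 0}, {OP_RET, 0}
};

/* PUSH of some other address: RET resumes at index 3, skipping index 2. */
static const struct insn prog_push_elsewhere[] = {
    {OP_PUSH, 3}, {OP_JMP, 5}, {OP_MARK, 2}, {OP_MARK, 3}, {OP_HALT, 0}, {OP_RET, 0}
};
```

In this model `run(prog_call)` yields 1, `run(prog_push_jmp)` yields 2 and `run(prog_push_elsewhere)` yields 3: the last program "returns" to an instruction that no CALL precedes.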

While somewhat perverse, it's not impossible to devise much weirder control transfers using instructions and techniques highly specific to the platform (CPU and OS).

You can use IRET instead of CALL and RET for control transfer, provided you put the appropriate stuff on the stack for the instruction.

You can use Windows Structured Exception Handling in such a way that an instruction that causes a CPU exception (e.g. division by 0, page fault, etc.) diverts execution to your exception handler, and from there control can be transferred back to that same instruction, to the next instruction, to the next exception handler, or to any other location. And most x86 instructions can cause CPU exceptions.

I'm sure there are other unusual ways for control transfer to, from and within subroutines/functions.

It's not uncommon to see code something like this either:

...
CALL B              ; pushes the address of what follows (the data) and jumps over it
db "some data", 0
B: CALL C           ; effectively call C with a pointer to "some data" as a parameter.
...

C:
; extracts the location of "some data" from the stack and uses it.
...
RET

Here, the first call isn't to a subroutine, it's just a way to put on the stack the address of the data stuck in the middle of the code.

This is probably what a programmer would write, not a compiler. But I may be wrong.

What I'm trying to say with all this is that you shouldn't expect to have CALL and RET as the only ways to enter and leave subroutines and you shouldn't expect them to be used for that purpose only and balance each other.

Related Q&As about asm equivalents / pseudo-code for what `call` and `ret` do: [How can I simulate a CALL instruction by using JMP?](https://stackoverflow.com/q/21248227) / [What is the x86 "ret" instruction equivalent to?](https://stackoverflow.com/a/54816685), and [Does it matter where the ret instruction is called in a procedure in x86 assembly](https://stackoverflow.com/q/46714626). Another way to explain `ret` is that it's `pop eip` / `pop rip` (depending on mode). — Peter Cordes, Jan 18 '22 at 17:38

score 2 · Answer 2 · answered Mar 10 '12 at 00:48

A "well behaved" C program could be translated by a compiler into a program that does not follow this pattern. For example, for obfuscation reasons, the code could use a push / ret combination instead of a jmp.

John

score 0 · Answer 3 · answered Mar 09 '12 at 21:07

Excluding virtual memory situations (where a RET may cause a page fault, technically meaning that the thing the RET triggers is the fault handler), I think the main thing worth discussing is that setjmp and longjmp may completely subvert the stack — so you can legitimately CALL something, then have it hop back an arbitrary number of stack frames without ever hitting the RETs.

I guess it's quite conceivable that a longjmp implementation may involve a RET with a modified stack — it'd be up to the vendor on how they wanted to implement that.
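
A minimal sketch of that in standard C, assuming nothing beyond `<setjmp.h>`; the names `checkpoint`, `returns_executed` and `run_longjmp_demo` are made up for illustration. The two outer callees are entered by ordinary calls but never execute their own return paths:

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf checkpoint;      /* hypothetical name for the saved context */
static int returns_executed;    /* counts callees that resumed after their call */

static void innermost(void) {
    longjmp(checkpoint, 1);     /* leaves innermost(), middle() and outer() at once */
}

static void middle(void) {
    innermost();
    returns_executed++;         /* never reached: innermost() does not return here */
}

static void outer(void) {
    middle();
    returns_executed++;         /* never reached either */
}

/* Returns how many callees resumed after their call; -1 if outer() returned normally. */
static int run_longjmp_demo(void) {
    returns_executed = 0;
    if (setjmp(checkpoint) == 0) {
        outer();                /* entered by a normal call ... */
        return -1;              /* ... but this line is never reached */
    }
    /* Control arrives here via longjmp, not via a return from outer(). */
    assert(returns_executed == 0);
    return returns_executed;
}
```

How `longjmp` gets there at the machine level (a RET with a patched stack, an indirect jump, or something else) is up to the implementation, as noted above.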

score 0 · Answer 4 · answered Mar 09 '12 at 23:45

> In a well-behaved C program, shall the return statement (RET) always return to the instruction following the CALL statement?

This is kind of a non sequitur because there's nothing that requires calling a function and returning from it to necessarily map to these instructions, though of course it's quite common. One example of that is when a function gets inlined.

I think it would be very unusual for an x86-targeting compiler to rig things so that a ret instruction corresponding to a return statement went somewhere other than the address following the call instruction. But that's something I think might happen occasionally on an ARM processor.

Since an ARM instruction can't always contain a full 32 bits of immediate data, it's common for constants (numeric or string) to be 'embedded' as data in the code stream so the value or a pointer to it can be loaded using a pc-relative (program counter relative) address. Usually these constants are located at a spot where no extra jump is needed just to skip over the data. One of the more common places for such data is the area between the code for two functions. But another spot where that condition holds is right after a branch made for a function call, since a branch needs to be taken in any case to get to the instructions following the call site (the return from the function). So it doesn't hurt execution time to place the data just after the call and set the return address to be the address that follows the data: the compiler loads the lr register (which is used by convention to hold the return address) with the address following the data, then issues an unconditional branch to the function. You might not see this too often, but similar techniques to place data in the code segment are common on the ARM.

score 0 · Answer 5 · answered Mar 10 '12 at 00:37

Theoretically, a compiler could, given the following code:

return f(), g();

generate assembly along the lines of:

push $g
jmp f
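
For reference, this is what the C side of that example means: `f()` is called and its result discarded, then `g()` is called and its result returned. In the assembly sketch, the `ret` at the end of `f` would land at the entry of `g`, an address no `call` precedes, and `g`'s own `ret` would go back to the original caller. The sketch below only checks the C-level order and result; `h`, `trace` and `call_h_and_check_order` are hypothetical names, and it makes no claim about what any particular compiler emits.

```c
#include <assert.h>
#include <stddef.h>

static char trace[8];       /* records the order in which f and g run */
static size_t trace_len;

static int f(void) { trace[trace_len++] = 'f'; return 1; }
static int g(void) { trace[trace_len++] = 'g'; return 2; }

/* Comma operator: evaluate f(), discard its value, then return g()'s value. */
static int h(void) {
    return f(), g();
}

/* Calls h() once with a fresh trace and returns h()'s result. */
static int call_h_and_check_order(void) {
    trace_len = 0;
    int result = h();
    trace[trace_len] = '\0';
    assert(trace_len == 2);     /* exactly two calls were recorded */
    return result;
}
```

After `call_h_and_check_order()`, `trace` holds `"fg"` and the returned value is the one produced by `g`.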

R.. GitHub STOP HELPING ICE

score 0 · Answer 6 · answered Mar 10 '12 at 03:07

Maybe. Some processors have one or sometimes two "delay slots": instructions immediately following a branch instruction (including CALL) that are executed before the branch takes effect, even though control is about to go elsewhere. This apparent nonsense was added to increase performance, since the instruction pre-fetcher has quite often fetched past the branch instruction by the time the processor realizes there is a branch. On such a machine the return address saved by a CALL is not the address following the CALL; it is the address following the delay slot instruction(s).

http://en.wikipedia.org/wiki/Delay_slot

This introduces complexity in the Instruction Set Architecture (ISA) of the machine. For example: what happens if you place a branch in a delay slot? What happens if an instruction in the delay slot causes a fault? What happens if there is a trap (like a single-step trap)? You can see it gets messy... but a surprising number of older RISC processors have delay slots, including MIPS, SPARC, and PA-RISC.
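
As a small worked example of the return-address arithmetic, here is a sketch in C that assumes fixed 4-byte instructions (as on classic MIPS and SPARC). The helper name `return_address` is made up. On MIPS, for instance, `jal` has one delay slot and records the address of the `jal` plus 8 as the return address.

```c
#include <assert.h>
#include <stdint.h>

enum { INSN_BYTES = 4 };    /* assumption: every instruction is 4 bytes long */

/* Address at which execution resumes after the callee returns, given the
   address of the call instruction and the number of delay slots after it. */
static uint32_t return_address(uint32_t call_addr, unsigned delay_slots) {
    assert(call_addr % INSN_BYTES == 0);    /* instructions are word-aligned */
    return call_addr + INSN_BYTES * (1u + delay_slots);
}
```

With no delay slot, a call at 0x1000 resumes at 0x1004, the instruction right after the call. With one delay slot it resumes at 0x1008, so the instruction executed after the return is two instructions past the call.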
