Why no full context save on subroutine call?

Question

On a subroutine call we save the contents of the PC so that we can resume the calling routine. But what happens if the called subroutine changes the value of general purpose registers? Doesn't it cause a problem for the calling routine if it has to access the old values stored in the registers?

@dwelch: There are enough SO questions from ASM newbies that write functions wrapped in `pusha / pushf` ... `popf / popa` to show that many people don't understand calling conventions, esp. in 32bit where args aren't passed in registers. I didn't upvote it, because I agree it's not a good question. I think it's possible to write an interesting answer, but I wouldn't have bothered if not for the bounty. — Peter Cordes, Jan 18 '16 at 20:06
so this is asked and answered elsewhere? close as a duplicate then — old_timer, Jan 18 '16 at 21:21
@dwelch: more like, now this exists for us to point at when mentioning this as a side-note about someone's nasty code. I don't remember it being the main question before, just that reading ugly code makes me crazy when people are asking about something else. — Peter Cordes, Jan 18 '16 at 22:54

Cornstalks · Answer 1 · 2016-01-18T21:41:05.497

But what happens if the called subroutine changes the value of general purpose registers?

It depends on which registers the subroutine modifies. Depending on the calling convention, there is a list of registers that subroutines are contractually obligated to not modify (and another list of registers that the subroutine is free to modify).

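If a subroutine does not honor this contract and modifies registers it shouldn't have, the caller may go on to use corrupted values, leading to wrong results or crashes that can be hard to trace back to the call.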

If a subroutine wants to use the registers it is obligated to not modify, it must first save those register values to the stack. Once the register values have been saved, it can then use the registers for new values. When the subroutine is finished, it must use the saved values on the stack to restore the original register values. This way, the subroutine can use the registers as it wants, but to the caller, there are no observable modifications to the registers.
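The save/use/restore sequence described above can be sketched with a toy model in C. This is only an illustration, not real machine code: the `ToyCpu` struct, its two registers, and the helper names are all hypothetical, and `rbx` is simply assumed to be on the "must preserve" list.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model, not real hardware: one call-preserved register (rbx),
 * one call-clobbered register (rax), and a small stack. */
typedef struct {
    uint64_t rbx;
    uint64_t rax;
    uint64_t stack[16];
    size_t depth;  /* number of values currently on the stack */
} ToyCpu;

static void toy_push(ToyCpu *cpu, uint64_t value) { cpu->stack[cpu->depth++] = value; }
static uint64_t toy_pop(ToyCpu *cpu) { return cpu->stack[--cpu->depth]; }

/* Hypothetical callee that wants rbx as a scratch accumulator.
 * Leaves 1 + 2 + ... + n in rax. */
static void callee_sum_to(ToyCpu *cpu, uint64_t n) {
    toy_push(cpu, cpu->rbx);   /* save the caller's value first */
    cpu->rbx = 0;              /* now rbx is free to use */
    for (uint64_t i = 1; i <= n; i++) {
        cpu->rbx += i;
    }
    cpu->rax = cpu->rbx;       /* result goes in the clobberable register */
    cpu->rbx = toy_pop(cpu);   /* restore before returning */
}
```

From the caller's side, whatever was in `rbx` before `callee_sum_to` is still there afterwards, and the stack is back to its original depth.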

Doesn't it cause a problem for the calling routine if it has to access the old values stored in the registers?

As long as the subroutine follows the calling conventions, no. If the subroutine does not, and it destroys (or "clobbers") the original values in the "preserved" registers, then yeah, it'll cause problems.

Not all registers must be preserved, though. Depending on the calling convention, some registers can be modified by the subroutine. If these registers are important to the caller, then the caller must save these registers to the stack before calling the subroutine, and then use the stack to restore the register values after calling the subroutine.
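The caller-side case can be sketched the same way. Again this is a toy model in C under stated assumptions: `Machine`, `save_reg`, `restore_reg`, and both functions are hypothetical names, and `rcx` is assumed to be a register the convention lets the callee overwrite.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model: rax holds return values, rcx is assumed call-clobbered. */
typedef struct {
    uint64_t rax;
    uint64_t rcx;
    uint64_t stack[16];
    size_t depth;  /* number of values currently on the stack */
} Machine;

static void save_reg(Machine *m, uint64_t value) { m->stack[m->depth++] = value; }
static uint64_t restore_reg(Machine *m) { return m->stack[--m->depth]; }

/* Callee overwrites rcx without saving it; the convention allows that. */
static void callee_double(Machine *m, uint64_t x) {
    m->rcx = x;
    m->rax = m->rcx + m->rcx;
}

/* Caller still needs its rcx value after the call, so saving it is
 * the caller's job. */
static uint64_t caller(Machine *m) {
    m->rcx = 7;
    save_reg(m, m->rcx);        /* save before the call */
    callee_double(m, 21);       /* rcx is now 21, rax is 42 */
    m->rcx = restore_reg(m);    /* restore after the call */
    return m->rax + m->rcx;
}
```

If the caller did not need `rcx` after the call, it would skip the save and restore entirely, which is the point of splitting the registers into two lists.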

Peter Cordes · Answer 2 · 2021-12-09T07:32:10.363

There are two conflicting needs:

called functions (callees) need scratch registers to do their work.
callers need some state to survive across a function call.

It would be slow if either the caller had to save/restore everything it wanted to keep, or the callee had to save/restore every register it used.

The solution to the conflicting needs is for a calling convention (part of an ABI) to define which registers are call preserved, and which can be clobbered. The call-clobbered registers might not actually be clobbered by the specific function that's called, but the caller must assume they are.

Why not store function parameters in XMM vector registers? - My answer there considers the tradeoff between having too many argument-passing registers and not enough call-preserved registers.
What are callee and caller saved registers? - The standard CS terms for "call clobbered" and "call preserved" are the confusing/misleading "caller saved" and "callee saved", which incorrectly imply that code somewhere is actually saving every register around every function call, like a context switch. My answer there clears that up.

A function that calls another function in a loop will usually keep its loop counter and some other things in a call-preserved register. The called function will either not use the register at all, or will have saved/restored it.
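As a rough C-level illustration of that pattern (the function names `scale` and `sum_scaled` are made up for this sketch), consider a loop whose counter, bound, and running total all have to survive each call:

```c
#include <stddef.h>

/* Hypothetical callee; imagine it lives in a separately compiled file
 * so the compiler cannot inline it into the loop. */
long scale(long x) {
    return 3 * x;
}

/* i, n, and sum must all survive every call to scale(). A compiler
 * targeting a convention with call-preserved registers will typically
 * keep them in such registers, which means sum_scaled() itself has to
 * save and restore those registers once, in its own prologue and
 * epilogue, rather than around every call. */
long sum_scaled(size_t n) {
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += scale((long)i);
    }
    return sum;
}
```

Which registers a compiler actually picks depends on the target and optimization level; with both functions in one file it may well inline `scale` and make the question moot, so looking at the generated asm for a separately compiled callee is the way to check.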

If the calling function has more state than there are call-preserved registers, it has to "spill" some of this state to memory (i.e. save/restore it). Ideally, it can spill unmodified values that don't need saving before the call, only reloading after (e.g. the base address of a non-static array or struct). This is more efficient than putting a round-trip to memory into the dependency chain for something like the loop counter. (This matters if the called function only takes a couple cycles, but couldn't be inlined because it was compiled separately. It also simply saves instructions / code-size.)

x86 has many different calling conventions. See the x86 tag wiki for links. Agner Fog has a good guide to this subject.

In the x86-64 System V ABI (used by Linux and OS X):

call-preserved: RBX, RBP, RSP, and R12–R15
call-clobbered: XMM0–15, flags, and all other integer registers (including R11, which isn't used for passing anything, so it can be used as a scratch reg by wrapper/shim functions).

It's possible to make non-ABI-compliant functions if you want, where the caller knows which registers the called function actually clobbers. Compilers can do this when using options like gcc -fwhole-program, or link-time optimization. Normally compilers always make ABI-compliant functions, since they don't know for sure that the definition they emitted will be the one that's used at link time. Obviously hand-written ASM can do anything, but doing this by hand for anything but a small set of functions is a maintenance nightmare.
