66

It's been a while since I last coded ARM assembler and I'm a little rusty on the details. If I call a C function from ARM assembler, I only have to worry about saving r0-r3 and lr, right?

If the C function uses any other registers, is it responsible for saving those on the stack and restoring them? In other words, the compiler would generate code to do this for C functions.

For example if I use r10 in an assembler function, I don't have to push its value on the stack, or to memory, and pop/restore it after a C call, do I?

This is for arm-eabi-gcc 4.3.0.

Peter Cordes
richq
  • 1
    Here is an external link that may be helpful. [APCS intro](http://www.heyrick.co.uk/assembler/apcsintro.html), especially some [different names](http://sourceware.org/ml/binutils/2000-06/msg00240.html) for `register` use. – artless noise Apr 15 '13 at 18:59

6 Answers

85

It depends on the ABI for the platform you are compiling for. On Linux, there are two ARM ABIs: the old one and the new one. AFAIK, the new one (EABI) is in fact ARM's AAPCS. The complete EABI definitions currently live here on ARM's infocenter.

From the AAPCS, §5.1.1:

  • r0-r3 are the argument and scratch registers; r0-r1 are also the result registers
  • r4-r8 are callee-save registers
  • r9 might be a callee-save register or not (on some variants of AAPCS it is a special register)
  • r10-r11 are callee-save registers
  • r12-r15 are special registers

A callee-save register must be saved by the callee (as opposed to a caller-save register, which the caller saves). So, if this is the ABI you are using, you do not have to save r10 before calling another function: the other function is responsible for saving and restoring it if it uses it.

Edit: Which compiler you are using makes no difference; gcc in particular can be configured for several different ABIs, and it can even be changed on the command line. Looking at the prologue/epilogue code it generates is not that useful, since it is tailored for each function and the compiler can use other ways of saving a register (for instance, saving it in the middle of a function).


Terminology: "callee-save" is a synonym for "non-volatile" or "call-preserved": What are callee and caller saved registers?
After a function call returns, you can assume that the values in r4-r11 (except maybe r9) are still there (call-preserved), but not the values in r0-r3 (call-clobbered / volatile).
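The register list above can be written down as a small lookup, which is sometimes handy as a reminder while writing assembler. This is only a sketch in plain C: the enum and the function names are made up for illustration, and r9 is reported as platform-dependent rather than guessed.

```c
#include <stdbool.h>

/* Roles of the 32-bit core registers as listed above (AAPCS, section 5.1.1).
 * The names are hypothetical and not part of any ABI header. */
enum reg_role {
    ROLE_ARG_SCRATCH,   /* r0-r3: arguments, results, scratch */
    ROLE_CALLEE_SAVED,  /* r4-r8 and r10-r11 */
    ROLE_PLATFORM,      /* r9: callee-saved or special, depending on the variant */
    ROLE_SPECIAL        /* r12-r15: ip, sp, lr, pc */
};

/* reg is expected to be in the range 0..15. */
enum reg_role aapcs32_role(unsigned reg)
{
    if (reg <= 3)  return ROLE_ARG_SCRATCH;
    if (reg == 9)  return ROLE_PLATFORM;
    if (reg <= 11) return ROLE_CALLEE_SAVED;
    return ROLE_SPECIAL;
}

/* True only for the registers the list calls callee-save.
 * r9 and the special registers report false because they follow their own rules. */
bool is_callee_saved(unsigned reg)
{
    return aapcs32_role(reg) == ROLE_CALLEE_SAVED;
}
```

With this, `is_callee_saved(10)` is true, which matches the answer to the r10 question, while `is_callee_saved(0)` and `is_callee_saved(12)` are false.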

Peter Cordes
CesarB
  • Thanks, this seems to ring some bells. I think the first "r0-r4" in your list is a typo, right? +1 (and probably best answer unless there's a radical turn around) – richq Nov 04 '08 at 10:52
  • Yes, it was a typo (and not the only one, but I fixed the other ones before hitting submit the first time - or so I hope). – CesarB Nov 04 '08 at 11:02
  • 1
    "You can download the whole ABI specification and its supporting documents and example code as a ZIP archive from this page." Zip Archive: http://infocenter.arm.com/help/topic/com.arm.doc.ihi0036b/bsabi.zip – jww Jun 23 '11 at 03:58
  • 3
    I think it is far easier to remember that you have to save and restore `r4-r11` if you are going to use them; that's why they are callee-saved. –  Mar 27 '12 at 09:05
  • To extend amorenoc's comment: `r4-r11` (perhaps with the exception of `r9`) can be considered "safe" when calling a function. `r0-r3` will probably not be preserved after the function call, and depending on how linking is done, neither will `r12` (which can be used as a scratch register). – Leo Apr 13 '12 at 23:47
  • 1
    The comment by Alex is confusing since it's from the callee's point of view. The question discussed here is from the caller's point of view. A caller does NOT need to save r4-r11 when calling a C function. The C function (the callee) will save these registers. Also, why does nobody clarify whether r9 needs to be saved by the caller or not? I believe for an arm-eabi-gcc toolchain, r9 is callee-saved as well. Who can point at a source of information that settles the r9 issue? – Sven Aug 13 '13 at 16:51
  • 1
    To summarize: When calling a C function, the registers r0-r3,r12 (and maybe r9) need to be saved. From my experience, gcc uses r12 as a scratch register inside a function and hence it is not callee-saved even if arm/thumb-interworking is not used. In case of interworking, the linker will generate glue code which uses r12 if an arm function calls a thumb function. – Sven Aug 13 '13 at 17:03
  • I am just going through this PCS document and have this doubt, the variable registers v1-v8 are used for saving local variables, if so, what happens when I allocate more local variables? I am not able to connect stack and these registers... – jxgn Oct 12 '15 at 17:06
32

32-bit ARM calling conventions are specified by AAPCS

From the AAPCS, §5.1.1 Core registers:
  • r0-r3 are the argument and scratch registers; r0-r1 are also the result registers
  • r4-r8 are callee-save registers
  • r9 might be a callee-save register or not (on some variants of AAPCS it is a special register)
  • r10-r11 are callee-save registers
  • r12-r15 are special registers

From the AAPCS, §5.1.2.1 VFP register usage conventions:

  • s16–s31 (d8–d15, q4–q7) must be preserved
  • s0–s15 (d0–d7, q0–q3) and d16–d31 (q8–q15) do not need to be preserved
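The VFP rule can be sketched the same way. The helper names below are hypothetical; the ranges come straight from the two bullets above, and the mapping from an s register to the d register that contains it assumes the usual AArch32 overlay, where d<n> is made of s<2n> and s<2n+1>.

```c
#include <stdbool.h>

/* s16-s31 must be preserved by the callee; s0-s15 need not be. */
bool vfp_s_is_callee_saved(unsigned s)
{
    return s >= 16 && s <= 31;
}

/* d8-d15 must be preserved; d0-d7 and d16-d31 need not be. */
bool vfp_d_is_callee_saved(unsigned d)
{
    return d >= 8 && d <= 15;
}

/* q4-q7 must be preserved; q0-q3 and q8-q15 need not be. */
bool vfp_q_is_callee_saved(unsigned q)
{
    return q >= 4 && q <= 7;
}

/* Index of the d register that overlays s<s> (valid for s in 0..31). */
unsigned vfp_d_containing_s(unsigned s)
{
    return s / 2;
}
```

The three views agree with each other: s16 lives in d8, and both are reported as callee-saved.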

Original post:
arm-to-c-calling-convention-neon-registers-to-save


64-bit ARM calling conventions are specified by AAPCS64

The General-purpose Registers section specifies which registers need to be preserved.
  • r0-r7 are parameter/result registers
  • r9-r15 are temporary registers
  • r19-r28 are callee-saved registers.
  • All others (r8, r16-r18, r29, r30, SP) have special meaning and some might be treated as temporary registers.

SIMD and Floating-Point Registers specifies Neon and floating point registers.

Pavel P
  • 1
    AAPCS64 link is broken, here is the [new one](https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst). – swineone Feb 25 '23 at 05:36
20

For 64-bit ARM, A64 (from Procedure Call Standard for the ARM 64-bit Architecture)

There are thirty-one, 64-bit, general-purpose (integer) registers visible to the A64 instruction set; these are labeled r0-r30. In a 64-bit context these registers are normally referred to using the names x0-x30; in a 32-bit context the registers are specified by using w0-w30. Additionally, a stack-pointer register, SP, can be used with a restricted number of instructions.

  • SP The Stack Pointer
  • r30 LR The Link Register
  • r29 FP The Frame Pointer
  • r19…r28 Callee-saved registers
  • r18 The Platform Register, if needed; otherwise a temporary register.
  • r17 IP1 The second intra-procedure-call temporary register (can be used by call veneers and PLT code); at other times may be used as a temporary register.
  • r16 IP0 The first intra-procedure-call scratch register (can be used by call veneers and PLT code); at other times may be used as a temporary register.
  • r9…r15 Temporary registers
  • r8 Indirect result location register
  • r0…r7 Parameter/result registers

The first eight registers, r0-r7, are used to pass argument values into a subroutine and to return result values from a function. They may also be used to hold intermediate values within a routine (but, in general, only between subroutine calls).

Registers r16 (IP0) and r17 (IP1) may be used by a linker as a scratch register between a routine and any subroutine it calls. They can also be used within a routine to hold intermediate values between subroutine calls.

The role of register r18 is platform specific. If a platform ABI has need of a dedicated general purpose register to carry inter-procedural state (for example, the thread context) then it should use this register for that purpose. If the platform ABI has no such requirements, then it should use r18 as an additional temporary register. The platform ABI specification must document the usage for this register.
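As a compact restatement of the general-purpose register list above, here is a sketch of a role lookup in plain C. The enum and function names are invented for this example, and the function only mirrors the quoted list; it does not try to model platform-specific decisions about r18.

```c
/* Roles of r0-r30 in AArch64 as quoted above. Names are hypothetical. */
enum a64_role {
    A64_PARAM_RESULT,     /* r0-r7 */
    A64_INDIRECT_RESULT,  /* r8 */
    A64_TEMPORARY,        /* r9-r15 */
    A64_INTRA_CALL,       /* r16 (IP0), r17 (IP1) */
    A64_PLATFORM,         /* r18 */
    A64_CALLEE_SAVED,     /* r19-r28 */
    A64_FRAME_POINTER,    /* r29 */
    A64_LINK              /* r30 */
};

/* reg is expected to be in the range 0..30. */
enum a64_role a64_gpr_role(unsigned reg)
{
    if (reg <= 7)  return A64_PARAM_RESULT;
    if (reg == 8)  return A64_INDIRECT_RESULT;
    if (reg <= 15) return A64_TEMPORARY;
    if (reg <= 17) return A64_INTRA_CALL;
    if (reg == 18) return A64_PLATFORM;
    if (reg <= 28) return A64_CALLEE_SAVED;
    if (reg == 29) return A64_FRAME_POINTER;
    return A64_LINK;
}
```

Anything that is not `A64_CALLEE_SAVED` should be treated with care by a caller: the temporaries and IP0/IP1 in particular may hold different values after a call returns.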

SIMD

The ARM 64-bit architecture also has a further thirty-two registers, v0-v31, which can be used by SIMD and Floating-Point operations. The precise name of the register will change indicating the size of the access.

Note: Unlike in AArch32, in AArch64 the 128-bit and 64-bit views of a SIMD and Floating-Point register do not overlap multiple registers in a narrower view, so q1, d1 and s1 all refer to the same entry in the register bank.

The first eight registers, v0-v7, are used to pass argument values into a subroutine and to return result values from a function. They may also be used to hold intermediate values within a routine (but, in general, only between subroutine calls).

Registers v8-v15 must be preserved by a callee across subroutine calls; the remaining registers (v0-v7, v16-v31) do not need to be preserved (or should be preserved by the caller). Additionally, only the bottom 64-bits of each value stored in v8-v15 need to be preserved; it is the responsibility of the caller to preserve larger values.
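The "bottom 64 bits only" rule is easy to get wrong, so here is a small sketch of it in plain C. The function names are hypothetical, and `live_bytes` is an assumed parameter meaning "how many low-order bytes of the register hold a value the caller still needs after the call".

```c
#include <stdbool.h>

/* Number of low-order bytes of v<n> that a callee must preserve:
 * 8 bytes (the bottom 64 bits) for v8-v15, nothing for the rest. */
unsigned a64_v_callee_preserved_bytes(unsigned v)
{
    return (v >= 8 && v <= 15) ? 8u : 0u;
}

/* True if the caller has to save v<n> itself to keep a value that is
 * live_bytes wide alive across a call. */
bool a64_v_caller_must_save(unsigned v, unsigned live_bytes)
{
    return live_bytes > a64_v_callee_preserved_bytes(v);
}
```

For example, a double held in v8 (8 bytes) survives a call, but a full 128-bit vector in v8 (16 bytes) does not, and neither does anything held in v0-v7 or v16-v31.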

auselen
10

The answers of CesarB and Pavel provided quotes from AAPCS, but open issues remain. Does the callee save r9? What about r12? What about r14? Furthermore, the answers were very general, and not specific to the arm-eabi toolchain as requested. Here's a practical approach to find out which registers are callee-saved and which are not.

The following C code contains an inline assembly block that claims to modify registers r0-r12 and r14. The compiler will generate the code to save the registers required by the ABI.

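void foo(void) {
  /* The clobber list claims that r0-r12 and r14 are modified, so the
     compiler must save every one of them that the ABI says foo() has to preserve. */
  asm volatile ( "nop" : : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8", "r9", "r10", "r11", "r12", "r14");
}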

Use the command line arm-eabi-gcc-4.7 -O2 -S -o - foo.c and add the switches for your platform (such as -mcpu=arm7tdmi). The command prints the generated assembly code on stdout. It may look something like this:

foo:
    stmfd   sp!, {r4, r5, r6, r7, r8, r9, sl, fp, lr}
    nop
    ldmfd   sp!, {r4, r5, r6, r7, r8, r9, sl, fp, lr}
    bx  lr

Note that the compiler-generated code saves and restores r4-r11 but not r0-r3 or r12. That it restores r14 (alias lr) is purely accidental: I know from experience that the exit code may also load the saved lr into r0 and then do a "bx r0" instead of "bx lr". Adding either -mcpu=arm7tdmi -mno-thumb-interwork or -mcpu=cortex-m4 -mthumb gives slightly different assembly code that looks like this:

foo:
    stmfd   sp!, {r4, r5, r6, r7, r8, r9, sl, fp, lr}
    nop
    ldmfd   sp!, {r4, r5, r6, r7, r8, r9, sl, fp, pc}

Again, r4-r11 are saved and restored. But r14 (alias lr) is not restored: its saved value is popped straight into pc to return.

To summarize:

  • r0-r3 are not callee-saved
  • r4-r11 are callee-saved
  • r12 (alias ip) is not callee-saved
  • r13 (alias sp) is callee-saved
  • r14 (alias lr) is not callee-saved
  • r15 (alias pc) is the program counter; on return it is set to the value lr held when the function was called

This holds at least for arm-eabi-gcc's defaults. There are command line switches (in particular the -mabi switch) that may influence the results.
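The reasoning behind this experiment can also be written as plain bit arithmetic: take the registers named in the clobber list, remove the ones that appear in the generated stmfd register list, and what remains is what a caller cannot rely on. The sketch below only re-encodes the listings shown above (one bit per register number); the function names are made up for illustration.

```c
#include <stdint.h>

/* One bit per core register, bit n standing for r<n>. */
#define REG_BIT(n) ((uint32_t)1u << (n))

/* Registers named in the asm clobber list: r0-r12 and r14. */
uint32_t clobbered_mask(void)
{
    uint32_t mask = 0;
    for (unsigned r = 0; r <= 12; r++)
        mask |= REG_BIT(r);
    return mask | REG_BIT(14);
}

/* Register list of the generated prologue: {r4-r9, sl (r10), fp (r11), lr (r14)}. */
uint32_t observed_saved_mask(void)
{
    uint32_t mask = 0;
    for (unsigned r = 4; r <= 11; r++)
        mask |= REG_BIT(r);
    return mask | REG_BIT(14);
}

/* Clobbered but never saved: the caller must not expect these to survive. */
uint32_t unsaved_clobbers(void)
{
    return clobbered_mask() & ~observed_saved_mask();
}
```

Here `unsaved_clobbers()` has only the bits for r0-r3 and r12 set. Note that lr appears in the saved mask only because the function needs its own return address back; as the second listing shows, that does not mean lr holds its old value afterwards.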

Sven
  • 2
    Your analysis is *incorrect*; the `lr` is **popped** as the `pc` for a quicker way to return. The answer to your `r9` question is in the [APCS](http://www.cl.cam.ac.uk/~fms27/teaching/2001-02/arm-project/02-sort/apcs.txt). It is called *static base* in this document and the section *Reentrant vs Non-Reentrant Code* is relevant. The **APCS** supports several configurations, but `gcc` is generally *re-entrant* without *stack limits*. In particular, *There are dedicated roles for `sb/r9` and `sl/r10` in some variants of the APCS. In other variants they may be used as callee-saved registers* – artless noise Oct 27 '13 at 23:16
  • See [ARM link and frame pointer](http://stackoverflow.com/questions/15752188/arm-link-register-and-frame-pointer) for details on `pc` and `lr`. `r12` is also known as `ip` and can be used during a *prologue* and *epilogue*. It is a *volatile* register. This is important for routines which are parsing the call stack/frames. – artless noise Oct 27 '13 at 23:22
  • In what sense is my analysis concerning `lr` incorrect? I think you misread me. Anyhow, I was presenting the second assembly code snippet as the first one looked like `lr` was callee saved. However, I think it is not. Yes, in the second snippet, `lr` is popped as `pc` as a quicker way to return and I did not explain that, but the point of presenting the second snippet was that it shows that `lr` is not callee saved. – Sven Aug 27 '14 at 14:01
  • 1
    It is true that `lr` is restored to `pc`. But it is not true, that one can expect that the value of `lr` itself is restored. I don't see how this can be wrong. That the value ends up in a register that is not `lr` is completely irrelevant to the question whether `lr` is restored or not. You are right that the set of registers which is restored and is not restored may change as the `-mabi` option changes. – Sven Aug 29 '14 at 20:07
  • I see you have a point in a hypothetical sense; You could write some assembler that returned a function pointer in `lr`. However, I don't see what having the original value of `lr` would do for you. You are executing the code that was in the `lr` upon return, so its original value is explicit by the code executing. Well summed up by Pavel as *r12-r15 are special registers*. The value of the `lr` on calling will be the value of the `pc` on exit. The question of whether the `lr` is restored or not seems bizarre to me. It depends where and whether leaf or not. – artless noise Aug 30 '14 at 00:32
  • Right now, I'm writing an assembler wrapper for nested interrupts on ARM7TDMI. Whether the value of `lr` is preserved by a call to C code is important, as the value of `lr` must be restored to its prior value by the assembler wrapper. So knowing whether `lr` is callee-saved just isn't all that hypothetical. – Sven Apr 02 '15 at 12:12
  • Ah, the registers are banked for interrupts for just this reason. Anyways we agree to disagree. – artless noise Apr 02 '15 at 14:17
  • 2
    This is exactly what I was looking for -- a way to find out which registers are preserved by the specific compiler settings that I am using for my project. Thank you! – TonyK Feb 06 '17 at 21:44
  • @artlessnoise: If there were a convention that functions should return with the same value in LR as they were invoked with, then something like `if (foo) do { } while(bar());` could be processed as `bl foo / test r0,r0 / bne bar`, and the return from `bar` would transfer execution back to the instruction after the `bl`. In practice, situations where guaranteeing the value in r14 on a function return would be useful are sufficiently rare that it's better to allow it to hold an arbitrary value. – supercat Nov 05 '20 at 22:40
4

According to ARM's aapcs32 and aapcs64, the register usage can be summarized like this:

[image]

crifan
0

At least on the Cortex-M3 architecture, there is also a difference between a function call and an interrupt.

When an interrupt occurs, the hardware automatically pushes R0-R3, R12, LR, PC and xPSR onto the stack, and automatically pops them again on return from the IRQ. If you use other registers in the IRQ routine, you have to push and pop them manually.

This automatic push and pop does not happen for a function call (a branch instruction). Since the convention says R0-R3 are only argument, result or scratch registers, there is no need to store them in the called function, because no value in them should be relied on after the function returns. But, as in an interrupt, you have to save and restore all other CPU registers if you use them in your function.
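For reference, the automatically stacked registers can be described as a C struct. This is a sketch of the basic Cortex-M3 exception frame (no floating-point state), lowest address first; the struct and function names are made up, and the final xPSR word is included because the hardware stacks it together with the registers named in this answer.

```c
#include <stddef.h>
#include <stdint.h>

/* Basic frame pushed by the hardware on exception entry, lowest address first.
 * The struct name and the helper below are hypothetical. */
struct exception_frame {
    uint32_t r0;
    uint32_t r1;
    uint32_t r2;
    uint32_t r3;
    uint32_t r12;
    uint32_t lr;    /* lr of the interrupted code */
    uint32_t pc;    /* address to resume at after the exception returns */
    uint32_t xpsr;  /* program status register */
};

/* Address the interrupted code will resume at, given a pointer to its frame. */
uint32_t frame_return_address(const struct exception_frame *frame)
{
    return frame->pc;
}
```

The frame is eight words (32 bytes). Everything outside it, r4-r11 in particular, is left to the handler, which is why a handler written as an ordinary C function works: the compiler already saves r4-r11 whenever the function uses them.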

Frik