
Take the trivial C program below as an example. main() calls sum(), passing in 4 integers. sum() uses 4 locals.

void sum(int a, int b, int c, int d);

void main(void)
{
    sum(11, 12, 13, 14);
}

void sum(int a, int b, int c, int d)
{
    int x;
    int y;
    int z;
    int z2;

    x = a;
    y = b;
    z = c;
    z2 = d;
}

On my Ubuntu server 12.04.04 LTS I compile this program using

arm-linux-gnueabi-gcc -S -mthumb func.c

sum:
@ args = 0, pretend = 0, frame = 32
@ frame_needed = 1, uses_anonymous_args = 0
@ link register save eliminated.
push    {r7}
sub sp, sp, #36    <===   why is this 36 and not 32 bytes?
add r7, sp, #0

str r0, [r7, #12]
str r1, [r7, #8]
str r2, [r7, #4]
str r3, [r7, #0]   <- parameters passed

ldr r3, [r7, #12]
str r3, [r7, #16]  <- locals
ldr r3, [r7, #8]
str r3, [r7, #20]
ldr r3, [r7, #4]
str r3, [r7, #24]
ldr r3, [r7, #0]
str r3, [r7, #28]

add r7, r7, #36
mov sp, r7
pop {r7}
bx  lr

It appears that ints are 4 bytes each. 4 locals and 4 arguments for the function make a total of (4 * 4 bytes) + (4 * 4 bytes) = 32 bytes, and this matches the assembly output "frame = 32".

But why does the stack pointer get decremented by 36 and not just 32?

Andy Fusniak
  • The return address, I presume. – David Schwartz Mar 09 '14 at 08:20
  • @DavidSchwartz I think the return address is held in a dedicated register, the link register lr – Andy Fusniak Mar 09 '14 at 08:26
  • The newer ARM ABI wants the stack to be 64-bit aligned. That is why you will see dummy pushes in compiled code, pushing r3 for example when it is never used and doesn't need to be preserved. Perhaps that is what is going on here: 32 would be aligned and 36 is not, but the push of r7 makes it aligned again. If this were the answer, though, I would have expected a two-word push and a 32 offset... – old_timer Mar 09 '14 at 18:00
  • I don't see the return address because there is no call here, so the compiler wouldn't waste time worrying about it... What version of gcc? Perhaps just look at the source code to see what it did and why. – old_timer Mar 09 '14 at 18:02
  • arm-linux-gnueabi-gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3 – Andy Fusniak Mar 09 '14 at 18:32
  • See also http://stackoverflow.com/questions/20071466/aliging-a-stack-pointer-8-byte-from-4-byte-in-arm-assembly – Jonathan Ben-Avraham Mar 09 '14 at 19:55
  • Your code is nonsensical. When I compile it, it is just `bx lr`. You don't use any calculated values in `sum()`, so the entire routine may be eliminated. You cannot be compiling with any optimizations (or have not specified any). A compiler can always reserve more than needed. – artless noise Mar 10 '14 at 16:47
  • For instance, I changed `sum()` to `return x + y + z - z2;` and `gcc` at `-O3` converted it to `movs r0,#22`! Stack rationales (and any rationales in regard to code generation) depend on different **compiler options**. – artless noise Mar 10 '14 at 16:51
  • The code is compiled using arm-linux-gnueabi-gcc -S -mthumb func.c. The code above was not intended to return any value, since return values have no bearing on the stack frame in this context. It was merely meant to demonstrate that the frame size didn't match the SP subtraction. It turns out that the AAPCS wants the stack aligned to 64-bit boundaries, as answered by @auselen, and this appears to be the correct answer. – Andy Fusniak Mar 10 '14 at 18:17
  • Potential duplicate of [ARM: Why do I need to push/pop two registers at function calls?](https://stackoverflow.com/q/16120123) – Peter Cordes Mar 03 '22 at 06:30

1 Answer

The Procedure Call Standard for the ARM Architecture (AAPCS) requires the stack to be 8-byte aligned at public interfaces.

5.2.1.2 Stack constraints at a public interface

The stack must also conform to the following constraint at a public interface:

  • SP mod 8 = 0. The stack must be double-word aligned.

Since you are producing assembly, everything is exported by default, so you get 8-byte alignment. (I tried this, and gcc doesn't add a .global <symbol> directive for static functions when generating assembly. I guess this means either that even a static function is treated as a public interface, or that gcc simply gives every function 8-byte stack alignment.)

You can use -fomit-frame-pointer to skip pushing r7; gcc should then leave the stack depth at 32.

auselen
  • Does this mean the compiler never leaves shadow space for the return address, as we first thought in the previous answer? And to clarify, does a push {r7} cause the SP to auto-decrement by 4 bytes, so that the compiler decrements an extra 4 whilst setting up space for the stack frame to align to the (SP mod 8 = 0) boundary, and that when calling a leaf function (as shown in the previous answer) the push {r7, lr} is 8 bytes and thus already aligned? Just checking I understood it correctly. – Andy Fusniak Mar 09 '14 at 18:48
  • Return is via r0, not via the stack. Yes, push decreases sp as you would expect. Yes, with fp (r7) + lr it gets aligned at 40. – auselen Mar 09 '14 at 18:54
  • Thanks. By return address, I mean the pointer to the return function. I watched a seminar about the x86 frame stack and the lecturer kept referring to 'Saved PC' (presumably the x86 equiv to the LR on ARM) being on the stack frame. Seems ARM has a dedicated register lr which always holds the return address. I've seen it pushed on the stack, but it would seem from your explanation of the 8-byte alignment that the lr is never part of the stack frame at all- pushed separately? – Andy Fusniak Mar 09 '14 at 19:02
  • http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka4127.html, and http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ihi0042e/index.html – Jonathan Ben-Avraham Mar 09 '14 at 19:51
  • You can imagine that at machine level there is no such thing as a frame (at least for ARM). As you say, ARM has a link register, but it also has instructions related to it, BL/BLX (Branch with link; Branch with link, and exchange instruction set), which "copy the address of the next instruction into LR (R14, the link register)". So the only thing that matters is that if you branch to an address (let's say a function) you can return via LR, but if you want to branch to another address and still be able to return to the first place, you should save LR first (push it to the stack). – auselen Mar 09 '14 at 20:38
  • this looks like a nice read: http://stackoverflow.com/questions/15752188/arm-link-register-and-frame-pointer – auselen Mar 09 '14 at 20:39