Arrays memory allocation on stack

Question

I have 2 functions in C:

void func1(unsigned char x)
{
    unsigned char a[10][5];

    a[0][0] = 1;
    a[9][4] = 2;
}

void func2(unsigned char x)
{
    unsigned char a[10][5];

    a[0][0] = 1;
    a[9][4] = 2;

    unsigned char b[10];

    b[0] = 4;
    b[9] = 5;
}

Compiled with gcc 7.3 (x86-64), flags -O0 -g, on Ubuntu 16.04.1 x86-64.

Generated x86-64 assembly for the functions (AT&T syntax):

func1(unsigned char):
  pushq %rbp
  movq %rsp, %rbp
  movl %edi, %eax
  movb %al, -68(%rbp)
  movb $1, -64(%rbp)
  movb $2, -15(%rbp)
  nop
  popq %rbp
  ret

func2(unsigned char):
  pushq %rbp
  movq %rsp, %rbp
  movl %edi, %eax
  movb %al, -84(%rbp)
  movb $1, -64(%rbp)
  movb $2, -15(%rbp)
  movb $4, -74(%rbp)
  movb $5, -65(%rbp)
  nop
  popq %rbp
  ret

I can see that 64 bytes were allocated for the 50-byte array. It seems that the stack is being allocated on a 16-byte boundary, since 16 bytes were allocated for the 10-byte array.

My questions:

1) Is there a standard that requires arrays on the stack to be aligned on a 16-byte boundary? Variables like int, long int, etc. clearly aren't allocated on a 16-byte boundary.

2) Why are the arrays allocated in such a weird way? For array a, 14 bytes of alignment padding are added right after our payload of 50 bytes, but for array b, 6 alignment bytes are allocated before our payload of 10 bytes. Am I missing something?

3) Why is the function argument passed in EDI (unsigned char x) placed on the stack 4 bytes below the start of the arrays' memory (including padding)? Are byte variables (the AL register) also padded? Why 4 bytes?

Different C compilers might output different code. On the other hand inspecting an unoptimized build is futile at best. Besides that, your question has merit from a practical point of view. — DeiDei, Apr 07 '18 at 13:08
On 2) & 3): Option `-O0` means *"Just generate the code as quickly as possible. Don't waste any time considering options to improve the code."* And the code sure looks like it could use some improvements... — Bo Persson, Apr 07 '18 at 13:26
Alignment of `rsp` itself is maintained, so if a function needs that much alignment for anything (e.g. loading/storing an SSE vector) it can get it for free. [Why does System V / AMD64 ABI mandate a 16 byte stack alignment?](https://stackoverflow.com/questions/49391001/why-does-system-v-amd64-abi-mandate-a-16-byte-stack-alignment/49397524#49397524). That of course doesn't imply that every object on the stack is 16-byte aligned. GCC does this even in leaf functions (that don't make any function calls), just as an implementation detail. — Peter Cordes, Apr 07 '18 at 14:34
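To make the last point concrete: as far as the C language is concerned, an array of unsigned char needs no alignment beyond that of its element type. The following is a minimal sketch, assuming a C11 compiler; the names grid_t, grid_size and grid_required_alignment are made up for illustration and do not appear in the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical typedef for the array type used in func1/func2. */
typedef unsigned char grid_t[10][5];

/* Checked at compile time: the payload is 10 * 5 one-byte elements. */
static_assert(sizeof(grid_t) == 50, "unexpected array size");

/* Size in bytes of the array type. */
size_t grid_size(void)
{
    return sizeof(grid_t);
}

/* Alignment the language requires for the array type,
   which is the alignment of its element type. */
size_t grid_required_alignment(void)
{
    return _Alignof(grid_t);
}
```

Any extra alignment visible in the listing is therefore a choice of the compiler or the platform ABI rather than a C requirement. If I read the System V x86-64 psABI correctly, it gives local array variables of at least 16 bytes an alignment of at least 16 bytes, which would fit a (50 bytes) but not b (10 bytes).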

ensc · Accepted Answer · 2018-04-07T13:22:52.860

The x86-64 ABI requires 16-byte stack alignment (the stack pointer must be 16-byte aligned at the point where a function is called). But the over-alignment you are seeing is caused by -O0; with -O1 or higher, the arrays are aligned more efficiently. For example:

void func2(unsigned char x)
{
    unsigned char a[10][5];

    a[0][0] = 1;
    a[9][4] = 2;

    unsigned char b[10];

    b[0] = 4;
    b[9] = 5;

    __asm__ __volatile__("" :: "r"(a), "r"(b) : "memory");
}

causes:

gcc -O1 -g x.c -c
objdump -d x.o
0000000000000010 <func2>:
  10:   c6 44 24 c0 01          movb   $0x1,-0x40(%rsp)
  15:   c6 44 24 f1 02          movb   $0x2,-0xf(%rsp)
  1a:   c6 44 24 b6 04          movb   $0x4,-0x4a(%rsp)
  1f:   c6 44 24 bf 05          movb   $0x5,-0x41(%rsp)
  24:   48 8d 44 24 c0          lea    -0x40(%rsp),%rax
  29:   48 8d 54 24 b6          lea    -0x4a(%rsp),%rdx
  2e:   c3                      retq

-Os or -O3 create yet other layouts.
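One way to compare layouts without reading disassembly is to let the program report where its locals ended up. This is only a sketch: addr_mod and show_layout are hypothetical names, the locals mirror those of func2, and the printed values depend on the compiler, its version and the optimization level, so no particular output should be expected.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Address of p modulo align (align must be non-zero). */
unsigned addr_mod(const void *p, unsigned align)
{
    assert(align != 0);
    return (unsigned)((uintptr_t)p % align);
}

/* Same locals as func2; prints where the compiler placed them. */
void show_layout(void)
{
    unsigned char a[10][5];
    unsigned char b[10];

    a[0][0] = 1;
    a[9][4] = 2;
    b[0] = 4;
    b[9] = 5;

    /* Compare the addresses as integers; subtracting pointers into
       two different arrays is not defined by the C standard. */
    intptr_t gap = (intptr_t)(void *)a - (intptr_t)(void *)b;

    printf("a at %p (mod 16 = %u)\n", (void *)a, addr_mod(a, 16));
    printf("b at %p (mod 16 = %u)\n", (void *)b, addr_mod(b, 16));
    printf("a - b = %lld bytes\n", (long long)gap);
}
```

Calling show_layout() from main and rebuilding with -O0, -O1, -Os and -O3 should show how the placement changes. Taking the addresses forces the arrays into memory, much like the empty asm statement above does.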

Just a note that the ABI requires 16-byte alignment upon _calling_ a function (i.e., at the instant before the `call` is executed), which means that when actually entering the called function the stack is misaligned by 8 bytes, since `call` pushes the return address. — BeeOnRope, Apr 07 '18 at 22:44
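The arithmetic behind this can be checked against the -O0 listing in the question. Assuming an ABI-conforming caller, %rsp is a multiple of 16 just before the `call`, 8 modulo 16 on entry (after the return address is pushed), and a multiple of 16 again after `pushq %rbp`. So %rbp is 16-byte aligned, and the alignment of each local follows from its displacement alone. A minimal sketch of that calculation; the helper name residue is hypothetical:

```c
#include <assert.h>

/* Residue of an %rbp-relative displacement modulo align, in 0..align-1.
   The result equals the address modulo align only under the assumption
   that %rbp itself is a multiple of align. */
unsigned residue(long disp, long align)
{
    assert(align > 0);
    long r = disp % align;   /* negative or zero for negative disp */
    return (unsigned)(r < 0 ? r + align : r);
}
```

With the displacements from func2, a (-64) comes out at 0, b (-74) at 6 and the spilled argument x (-84) at 12. Only a is 16-byte aligned; b occupies the 10 bytes directly below it.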
