Understanding stack allocation and alignment

Question

I'm trying to understand how stack alignment works, as described in "What is 'stack alignment'?", but I'm having trouble getting a small example to demonstrate that behaviour. I'm examining the stack allocation of my function foo:

void foo() {
    int a = 0;
    char b[16];
    b[0] = 'a';
}

I compiled the source file with gcc -ggdb example.c -o example.out (i.e. without any compiler flags beyond -ggdb), and the assembler dump from gdb reads:

(gdb) disassemble foo
Dump of assembler code for function foo:
0x08048394 <+0>:    push   %ebp
0x08048395 <+1>:    mov    %esp,%ebp
0x08048397 <+3>:    sub    $0x20,%esp
0x0804839a <+6>:    movl   $0x0,-0x4(%ebp)
0x080483a1 <+13>:   movb   $0x61,-0x14(%ebp)
0x080483a5 <+17>:   leave  
0x080483a6 <+18>:   ret    
End of assembler dump.

My stack is allocated in chunks of 16 bytes (I verified this with several other tests). According to the assembler dump, 32 bytes have been allocated here because (16 < 4+16 < 32). However, I expected the integer 'a' to be placed in the first 16 bytes and the character array in the next 16 bytes (leaving a gap of 12 bytes in between). Instead, it seems both the integer and the character array have been placed in a contiguous chunk of 20 bytes, which is inefficient according to the discussion I referred to above. Can someone please explain what I'm missing here?

EDIT: I came to the conclusion that my stack is allocated in chunks of 16 bytes with a program like below:

void foo() {
    char a[1];
}

And the corresponding assembler dump:

(gdb) disassemble foo
Dump of assembler code for function foo:
0x08048394 <+0>:    push   %ebp
0x08048395 <+1>:    mov    %esp,%ebp
0x08048397 <+3>:    sub    $0x10,%esp
0x0804839a <+6>:    leave  
0x0804839b <+7>:    ret    
End of assembler dump.

You can see that 16 bytes have been allocated on the stack for a character array of size 1 (only 1 byte needed). I can increase the size of the array up to 16 and the assembler dump stays the same, but when the size is 17, 32 bytes are allocated on the stack. I have run many such samples and the result is the same: stack memory is allocated in chunks of 16 bytes. A similar topic has been discussed in "Stack allocation, padding, and alignment", but what I'm more keen on finding out is why alignment has no effect in my example.
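
The rounding observed in these dumps can be modelled with a few lines of C. This is only a sketch, assuming the frame size is simply the total size of the locals rounded up to the next multiple of 16; `round_up` is a hypothetical helper, and the frame size GCC really picks can also depend on saved registers, outgoing arguments, and the optimization level.

```c
#include <stddef.h>

/* Hypothetical helper: round `size` up to the next multiple of `chunk`.
 * `chunk` must be a power of two (16 in the examples above). */
static size_t round_up(size_t size, size_t chunk)
{
    return (size + chunk - 1) & ~(chunk - 1);
}
```

Under this model, round_up(1, 16) and round_up(16, 16) both give 16, while round_up(17, 16) and round_up(20, 16) give 32, which matches the sub $0x10 and sub $0x20 instructions in the two dumps.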

Where did you get the idea that all stack variables must individually align to 16-byte boundaries? — Oliver Charlesworth, Jan 27 '11 at 14:26
Use a local variable of type double. The int and the char[] already align just fine. — Hans Passant, Jan 27 '11 at 15:38
@Oli: Since stack is being allocated in chunks of 16 bytes, I thought alignment would follow the same (as described in http://stackoverflow.com/questions/672461/what-is-stack-alignment). I even tried using -mpreferred-stack-boundary=4 to see if I can control stack alignment as described in http://stackoverflow.com/questions/1061818/stack-allocation-padding-and-alignment but that didn't have any effect on alignment either. I want to figure out how alignment actually works and how I can control it. — Asiri Rathnayake, Jan 27 '11 at 23:35
@Hans: Tried it with several combinations (double and char[], double and int[] etc.) but it just seems that alignment has no effect. Memory is allocated in chunks of 16 bytes but within that allocated memory all the variables are lined up (one after the other) in a contiguous block. — Asiri Rathnayake, Jan 27 '11 at 23:42
@Asiri: If you were to declare `int a; char b; int c;` on the stack, you would find that they don't all line up in a contiguous block. — Oliver Charlesworth, Jan 27 '11 at 23:54
@Oli: Yes. I also tried `int a = 0; char b = 'a';` and that too confirmed that stack variables are aligned into 4-byte boundaries. I've mentioned this on my comment to your answer below. Thanks. — Asiri Rathnayake, Jan 28 '11 at 13:59

score 5 · Accepted Answer · answered Jan 27 '11 at 14:28

I think you're missing the fact that there is no requirement for all stack variables to be individually aligned to 16-byte boundaries.
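
One way to check this yourself is to print where the locals end up. The sketch below is illustrative only: `is_aligned` and `report_locals` are made-up names, and `_Alignof` assumes a C11 compiler. Each local only has to satisfy the alignment of its own type, so the 16-byte check may well print 0 for one or both variables.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Returns 1 if address `p` is a multiple of `alignment`, else 0. */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}

/* Hypothetical demo: prints where two locals live and returns 1 if
 * each one satisfies the alignment of its own type. */
static int report_locals(void)
{
    int a = 0;
    char b[16];
    b[0] = 'a';

    printf("a at %p (16-byte aligned: %d)\n", (void *)&a, is_aligned(&a, 16));
    printf("b at %p (16-byte aligned: %d)\n", (void *)b, is_aligned(b, 16));

    return is_aligned(&a, _Alignof(int)) && is_aligned(b, _Alignof(char));
}
```

Calling report_locals() from main always returns 1, while the printed 16-byte results vary with the compiler and flags.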

Oliver Charlesworth

This is right. 16-byte stack alignment refers to the alignment of the stack pointer value, not to the address of any particular stack-allocated variable. – caf Jan 27 '11 at 23:54
@caf: I think you are right. I tried the example suggested by Oli `int a = 0; char b = 'a'; int c = 0;` and also tried `int a = 0; char b = 'a';` and in both cases it seemed the variables are aligned to 4-byte boundaries (could it be because my machine is a 32-bit one?). Also, aligning variables to 16-byte boundaries would be a waste of space. I think I now understand the distinction between stack-pointer alignment (which is controlled by the -mpreferred-stack-boundary=n flag) and variable alignment within the stack. Thanks a lot for the help :) – Asiri Rathnayake Jan 28 '11 at 14:01
The stack must definitely be even-aligned; everything else is sugar. :) My guess is that the 16-byte alignment is useful (or required?) for 64-bit systems, and it doesn't hurt to have it on a 32-bit architecture as well. – Devolus Nov 25 '13 at 11:15

score 1 · Answer 2 · answered Dec 17 '11 at 14:12

You can check how extra memory is allocated for your data structures using a tool called pahole (http://packages.debian.org/lenny/dwarves). It shows you all the holes in your program's data: the size of your data if you sum it up, and the real size allocated on your stack.

score 0 · Answer 3 · answered Jan 27 '11 at 14:29

The usual rule is that variables are allocated on 32-bit boundaries. I'm not sure why you think 16 bytes has any special meaning.

James

score 0 · Answer 4 · answered Jan 27 '11 at 14:54

I've never heard of such a thing as a specific stack alignment. If there are alignment requirements for the CPU, alignment is applied to all kinds of data memory, no matter whether it is stored on the stack or elsewhere. Data starts on an even address, with 16, 32 or 64 bits of data following.

16 bytes may perhaps be some sort of on-chip cache memory optimization, though that seems a bit far-fetched to me.
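
The per-type alignment requirements mentioned here can be listed directly. This is a small sketch, assuming a C11 compiler for `_Alignof`; `type_info`, `basic_types` and `print_alignments` are names invented for the illustration, and the numbers printed depend on the target ABI.

```c
#include <stddef.h>
#include <stdio.h>

/* Size and alignment requirement of one type, as reported by the compiler. */
struct type_info {
    const char *name;
    size_t size;
    size_t align;
};

static const struct type_info basic_types[] = {
    { "char",   sizeof(char),   _Alignof(char)   },
    { "short",  sizeof(short),  _Alignof(short)  },
    { "int",    sizeof(int),    _Alignof(int)    },
    { "double", sizeof(double), _Alignof(double) },
    { "void *", sizeof(void *), _Alignof(void *) },
};

/* Print the table and return the largest alignment found in it. */
static size_t print_alignments(void)
{
    size_t max_align = 0;
    size_t count = sizeof(basic_types) / sizeof(basic_types[0]);

    for (size_t i = 0; i < count; i++) {
        printf("%-7s size %zu, alignment %zu\n",
               basic_types[i].name, basic_types[i].size, basic_types[i].align);
        if (basic_types[i].align > max_align)
            max_align = basic_types[i].align;
    }
    return max_align;
}
```

The same requirements apply whether an object lives on the stack, in static storage, or on the heap; only the stack pointer itself is subject to the separate 16-byte convention discussed in this thread.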

score 0 · Answer 5 · answered Jan 27 '11 at 15:51

A good way to see this is with a structure.

struct a{
    int a;
    char b;
    int c;
} a;

On a 32-bit system this would be 4+1+4 bytes if the members were taken separately.

Because the structure and its members are aligned, "char b" will effectively take up 4 bytes, bringing the total to 12 bytes.
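
The padding can be made visible with offsetof. The sketch below is only an illustration (`padding_after_b` and `print_layout` are made-up helper names); the offsets it prints depend on the target, with 0, 4, 8 and a total of 12 being what you would expect when int is 4 bytes and 4-byte aligned.

```c
#include <stddef.h>
#include <stdio.h>

struct a {
    int a;
    char b;
    int c;
};

/* Number of padding bytes the compiler inserted between b and c. */
static size_t padding_after_b(void)
{
    return offsetof(struct a, c) - (offsetof(struct a, b) + sizeof(char));
}

/* Print the offset of each member and return the total structure size. */
static size_t print_layout(void)
{
    printf("a at offset %zu\n", offsetof(struct a, a));
    printf("b at offset %zu\n", offsetof(struct a, b));
    printf("c at offset %zu\n", offsetof(struct a, c));
    printf("padding after b: %zu, sizeof(struct a): %zu\n",
           padding_after_b(), sizeof(struct a));
    return sizeof(struct a);
}
```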

struct b{
    int a;
    char b;
    int c;
} __attribute__((__packed__)) b;

Using the packed attribute you can force it to keep its minimum size. Thus this structure is 9 bytes.
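
A quick way to compare the two layouts is to print both sizes side by side. This sketch assumes GCC or Clang, since `__attribute__((__packed__))` is a compiler extension; `bytes_saved_by_packing` is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdio.h>

struct a {
    int a;
    char b;
    int c;
};

struct b {
    int a;
    char b;
    int c;
} __attribute__((__packed__));

/* Print both sizes and return how many bytes packing removed. */
static size_t bytes_saved_by_packing(void)
{
    printf("sizeof(struct a) = %zu\n", sizeof(struct a));
    printf("sizeof(struct b) = %zu (c at offset %zu)\n",
           sizeof(struct b), offsetof(struct b, c));
    return sizeof(struct a) - sizeof(struct b);
}
```

The saving comes at a cost: in the packed structure c no longer sits on a multiple of its natural alignment, so accessing it may be slower, and on some architectures misaligned accesses through a plain int pointer can fault.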

You can also check http://sig9.com/articles/gcc-packed-structures

Hope it helps.
