39

Take a look at these two functions:

void function1() {
    int x;
    int y;
    int z;
    int *ret;
}

void function2() {
    char buffer1[4];
    char buffer2[4];
    char buffer3[4];
    int *ret;
}

If I break at function1() in gdb, and print the addresses of the variables, I get this:

(gdb) p &x  
$1 = (int *) 0xbffff380
(gdb) p &y
$2 = (int *) 0xbffff384
(gdb) p &z
$3 = (int *) 0xbffff388
(gdb) p &ret
$4 = (int **) 0xbffff38c

If I do the same thing at function2(), I get this:

(gdb) p &buffer1
$1 = (char (*)[4]) 0xbffff388
(gdb) p &buffer2
$2 = (char (*)[4]) 0xbffff384
(gdb) p &buffer3
$3 = (char (*)[4]) 0xbffff380
(gdb) p &ret
$4 = (int **) 0xbffff38c

You'll notice that in both functions, ret is stored closest to the top of the stack. In function1(), it is followed by z, y, and finally x. In function2(), ret is followed by buffer1, then buffer2 and buffer3. Why is the storage order changed? We're using the same amount of memory in both cases (4 byte ints vs 4 byte char arrays), so it can't be an issue of padding. What reasons could there be for this reordering, and furthermore, is it possible by looking at the C code to determine ahead of time how the local variables will be ordered?

Now I'm aware that the ANSI spec for C says nothing about the order that local variables are stored in and that the compiler is allowed to chose its own order, but I would imagine that the compiler has rules as to how it takes care of this, and explanations as to why those rules were made to be as they are.

For reference I'm using GCC 4.0.1 on Mac OS 10.5.7

David
  • 1,101
  • 2
  • 9
  • 11
  • is it important? do you need the variables to be allocated in specific address? – stefanB Jul 09 '09 at 06:09
  • 14
    No, its not important, merely an academic exercise. – David Jul 09 '09 at 06:11
  • Does the level of optimisation affect the answer? Pure guess, but maybe with no/low optimisation, ints are candidates for register allocation but char[4] isn't, and since they're processed differently, the two mechanisms just so happen to put them on the stack in different orders. Even if optimization makes no difference, it's plausible that something else in the way automatics are handled means that ints always go down one route, and arrays always down another. – Steve Jessop Jul 09 '09 at 09:30

10 Answers10

28

I've no idea why GCC organizes its stack the way it does (though I guess you could crack open its source or this paper and find out), but I can tell you how to guarantee the order of specific stack variables if for some reason you need to. Simply put them in a struct:

void function1() {
    struct {
        int x;
        int y;
        int z;
        int *ret;
    } locals;
}

If my memory serves me correctly, spec guarantees that &ret > &z > &y > &x. I left my K&R at work so I can't quote chapter and verse though.

Emil Vikström
  • 90,431
  • 16
  • 141
  • 175
Crashworks
  • 40,496
  • 12
  • 101
  • 170
17

Not only does ISO C say nothing about the ordering of local variables on the stack, it doesn't even guarantee that a stack even exists. The standard just talks about the scope and lifetime of variables inside a block.

sigjuice
  • 28,661
  • 12
  • 68
  • 93
  • 2
    Interesting enough that C11 does not even mention the word "stack". Is there any stackless computer? – pmor May 11 '22 at 15:14
  • @pmor As of today, I'd expect some micro-controllers to have no stack, and thus support no recursion. GPUs are an example too, though they aren't programmed in C. It used to be more common in the early days. – Yakov Galka May 12 '23 at 17:13
12

So, I did some more experimenting and here's what I found. It seems to be based on whether or not each variable is an array. Given this input:

void f5() {
        int w;
        int x[1];
        int *ret;
        int y;
        int z[1];
}

I end up with this in gdb:

(gdb) p &w
$1 = (int *) 0xbffff4c4
(gdb) p &x
$2 = (int (*)[1]) 0xbffff4c0
(gdb) p &ret 
$3 = (int **) 0xbffff4c8
(gdb) p &y
$4 = (int *) 0xbffff4cc
(gdb) p &z
$5 = (int (*)[1]) 0xbffff4bc

In this case, ints and pointers are dealt with first, last declared on the top of the stack and first declared closer to the bottom. Then arrays are handled, in the opposite direction, the earlier the declaration, the highest up on the stack. I'm sure there's a good reason for this. I wonder what it is.

David
  • 1,101
  • 2
  • 9
  • 11
6

Usually it has to do with alignment issues.

Most processors are slower at fetching data that isn't processor-word aligned. They have to grab it in pieces and splice it together.

Probably what's happening is it's putting all of the objects which are bigger than or equal to the processor optimal alignment together, and then packing more tightly the things which may not be aligned. It just so happens that in your example all of your char arrays are 4 bytes, but I bet if you make them 3 bytes, they'll still end up in the same places.

But if you had four one-byte arrays, they may end up in one 4-byte range, or aligned in four separate ones.

It's all about what's easiest (translates to "fastest") for the processor to grab.

lavinio
  • 23,931
  • 5
  • 55
  • 71
  • 1
    Well, here GCC is aligning the stack at 16 bytes by default. Also, even if we were dealing with a 4 byte alignment, the arrays and the integers are the same size (4 bytes a piece), so I don't know why you'd get reordering. – David Jul 09 '09 at 06:14
2

It could also be a security issue?

int main()
{
    int array[10];
    int i;
    for (i = 0; i <= 10; ++i)
    {
        array[i] = 0;
    }
}

If array is lower on the stack than i, this code will loop infinitely (because it mistakenly accesses and zeroes array[10], which is i). By placing array higher on the stack, attempts to access memory beyond the end of the stack will be more likely to touch unallocated memory, and crash, rather than causing undefined behavior.

I experimented with this same code one time with gcc, and was not able to make it fail except with a particular combination of flags that I do not remember now.. In any case, it placed array several bytes away from i.

Nick Lewis
  • 4,150
  • 1
  • 21
  • 22
  • Not likely. There are guard pages for stack overflow and underflow, but nothing between stack frames. – lavinio Jul 09 '09 at 06:13
  • 1
    The security issue here is incorrect code. Yes, it results in a infinite loop with one particular compiler/flags combo. But to me, that's a cold comfort. – Matthew Flaschen Jul 09 '09 at 06:23
  • It is interesting that since arrays are defined at the high end of the stack, you cannot overflow an array to overwrite other non-array variables. – Dog eat cat world May 23 '18 at 20:28
2

The C standard does not dictate any layout for the other automatic variables. It specifically says, however, for the avoidance of doubt, that

[...] The layout of the storage for [function] parameters is unspecified. (C11 6.9.1p9)

It can be understood from that that he layout of storage for any other objects is likewise unspecified, except for the for the few requirements that given by the standard, including that the null pointer cannot point to any valid object or function, and layouts within aggregate objects.

The C standard does not contain a single mention to word "stack"; it is quite possible to do for example a C implementation that is stackless, allocating each activation record from the heap (though these could then perhaps be understood to form a stack).

One of the reasons to give the compiler some leeway is efficiency. However, the current compilers would also use this for security, using tricks such as address-space layout randomization and stack canaries to try to make the exploitation of undefined behaviour more difficult. The reordering of the buffers is done to make the use of canary more effective.

2

I believe it is a security issue, or at the very least a side effect from the measures taken to protect the stack. I was playing around with the example from https://ctf101.org/binary-exploitation/buffer-overflow/ which has the following code:

#include <stdio.h>

int main() {
    int secret = 0xdeadbeef;
    char name[100] = {0};
    read(0, name, 0x100);
    if (secret == 0x1337) {
        puts("Wow! Here's a secret.");
    } else {
        puts("I guess you're not cool enough to see my secret");
    }
}

when I compiled it with defaults, and even with -O0 , secret was being placed 4 bytes before the start of name, making the easy exploit not possible. However, when I added -fno-stack-protector it moved secret to 108 bytes after the start of name, and it became possible to modify the value of secret by placing the desired sequence of bytes at offset 108 of the input.

Sasha Pachev
  • 5,162
  • 3
  • 20
  • 20
  • `-O0` *is* the default. If you enabled any optimization, constant-propagation would make `secret == 0x1337` a constant false and the compiler would optimize away the `int secret` variable entirely (because its address doesn't escape the function), let alone storing / reloading it to/from the stack. Anyway, interesting, yes it would make sense to put arrays above other locals when stack-protector is enabled. Without stack-protector it could go either way; an off-by-one might allow return address overwrite instead of "just" a local. Or a limited overflow might only affect locals not ret addr – Peter Cordes Mar 11 '20 at 01:36
0

My guess is that this has something to do with how the data are loaded into registers. Perhaps, with char arrays, the compiler works some magic to do things in parallel and this has something to do with the position in memory to easily load the data into registers. Try compiling with different levels of optimization, and try using int buffer1[1] instead.

rlbond
  • 65,341
  • 56
  • 178
  • 228
0

Interestingly if you add an extra int *ret2 in function1 then on my system the order is correct whereas its out of order for just 3 local variables. My guess is it's ordered that way due to reflect the register allocation strategy that will be used. Either that or it's arbitrary.

Dean Povey
  • 9,256
  • 1
  • 41
  • 52
0

It's completely up to the compiler. Beyond this, certain procedure variables might never be placed on the stack at all, as they can spend their whole lives within a register.

Alex Gartrell
  • 2,514
  • 5
  • 22
  • 23