I am trying to understand how local variables are allocated memory in C. Based on this, the array will be created on the stack. And I thought stack addressing starts from a higher address and then goes to a lower address. So say I had this:

int a; 
int arr[3];

Say a was at address 100. Then arr would be at address 96 (100 - 4), with the address for arr[3] at 88 (96 - 2 * 4), since int will take 4 bytes.

But in reality, I see something very different happening. If I make an arr of size 1 then it works as expected. But if I increase the array size then the addresses look very different.

It seems like an array of size > 1 does not go on the stack but somewhere else (heap?). Can someone explain to me the gap in addresses between a and arr for size > 1?

Size 1 arr

int main(int argc, char* argv[]) {

int a;
int arr[1];
printf("Address a: %p (%lu)\n", &a, (unsigned long)&a);
printf("Address arr[0]: %p (%lu)\n", arr, (unsigned long)arr);
printf("Address arr[1]: %p (%lu)\n", &arr[1], (unsigned long)&arr[1]);
}

Address a: 0x7ff7b1fd60cc (140701819822284)
Address arr[0]: 0x7ff7b1fd60c8 (140701819822280)
Address arr[1]: 0x7ff7b1fd60cc (140701819822284)

Size 2 arr

int main(int argc, char* argv[]) {
int a;
int arr[2];
printf("Address a: %p (%lu)\n", &a, (unsigned long)&a);
printf("Address arr[0]: %p (%lu)\n", arr, (unsigned long)arr);
printf("Address arr[1]: %p (%lu)\n", &arr[1], (unsigned long)&arr[1]);
}

Address a: 0x7ff7b3d970bc (140701851021500)
Address arr[0]: 0x7ff7b3d970d0 (140701851021520)
Address arr[1]: 0x7ff7b3d970d4 (140701851021524)
Parth
  • Addresses of array and struct elements always go "up" so even if `arr` is "lower" than `a`, `arr[1]` can not be lower than `arr[0]`. – aragaer Mar 19 '23 at 15:58
  • duplicates: [Order of parameter and variables on stack in C](https://stackoverflow.com/q/66050493/995714), [Is the order of memory addresses of successively declared variables always descending?](https://stackoverflow.com/q/12438216/995714), [Order of memory allocation in C](https://stackoverflow.com/q/58148075/995714), [How to explain the order of memory address assignment?](https://stackoverflow.com/q/20030830/995714), [Why variables declared in different order in the C language remain an unchanged order in the stack? Is it type-related?](https://stackoverflow.com/q/69056481/995714) – phuclv Mar 19 '23 at 15:58
  • Does this answer your question? [Is the order of memory addresses of successively declared variables always descending?](https://stackoverflow.com/questions/12438216/is-the-order-of-memory-addresses-of-successively-declared-variables-always-desce) – phuclv Mar 19 '23 at 15:59
  • Money quote from that: *From the viewpoint of the C language specification the order of memory locations of subsequently allocated variables is unspecified. Therefore, it depends ...* – Marcus Müller Mar 19 '23 at 15:59
  • Let me take a look at those questions. But I have a follow-up question: why is the delta between the start of the array and a the same across compilations? `printf("Address arr[0] - a: %p (%lu)\n", arr-&a, (unsigned long)(arr-&a));` – Parth Mar 19 '23 at 16:02
  • There is a 20 byte gap - what makes you think that is not still in the stack? – Clifford Mar 19 '23 at 16:04
  • Unrelated to your problem, but please don't use a table for the formatting here. It doesn't really work, and makes the question much harder to write. Just format code and output as normal code-snippets with triple backticks. – Some programmer dude Mar 19 '23 at 16:04
  • Related to the code, technically the `%p` format specifier expects a `void *` pointer. Mismatching format specifier and argument type leads to undefined behavior. You really should cast the pointers to `void *`. Also note that there's no guarantee that `unsigned long` is big enough to hold a pointer. For example using MSVC the `long` type is still 32 bits, even for 64-bit targets. – Some programmer dude Mar 19 '23 at 16:06
  • @Clifford *why the gap though*? Should the array not occupy the next memory address? When the array is of size 1, this seems to be happening (presumably because the size 1 array is just a normal int). But when the size increases, I get this gap, and why is that? – Parth Mar 19 '23 at 16:10
  • @Clifford also it seems like when the size is > 1, the array is assigned a higher memory address vs when the size is 1. Which again seems weird. – Parth Mar 19 '23 at 16:11
  • @aragaer: Re “Addresses of array and struct elements always go "up"”: This is not required by the C standard. And designing array addresses to increase in the opposite order of stack growth could mitigate buffer overflow exploits. – Eric Postpischil Mar 19 '23 at 16:19
  • @Parth why is that weird? Second-guessing the undefined behaviour of a compiler is a fool's game. With respect to the gap, one possible reason is that given in my answer. You will likely get different results in an optimised or release build. Also there will in any case be gaps required for address alignment, which is implementation-defined. – Clifford Mar 19 '23 at 16:45

3 Answers

And I thought the stack addressing starts from a higher address and then goes to a lower address.

Most current ABIs¹ do specify that stacks grow to lower addresses, so if routine A calls routine B, routine B’s stack frame will be at a lower address.

However, a compiler planning its layout of local variables generally does not have to put them in any order in the stack frame. Within one stack frame, the compiler may arrange data freely.

With optimization, local variables might not be stored on the stack at all; the compiler might keep their values in registers or eliminate them entirely.

If a is at address 100, the compiler is putting arr just below a, int is four bytes, the address space is “flat” (a simple numbering of bytes), and array elements are laid out at increasing addresses for increasing indices, then arr will start at address 88, because arr is three elements of four bytes, so it needs 12 bytes, so it will start at 100 − 12 = 88. arr[0] will start at 88, arr[1] will start at 92, and arr[2] will start at 96.

Note that although the stack grows to lower addresses, that does not mean we put arr[0] on the stack first, then arr[1], then arr[2]. The typical plan is to put the entire array on the stack as one object. Within the array, its elements are laid out with the address order matching the index order: arr[0] is first in memory (at the lowest address), arr[1] is next (the next higher-addressed element), and arr[2] is after that.

Footnote

¹ Application Binary Interface, a specification of how compiled code (binary/executable code) interacts with other compiled code.

Eric Postpischil
  • But if you look at my output for array of size 2, the address for array is higher than address of a. So it seems like the stack is growing towards a higher memory address. Also, there is a gap between a and arr[0] that is not equal to 4 bytes. – Parth Mar 19 '23 at 16:16
  • For array of size 2: `Address a: 0x7ff7b3d970bc (140701851021500), Address arr[0]: 0x7ff7b3d970d0 (140701851021520)` – Parth Mar 19 '23 at 16:16
  • @Parth: As this answer says, “a compiler planning its layout of local variables generally does not have to put them in any order in the stack frame.” – Eric Postpischil Mar 19 '23 at 16:17
  • Even if that is true, I have seen it very consistently place the array at a higher memory address. Why does that happen? – Parth Mar 19 '23 at 16:28
  • @Parth because it's just how your compiler is written; no specific reason. It's not guaranteed, no matter what you have seen, and I assure you, I very frequently see my compiler laying out the stack in whatever way is most efficient, not in the order I declared variables. It's a very simple, common and uncomplicated optimization to use an efficient layout. – Marcus Müller Mar 19 '23 at 17:06
0
int a; 
int arr[3];

Say a was at address 100. Then arr would be at address 96 (100 - 4), with the address for arr[3] at 88 (96 - 2 * 4), since int will take 4 bytes.

No. It's as simple as that: C doesn't state anything about memory layout when declaring variables (that's different for fields declared within structs).

If you weren't doing an operation on both a and arr that actually required them to be placed in memory (by taking their address and doing something with it, you force them to have an address!), there would be no reason for them to have a proper memory address at all.

Marcus Müller
  • @Parth Also note that the size of `int` may be less than or greater than 4 bytes. And your code invokes undefined behaviour because the `%p` format specifier expects a `void *`, not an `unsigned long`. – Harith Mar 19 '23 at 15:57
  • @Haris I am printing both the pointer and its conversion to int. Also, I was asking wrt my system where I confirmed that the size of int is 4. – Parth Mar 19 '23 at 15:59
  • @Parth still undefined behaviour. The *language definition* doesn't care what *your* computer looks like. – Marcus Müller Mar 19 '23 at 16:00
  • @Marcus, I noticed that `printf("Address arr[0] - a: %p (%lu)\n", arr-&a, (unsigned long)(arr-&a)); `, will always print the same value across compilations. Even though the addresses change, the delta remains the same. Why is that? – Parth Mar 19 '23 at 16:00
  • @Parth Indeed, I didn't see that. But the arguments to the `%p` are still not `void` pointers, so the behaviour remains undefined. – Harith Mar 19 '23 at 16:02
  • @Parth it's how your compiler works. It's not inherent to the code you write, and could be different every execution, different between different compilers, or compiler versions, different between different machines or depend on the phase of the moon. You need to make a mental distinction between the *behaviour you observe* and the *way things are defined*. The only thing that you can expect are the things that are actually defined. – Marcus Müller Mar 19 '23 at 16:06
0

Whilst it is true that in most architectures the stack grows from high to low memory addresses, and that most C implementations use stack allocation for local variables, it does not follow that later-declared variables will be at lower addresses than earlier ones. The compiler need not allocate variables in the order declared, let alone in the same order as stack growth.

The stack pointer is typically decremented for the function's entire stack frame on entry to the function. That is, the stack allocation for all of a function's non-static, non-register-allocated variables and return address occurs at once. The compiler is then free to order and align those variables as it sees fit within that allocated stack frame.

Moreover, in a debug build a compiler may often add padding between variables as a means of overrun detection in the debugger.

So to demonstrate the stack growth behaviour the following would be more instructive:

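#include <stdio.h>

void frame(void)
{
    static int call_depth = 0 ;
    int a;
    int arr[2];

    call_depth++ ;
    printf("Call stack depth %d: ", call_depth ) ;

    /* %p expects a void *, so each pointer is cast explicitly. */
    printf("Address       a:    %p\n", (void *)&a);
    printf("Address  arr[0]:    %p\n", (void *)arr);
    printf("Address  arr[1]:    %p\n\n", (void *)&arr[1]);
    
    /* Recurse so that each call gets its own stack frame. */
    if( call_depth < 5 )
    { 
        frame() ;
    }
}

int main(void)
{
    frame() ;
}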
Clifford