-2

I am trying to verify that, for a given function, local variables are allocated contiguously on the stack. I wrote the code below and got the output shown after it.

For the int variables the addresses come out as expected, but not for the character array. After address 0xbff1599c I was expecting the next address to be 0xbff159a0, not 0xbff159a3. Also, since a char is 1 byte and I am using 4 of them, after 0xbff159a3 I was expecting 0xbff159a7, not 0xbff159a8.

All addresses come out as expected if I remove the char part, but I cannot get the expected addresses with the character array.

My base assumption is that memory on the stack is always contiguous. I hope that is not wrong.

#include <stdio.h>

int main(void)
{
    int x = 10;
    printf("Value of x is %d\n", x);
    printf("Address of x is %p\n", (void *)&x);  /* %p expects a void * argument */
    printf("Dereferencing address of x gives %d\n", *(&x));
    printf("\n");

    int y = 20;
    printf("Value of y is %d\n", y);
    printf("Address of y is %p\n", (void *)&y);
    printf("Dereferencing address of y gives %d\n", *(&y));
    printf("\n");

    char str[] = "abcd";
    printf("Value of str is %s\n", str);
    printf("Address of str is %p\n", (void *)&str);
    printf("Dereferencing address of str gives %s\n", *(&str));
    printf("\n");

    int z = 30;
    printf("Value of z is %d\n", z);
    printf("Address of z is %p\n", (void *)&z);
    printf("Dereferencing address of z gives %d\n", *(&z));
}

Output:

Value of x is 10
Address of x is 0xbff159ac
Dereferencing address of x gives 10

Value of y is 20
Address of y is 0xbff159a8
Dereferencing address of y gives 20

Value of str is abcd
Address of str is 0xbff159a3
Dereferencing address of str gives abcd

Value of z is 30
Address of z is 0xbff1599c
Dereferencing address of z gives 30
hagrawal7777
  • 1
    Please show us the code you generated the output with as that output couldn't have been generated by the code in your question. – fuz Jan 31 '16 at 14:39
  • @KerrekSB My bad buddy, thanks for pointing it out. Question updated. – hagrawal7777 Jan 31 '16 at 14:39

3 Answers

2

Also, since char is 1 byte and I am using 4 bytes, so after 0xbff159a3 I was expecting 0xbff159a7 and not 0xbff159a8

A char takes up 1 byte, but str is a string, and you did not count the '\0' at the end of it. So char str[] = "abcd" takes up 5 bytes.

ameyCU
  • Makes sense, but it still doesn't explain "I was expecting next address to be 0xbff159a0 and not 0xbff159a3". – hagrawal7777 Jan 31 '16 at 14:41
  • 1
    @hagrawal There is no guarantee whatsoever how the stack frame is laid out. It seems like your compiler chose to add the required three bytes of padding before `str`, not after. – fuz Jan 31 '16 at 14:49
  • Oh, ok. So does that mean memory allocation on the stack is not contiguous? Why is padding required at all, before or after? – hagrawal7777 Jan 31 '16 at 14:52
  • 1
    @hagrawal Objects of the same type may be allocated contiguously because they share the same alignment requirement. Accessing objects that are not properly aligned can cause an error (_may be platform dependent_), which is why padding is required. – ameyCU Jan 31 '16 at 15:12
1

I think this could be because the addresses are aligned to boundaries (e.g. an 8-byte boundary).

On some platforms, allocations are always aligned to boundaries and made in chunks. You can check this with a structure. For example, struct A { char a; char b; int c; };

The size of this struct will typically not be 6 bytes on a UNIX/Linux platform.

But it might vary from OS to OS.

A similar thing applies to other data types. Moreover, a string variable just points to an address on the heap if malloc is used, and the allocation logic might vary from OS to OS. The following is the output of the same program on a Linux box.

Value of x is 10
Address of x is 0x7ffffa43a50c
Dereferencing address of x gives 10

Value of y is 20
Address of y is 0x7ffffa43a508
Dereferencing address of y gives 20

Value of str is abcd
Address of str is 0x7ffffa43a500
Dereferencing address of str gives abcd

Value of z is 30
Address of z is 0x7ffffa43a4fc
Dereferencing address of z gives 30

Umamahesh P
1

Both answers, from @ameyCU and @Umamahesh, were good, but neither was self-sufficient, so I am writing my own answer with more information so that future visitors get the full picture.

I got that result because of a concept called data structure alignment. Because of it, the compiler generally tries to place objects in memory (whether in the heap, stack, or data segment; in my case it was the stack) at addresses that the CPU can read and write quickly.

When a modern computer reads from or writes to a memory address, it will do this in word sized chunks (e.g. 4 byte chunks on a 32-bit system) or larger. Data alignment means putting the data at a memory address equal to some multiple of the word size, which increases the system's performance due to the way the CPU handles memory.

On a 32-bit architecture the word size is 4 bytes, so the compiler will typically place such objects at addresses that are multiples of 4, so that they can be read and written in 4-byte blocks. When an object needs fewer bytes, the compiler may pad it with some unused bytes, either before or after it.

In my case, suppose I use char str[] = "abc"; then, including the terminating null character '\0', I need 4 bytes, so there is no padding. But with char str[] = "abcd"; I need 5 bytes including the '\0'. My compiler allocated in blocks of 4, so it added 3 bytes of padding (either before or after), and the char array ended up spanning 8 bytes in memory.

Since the sizes of int and long are already multiples of 4 here, there is no issue; it gets tricky with char or short, whose sizes are not multiples of 4. This explains what I reported: all addresses come out as expected when I remove the char part, but not with the character array.

As a rule of thumb on such a system, if an object's size is not a multiple of 4 (for example, a single short, or a char array of size 2), padding is usually added around it so that the neighbouring data can be read and written quickly. The C standard does not guarantee any particular stack layout, so the exact placement is up to the compiler.


Below is a nice excerpt from this answer, which explains data structure alignment.

Suppose that you have the structure.

struct S {
    short a;
    int b;
    char c, d;
};

Without alignment, it would be laid out in memory like this (assuming a 32-bit architecture):

 0 1 2 3 4 5 6 7
|a|a|b|b|b|b|c|d|  bytes
|       |       |  words

The problem is that on some CPU architectures, the instruction to load a 4-byte integer from memory only works on word boundaries. So your program would have to fetch each half of b with separate instructions.

But if the memory was laid out as:

 0 1 2 3 4 5 6 7 8 9 A B
|a|a| | |b|b|b|b|c|d| | |
|       |       |       |

Then access to b becomes straightforward. (The disadvantage is that more memory is required, because of the padding bytes.)

Community
hagrawal7777