How does the compiler allocate memory to this struct?

Question

I was trying out namespaces and structs and ran into something I don't understand.

C++

#include<iostream>
using namespace std;

namespace One
{
    struct Data
    {
        int val;
        char character;
    };
}

namespace Two
{
    struct Data
    {
        int val;
        bool boolean;
    };
}

void functionOne(void)
{
    using namespace One;
    cout << "functionOne()" << endl;
    cout << "The size of struct Data : ";
    cout << sizeof(Data) << endl;
}

void functionTwo(void)
{
    using namespace Two;
    cout << "functionTwo()" << endl;
    cout << "The size of struct Data : ";
    cout << sizeof(Data) << endl;
}

int main()
{
    functionOne();
    functionTwo();    
} 

Output
functionOne()
The size of struct Data : 8
functionTwo()
The size of struct Data : 8

But when I change the code for 'namespace Two' to the following:

namespace Two
{
    struct Data
    {
        char val;
        bool boolean;
    };
}

Output :

functionOne()
The size of struct Data : 8
functionTwo()
The size of struct Data : 2

I am not able to figure out how the compiler allocates memory to the struct. Thanks in advance.

@CarlNorum The answer in the link says something about 32-bit architecture. Can I know how the alignment is (generally) done on 64-bit architectures? — Sumit Gera, Jul 21 '13 at 16:23

Borgleader · Accepted Answer · 2013-07-21T16:31:39.857

The issue here is most likely due to alignment requirements. If I'm not mistaken, a struct is aligned based on the greatest alignment requirement of its members. In the first version of your struct you have int; char;. It seems that on your machine int is aligned to 4 bytes, so the compiler pads the struct with an extra 3 bytes after the char. In the second version you only have char; bool;, which are both 1 byte in size and aligned to 1 byte (on your machine), so the compiler doesn't need to pad anything and the size drops to 2.

I specified "on your machine" because this can vary based on several factors.

Let's make a pretty graph!

// One::Data (version 1)
0              4              5                8
[int (size 4), char (size 1), padding (size 3)][...]
// Because of alignment restrictions on int, this needs a padding of 3 bytes

// Two::Data (version 1)
0              4              5                8
[int (size 4), bool (size 1), padding (size 3)][...]
// Because of alignment restrictions on int, this needs a padding of 3 bytes

// One::Data (version 2), no change

// Two::Data (version 2)
0               1             2
[char (size 1), bool (size 1)][...]
// No alignment restrictions, therefore no padding is required

For the changed version of the second namespace, why does it allocate 8 bytes? Are 3 bytes used as padding? — Sumit Gera, Jul 21 '13 at 16:19
@mozart I'm looking at your question and the changed version is of size 2? — Borgleader, Jul 21 '13 at 16:23

James Kanze · Answer 2 · 2013-07-21T16:34:36.847

The official answer as to how the compiler allocates memory is "however it wants to". There are a few restrictions, but not many. In this case, however, what you're seeing is quite logical: many types have (or may have) alignment restrictions, and must be placed at an address which is a multiple of some value. These restrictions propagate up to any class which contains members of the type, since otherwise you couldn't respect the alignment of the class member.

Apparently, on your machine, bool has a size of 1 (and char must have a size of 1), and int has a size of 4 and must also be aligned on an address that is a multiple of 4. So in One::Data and Two::Data, you have an int, followed by a char or a bool, followed by enough bytes of padding to make the total size of the structure a multiple of 4. (In principle, the char/bool and the padding can be mixed in any order, but in practice, every compiler I've seen puts the padding after the declared members.)

Since neither a bool nor a char has any alignment restrictions, there is no need for padding in a class which contains only one of each.

Note that this depends on both the machine and the compiler. On some machines (e.g. Sun Sparc or IBM mainframe), accessing a misaligned value will cause a hardware trap, and the compiler is almost required to align (and insert padding). On Intel, on the other hand, a misaligned access will work, but with a noticeable performance hit; compilers generally force the alignment here (and both the Windows and Linux binary interfaces (ABIs) require it), but a compiler could conceivably ignore it, and some very early Intel compilers did, back when memory was much tighter than it is now. (It's actually an interesting question as to which gives the best performance on a modern machine. If you have a large array of one of your structures, the extra memory accesses due to misalignment will probably be resolved from the cache, or even from the memory read pipeline, at little extra cost, whereas the smaller size of the object could lead to fewer cache misses, and thus better performance. But I haven't measured it, so I'm just guessing.)

Another point to note is that the standard requires that class members be allocated in order. Technically, only if there's no access specifier between them, but in practice, all compilers always allocate them in order. So if you have a class like:

struct T
{
    double d1;
    char c1;
    double d2;
    char c2;
};

it will (typically) have a size of 32, whereas:

struct T
{
    double d1;
    double d2;
    char c1;
    char c2;
};

will only have a size of 24. Back in the days when memory was tight, we regularly paid attention to such things, but now that locality is sometimes an issue, maybe it would pay to do so again: declaring variables in the order of their sizes, with the biggest ones first.
