11

Summary: How does the compiler statically determine the size of a C++ class during compilation?

Details:

I'm trying to understand what the rules are for determining how much memory a class will use, and also how the memory will be aligned.

For example, the following code declares four classes. The first two are 16 bytes each. The third is 48 bytes, even though it contains only the data members of the first two combined. The fourth has the same data members as the third, just in a different order, but it is 32 bytes.

#include <emmintrin.h>  // SSE2 header that declares __m128i
#include <stdio.h>

class TestClass1 {
  __m128i vect;
};

class TestClass2 {
  char buf[8];
  char buf2[8];
};

class TestClass3 {
  char buf[8];
  __m128i vect;
  char buf2[8];
};

class TestClass4 {
  char buf[8];
  char buf2[8];
  __m128i vect;
};


TestClass1 *ptr1;
TestClass2 *ptr2;
TestClass3 *ptr3;
TestClass4 *ptr4;
int main() {
  ptr1 = new TestClass1();
  ptr2 = new TestClass2();
  ptr3 = new TestClass3();
  ptr4 = new TestClass4();
  printf("sizeof TestClass1 is: %zu\t TestClass2 is: %zu\t TestClass3 is: %zu\t TestClass4 is: %zu\n", sizeof(*ptr1), sizeof(*ptr2), sizeof(*ptr3), sizeof(*ptr4));  // %zu is the format for size_t
  return 0;
}

I know that the answer has something to do with the alignment of the data members of the class. But I am trying to understand exactly what the rules are and how they are applied during compilation. I have a class with a __m128i data member that ends up not being 16-byte aligned, and this results in a segfault when the compiler generates code that uses movaps to access the data.

Gabriel Southern
  • The rules are completely implementation-defined, you should check the documentation of your compiler. Also, almost any compiler allows you to specify the required alignment for a class member using some `#pragma` or attribute. – Matteo Italia Jan 24 '13 at 21:05
  • "the data member is not aligned": Then you did something unpleasant with pointer/reference casts, or custom allocation, or similar. – aschepler Jan 24 '13 at 21:06

4 Answers

15

For POD (plain old data), the rules are typically:

  • Each member in the structure has some size s and some alignment requirement a.
  • The compiler starts with a size S set to zero and an alignment requirement A set to one (byte).
  • The compiler processes each member in the structure in order:
  1. Consider the member’s alignment requirement a. If S is not currently a multiple of a, then add just enough bytes to S so that it is a multiple of a. This determines where the member will go; it will go at offset S from the beginning of the structure (for the current value of S).
  2. Set A to the least common multiple¹ of A and a.
  3. Add s to S, to set aside space for the member.
  • When the above process is done for each member, consider the structure’s alignment requirement A. If S is not currently a multiple of A, then add just enough to S so that it is a multiple of A.

The size of the structure is the value of S when the above is done.
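
The steps above can be sketched as a small function. This is only an illustration of the typical rules, not what any particular compiler runs internally; the names Member, Layout, round_up and compute_layout are made up for this example, and alignments are assumed to be powers of two.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one member: its size s and alignment a.
struct Member {
    std::size_t size;
    std::size_t align;  // assumed to be a power of two
};

// Result of the layout pass: final S, final A, and each member's offset.
struct Layout {
    std::size_t size;
    std::size_t align;
    std::vector<std::size_t> offsets;
};

// Round n up to the next multiple of a.
inline std::size_t round_up(std::size_t n, std::size_t a) {
    assert(a != 0 && (a & (a - 1)) == 0);  // power-of-two assumption
    return (n + a - 1) / a * a;
}

inline Layout compute_layout(const std::vector<Member>& members) {
    Layout out{0, 1, {}};  // S = 0, A = 1
    for (const Member& m : members) {
        out.size = round_up(out.size, m.align);  // step 1: pad S to a multiple of a
        out.offsets.push_back(out.size);         // the member goes at offset S
        if (m.align > out.align) {
            out.align = m.align;                 // step 2: lcm is the max for powers of two
        }
        out.size += m.size;                      // step 3: reserve space for the member
    }
    out.size = round_up(out.size, out.align);    // final padding up to a multiple of A
    return out;
}
```

Feeding it the members of TestClass3 as {8, 1}, {16, 16}, {8, 1} gives offsets 0, 16, 32 and a size of 48; the TestClass4 order {8, 1}, {8, 1}, {16, 16} gives a size of 32.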

Additionally:

  • If any member is an array, its size is the number of elements multiplied by the size of each element, and its alignment requirement is the alignment requirement of an element.
  • If any member is a structure, its size and alignment requirement are calculated as above.
  • If any member is a union, its size is the size of its largest member plus just enough to make it a multiple of the least common multiple¹ of the alignments of all the members.
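
These three extra rules can be checked at compile time. The sketch below uses made-up types (Inner, Either, Outer) and only asserts relationships that should hold on any mainstream ABI, rather than concrete byte counts, since those depend on the platform.

```cpp
#include <cassert>
#include <cstddef>

struct Inner { char c; int i; };           // hypothetical nested structure
union Either { char bytes[5]; int i; };    // hypothetical union
struct Outer { char tag; Inner in; int arr[3]; };

// Array: size is count times element size, alignment is the element's.
static_assert(sizeof(int[3]) == 3 * sizeof(int), "array size");
static_assert(alignof(int[3]) == alignof(int), "array alignment");

// Nested structure: it is placed at a multiple of its own alignment,
// and the enclosing structure is at least as strictly aligned.
static_assert(offsetof(Outer, in) % alignof(Inner) == 0, "nested offset");
static_assert(alignof(Outer) >= alignof(Inner), "nested alignment");

// Union: big enough for its largest member, aligned for its strictest
// member, and padded so the size is a multiple of that alignment.
static_assert(sizeof(Either) >= sizeof(char[5]), "union holds bytes");
static_assert(sizeof(Either) >= sizeof(int), "union holds i");
static_assert(alignof(Either) >= alignof(int), "union alignment");
static_assert(sizeof(Either) % alignof(Either) == 0, "union padding");
```

On a typical platform where int is 4 bytes with 4-byte alignment, Either would usually come out as 8 bytes: 5 for the char array, rounded up to a multiple of 4.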

Consider your TestClass3:

  • S starts at 0 and A starts at 1.
  • char buf[8] requires 8 bytes and alignment 1, so S is increased by 8 to 8, and A remains 1.
  • __m128i vect requires 16 bytes and alignment 16. First, S must be increased to 16 to give the correct alignment. Then A must be increased to 16. Then S must be increased by 16 to make space for vect, so S is now 32.
  • char buf2[8] requires 8 bytes and alignment 1, so S is increased by 8 to 40, and A remains 16.
  • At the end, S is 40, which is not a multiple of A (16), so S must be increased by 8 to 48.

So the size of TestClass3 is 48 bytes.
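
The walk-through can be confirmed with offsetof. To keep the sketch independent of the SSE headers, it uses a made-up stand-in type, Vec16, with the same size and alignment as __m128i (16 and 16); Class3Like is likewise a hypothetical struct that mirrors TestClass3 with public members so that offsetof can be applied. The asserted values assume a mainstream ABI that follows the rules above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for __m128i: 16 bytes with 16-byte alignment.
struct alignas(16) Vec16 { unsigned char bytes[16]; };

// Same member sequence as TestClass3.
struct Class3Like {
    char  buf[8];
    Vec16 vect;
    char  buf2[8];
};

static_assert(offsetof(Class3Like, buf) == 0, "buf starts the structure");
static_assert(offsetof(Class3Like, vect) == 16, "8 padding bytes precede vect");
static_assert(offsetof(Class3Like, buf2) == 32, "buf2 follows vect directly");
static_assert(alignof(Class3Like) == 16, "A ends up as 16");
static_assert(sizeof(Class3Like) == 48, "40 rounded up to a multiple of 16");
```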

For elementary types (int, double, et cetera), the alignment requirements are implementation-defined and are usually largely determined by the hardware. On many processors, it is faster to load and store data when it has a certain alignment (usually when its address in memory is a multiple of its size). Beyond this, the rules above follow largely from logic; they put each member where it must be to satisfy alignment requirements without using more space than necessary.

Footnote

¹ I have worded this for a general case as using the least common multiple of alignment requirements. However, since alignment requirements are always powers of two, the least common multiple of any set of alignment requirements is the largest of them.

Eric Postpischil
  • Also note that _often though not always_, rearranging members from biggest to smallest will minimize the wasted padding space. – Mooing Duck Jan 24 '13 at 21:31
  • It should be noted that this is not specified by C++. It's specified by the C++ ABI for the platform/hardware. – bames53 Jan 24 '13 at 22:10
8

It is entirely up to the compiler how the size of a class is determined. A compiler will usually compile to match a certain application binary interface, which is platform dependent.

The behaviour you've observed, however, is pretty typical. The compiler is trying to align the members so that they each begin at a multiple of their size. In the case of TestClass3, one of the members is of type __m128i, and sizeof(__m128i) == 16. So it will try to align that member to begin at a byte that is a multiple of 16. The first member is of type char[8], so it takes up 8 bytes. If the compiler were to place the __m128i object directly after this first member, it would start at position 8, which is not a multiple of 16:

0               8               16              24              32              40
┌───────────────┬───────────────────────────────┬───────────────┬┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄
│    char[8]    │            __m128i            │    char[8]    │           
└───────────────┴───────────────────────────────┴───────────────┴┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄

So instead it prefers to do this:

0               8               16              24              32              40
┌───────────────┬┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┬───────────────────────────────┬───────────────┐┄┄┄
│    char[8]    │               │           __m128i             │    char[8]    │
└───────────────┴┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┴───────────────────────────────┴───────────────┘┄┄┄

The last member ends at byte 40, and the total is then rounded up to the next multiple of 16, which gives it a size of 48 bytes.

When you reorder the members to get TestClass4 the layout becomes:

0               8               16              24              32              40
┌───────────────┬───────────────┬───────────────────────────────┬┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄
│    char[8]    │    char[8]    │           __m128i             │        
└───────────────┴───────────────┴───────────────────────────────┴┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄

Now everything is correctly aligned: the arrays are at offsets that are multiples of 1 (the size of their elements), the __m128i object is at an offset that is a multiple of 16, and the total size is 32 bytes.

The compiler doesn't do this rearrangement itself because the standard requires that later members of a class have higher addresses:

Nonstatic data members of a (non-union) class with the same access control (Clause 11) are allocated so that later members have higher addresses within a class object.
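
Both effects, the size depending on member order and the order itself being preserved, can be checked with a short sketch. Vec16 is again a made-up stand-in for __m128i (16 bytes, 16-byte aligned), and the struct names are hypothetical; the sizes asserted are what a mainstream ABI is expected to produce.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for __m128i: 16 bytes with 16-byte alignment.
struct alignas(16) Vec16 { unsigned char bytes[16]; };

struct VecMiddle { char buf[8]; Vec16 vect; char buf2[8]; };  // TestClass3 order
struct VecLast   { char buf[8]; char buf2[8]; Vec16 vect; };  // TestClass4 order
struct VecFirst  { Vec16 vect; char buf[8]; char buf2[8]; };  // strictest member first

// Same members, different order, different amount of padding.
static_assert(sizeof(VecMiddle) == 48, "padding before vect and at the end");
static_assert(sizeof(VecLast) == 32, "no padding needed");
static_assert(sizeof(VecFirst) == 32, "no padding needed");

// Declaration order is kept: later members sit at higher offsets.
static_assert(offsetof(VecMiddle, buf) < offsetof(VecMiddle, vect), "order kept");
static_assert(offsetof(VecMiddle, vect) < offsetof(VecMiddle, buf2), "order kept");
```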

Joseph Mansfield
0

The rules are set in stone by the Application Binary Interface specification in use, which ensures compatibility between different systems for programs sharing this interface.

For GCC, this is the Itanium ABI.

(Unfortunately it is no longer publicly available, though I did find a mirror.)

Lightness Races in Orbit
  • @EricPostpischil: http://gcc.gnu.org/onlinedocs/libstdc++/manual/abi.html `The GNU C++ compiler uses an industry-standard C++ ABI starting with version 3. Details can be found in the ABI specification.` linking specifically to the code-sourcery page – Lightness Races in Orbit Jan 24 '13 at 21:38
  • I suppose I should have talked about the Code Sourcery set of ABIs rather than their Itanium one in particular – Lightness Races in Orbit Jan 24 '13 at 21:39
-1

If you want to ensure the alignment, you should use `#pragma pack(1)` in your header (.h) file. Look at this post: http://tedlogan.com/techblog2.html

sashas
  • That prevents padding, which actually means that most probably several data members will be misaligned. – Matteo Italia Jan 26 '13 at 00:29
  • That would be `#pragma pack(2)` - and it would still be non-optimal for bigger-than-16 bit stuff on most architectures (e.g. on x86 you would get slower access to regular `int`s, on other architectures even a hardware exception). – Matteo Italia Jan 26 '13 at 12:22
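
What the comments describe can be seen in a small sketch. It assumes a compiler that supports `#pragma pack(push, 1)` / `#pragma pack(pop)` (GCC, Clang and MSVC do); the struct names are made up for illustration. Packing lowers the alignment and removes padding, while asking for a stricter alignment is what C++11 `alignas` is for.

```cpp
#include <cassert>
#include <cstddef>

struct Natural { char c; int i; };  // default layout, i is padded into place

#pragma pack(push, 1)
struct Packed { char c; int i; };   // no padding, i may be misaligned
#pragma pack(pop)

struct alignas(16) Aligned { char c; int i; };  // stricter alignment requested

// Default layout keeps the int on its natural boundary.
static_assert(offsetof(Natural, i) % alignof(int) == 0, "int naturally aligned");

// Packing drops the structure's alignment to 1 and removes the padding.
static_assert(alignof(Packed) == 1, "packed alignment");
static_assert(offsetof(Packed, i) == 1, "int directly after the char");
static_assert(sizeof(Packed) == sizeof(char) + sizeof(int), "no padding");

// alignas raises the alignment, and the size is padded to match.
static_assert(alignof(Aligned) == 16, "requested alignment");
static_assert(sizeof(Aligned) % 16 == 0, "size is a multiple of 16");
```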