Object/Struct Alignment in C/C++

Question

#include <iostream>

using namespace std;

struct test
{
    int i;
    double h;
    int j;
};

int main()
{
    test te;
    te.i = 5;
    te.h = 6.5;
    te.j = 10;

    cout << "size of an int: " << sizeof(int) << endl; // Should be 4
    cout << "size of a double: " << sizeof(double) << endl; //Should be 8
    cout << "size of test: " << sizeof(test) << endl; // Should be 24 (word size of 8 for double)

    //These two should be the same
    cout << "start address of the object: " << &te << endl; 
    cout << "address of i member: " << &te.i << endl;

    //These two should be the same
    cout << "start address of the double field: " << &te.h << endl;
    cout << "calculate the offset of the double field: " << (&te + sizeof(double)) << endl; //NOT THE SAME

    return 0;    
}

Output:

size of an int: 4
size of a double: 8
size of test: 24
start address of the object: 0x7fffb9fd44e0
address of i member: 0x7fffb9fd44e0
start address of the double field: 0x7fffb9fd44e8
calculate the offset of the double field: 0x7fffb9fd45a0

Why do the last two lines produce different values? Something I am doing wrong with pointer arithmetic?

`(&te + sizeof(double))` should be `(&te + sizeof(int))` (or the size of all elements prior to the one you want the offset for ;) — Captain Obvlious, Dec 19 '11 at 21:08
I would think this object would be padded by 8 for each field, as seen by the total size of test being 24. — Alex, Dec 19 '11 at 21:09
It is 64 bits. Each field would be 8 bytes I'd think, as the size of the largest field is 8, and I didn't instruct the compiler to pack it. — Alex, Dec 19 '11 at 21:13

score 8 · Accepted Answer · edited Dec 19 '11 at 21:20

(&te + sizeof(double))

This is the same as:

&((&te)[sizeof(double)])

You should do:

(char*)(&te) + sizeof(int)

edited Dec 19 '11 at 21:20

Keith Thompson

answered Dec 19 '11 at 21:10

Piotr Praszmo

What do you mean by "byte*"? That isn't a valid keyword in C++, so I assume that is shorthand. – Alex Dec 19 '11 at 21:15
You can use `unsigned char*` or `size_t`. Keep in mind that the comments about alignment are also true and you cannot really rely on this behavior. – Piotr Praszmo Dec 19 '11 at 21:20
Does that really work? Doesn't that just have you go forward sizeof(char*) * sizeof(int) bytes? – Alex Dec 19 '11 at 22:12
It should go `sizeof(char)*sizeof(int)` bytes forward. And `sizeof(char)` should be 1. – Piotr Praszmo Dec 19 '11 at 22:22
This works, but it is sizeof(double), not sizeof(int). One annoying thing is that when you make it a char* (and try to pass it to std::cout) it prints a blank line. I just cast it back to a test*: (test*)(((char*)(&te) + sizeof(double))) – Alex Dec 19 '11 at 23:18

score 4 · Answer 2 · answered Dec 19 '11 at 21:12

You are correct -- the problem is with pointer arithmetic.

When you add to a pointer, you advance it by a multiple of the size of the type it points to.

Therefore, &te + 1 will be 24 bytes after &te.

Your code &te + sizeof(double) will add 24 * sizeof(double) or 192 bytes.

answered Dec 19 '11 at 21:12

Drew Dormann

It's not clear to me what you want. If you want to calculate the offset of that member variable, `offsetof(test,h)` will work. – Drew Dormann Dec 19 '11 at 21:31
Just playing around with serialization/deserialization. Trying to see how I would calculate fields positions if I have access to a void*, or something like that. – Alex Dec 19 '11 at 21:41
Then `offsetof(test,h)` will tell you. – Drew Dormann Dec 19 '11 at 21:45
Cool, but how do I actually traverse to that location? – Alex Dec 19 '11 at 22:10

score 3 · Answer 3 · edited Dec 19 '11 at 21:49

First, your code is wrong: you'd want to add the size of the fields before h (i.e. an int); there's no reason to assume double. Second, you need to normalise everything to char * first, because pointer arithmetic is done in units of the thing being pointed to.

More generally, you can't rely on code like this to work. The compiler is free to insert padding between fields to align things to word boundaries and so on. If you really want to know the offset of a particular field, there's an offsetof macro that you can use. It's defined in <stddef.h> in C, <cstddef> in C++.

Most compilers offer an option to remove all padding (e.g. GCC's __attribute__ ((packed))).

(I believe it's only well-defined to use offsetof on POD types.)

edited Dec 19 '11 at 21:49

answered Dec 19 '11 at 21:09

Oliver Charlesworth

I've seen code using `__attribute__ ((packed))` blow up when you try to use the address of a misaligned member. – Keith Thompson Dec 19 '11 at 21:21
@KeithThompson: I would hope that the compiler wouldn't be stupid enough to generate code that would fail. If the platform doesn't support misaligned accesses, I would want the compiler to do either (a) fail to compile, or (b) do the accesses the longwinded way... – Oliver Charlesworth Dec 19 '11 at 21:29
Definitely want to add the size of the largest field (double == 8 in my case) as these are word aligned. – Alex Dec 19 '11 at 21:31
@windfinder: But as I pointed out, code like this is fundamentally not to be relied on, so it doesn't really matter... – Oliver Charlesworth Dec 19 '11 at 21:32
@Oli Charlesworth -- Fair enough, but still inaccurate. – Alex Dec 19 '11 at 21:35
@OliCharlesworth: I hoped so too, but I was disappointed. See [this program](https://gist.github.com/7ccb91c3b015ec128395). The problem is that once you take the address of a misaligned member, the compiler has no way of knowing that it's misaligned. – Keith Thompson Dec 19 '11 at 21:41
@windfinder: It's not inaccurate; there is no reason to assume that the compiler is using `sizeof(double)` as its alignment... – Oliver Charlesworth Dec 19 '11 at 21:42
@Oli Charlesworth: The part about using int was inaccurate, which you just removed. At the time you wrote the comment you still had that in the post. – Alex Dec 19 '11 at 21:43
@Keith: That's an interesting case I hadn't considered; I wonder if there's anything here on SO about that (it would form the basis of an interesting question). – Oliver Charlesworth Dec 19 '11 at 21:45
Using `__attribute__((packed))` or its equivalent is similar to tearing off all those "Warranty void if this label removed" labels on electronic equipment. Once you've started mucking around at that level, you're responsible for any weirdness you encounter (and, yes, sometimes you have to muck around at that level, but it's -- thankfully -- rare nowadays). – Max Lybbert Dec 19 '11 at 21:59
@MaxLybbert: Yes, it would seem so. You would hope that the GCC docs would say that, though! – Oliver Charlesworth Dec 19 '11 at 22:00
@OliCharlesworth: The gcc documentation doesn't mention this issue. I guess after 543 answers it's about time to post [my first question](http://stackoverflow.com/questions/8568432/is-gccs-attribute-packed-unsafe/8568441#8568441). – Keith Thompson Dec 19 '11 at 22:31

score 2 · Answer 4 · answered Dec 19 '11 at 21:08

struct test
{
    int i;
    int j;
    double h;
};

Since your largest data type is 8 bytes, the compiler adds padding around your ints. Either group the members so the ints sit next to each other (as above) or put the largest data type first, or else account for the padding yourself. Hope this helps!

answered Dec 19 '11 at 21:08

Nico

Or just don't worry about it. Declaring the members in a logical order may be worth the cost of extra padding -- especially since the padding is going to vary from one system to another. – Keith Thompson Dec 19 '11 at 21:25
Playing around with my compiler I don't see this rearrangement. Are there specific compilers/settings that trigger this? Currently using G++. – Alex Dec 19 '11 at 21:38
@windfinder: I'm not sure what you mean. The language guarantees that members will be allocated in the order in which they're declared, possibly with padding. The answer suggests *manually* rearranging the members in your source code to avoid padding. It's a valid suggestion; my comment suggests that it might or might not be worthwhile. – Keith Thompson Dec 19 '11 at 22:37

score 2 · Answer 5 · answered Dec 19 '11 at 21:11

&te + sizeof(double) is equivalent to &te + 8, which is equivalent to &((&te)[8]). That is — since &te has type test *, &te + 8 adds eight times the size of a test.

answered Dec 19 '11 at 21:11

ruakh

Right, this is what Drew is saying as well. What is the correct syntax then? Thanks! – Alex Dec 19 '11 at 21:21
@windfinder: Well, you already saw that `&te.h` gives the answer you want . . . or are you looking for something different? I'm a bit unclear on what you're trying to do, sorry. :-/ – ruakh Dec 19 '11 at 21:31
Just playing around with serialization/deserialization. Trying to see how I would calculate fields positions if I have access to a void*, or something like that. – Alex Dec 19 '11 at 21:39

score 1 · Answer 6 · edited May 23 '17 at 11:52

Compilers are free to space out structs however they want past the first member, and usually use padding to align to word boundaries for speed.

See these:
C struct sizes inconsistence
Struct varies in memory size?
et al.

edited May 23 '17 at 11:52

Community

answered Dec 19 '11 at 21:11

Kevin

Not quite *however* they want. The first member is always at offset 0, and the members are always laid out in declared order (at least for C and, in C++, for POD types). – Keith Thompson Dec 19 '11 at 21:25
@KeithThompson Yes, yes. Rephrased. – Kevin Dec 19 '11 at 21:29

score 1 · Answer 7 · answered Dec 19 '11 at 21:18

You can see what's going on more clearly using the offsetof() macro:

#include <iostream>
#include <cstddef>

using namespace std;

struct test
{
    int i;
    double h;
    int j;
};

int main()
{
    test te;
    te.i = 5;
    te.h = 6.5;
    te.j = 10;

    cout << "size of an int:   " << sizeof(int)    << endl; // Should be 4
    cout << "size of a double: " << sizeof(double) << endl; // Should be 8
    cout << "size of test:     " << sizeof(test)   << endl; // Should be 24 (word size of 8 for double)

    cout << "i: size = " << sizeof te.i << ", offset = " << offsetof(test, i) << endl;
    cout << "h: size = " << sizeof te.h << ", offset = " << offsetof(test, h) << endl;
    cout << "j: size = " << sizeof te.j << ", offset = " << offsetof(test, j) << endl;

    return 0;
}

On my system (x86), I get the following output:

size of an int:   4
size of a double: 8
size of test:     16
i: size = 4, offset = 0
h: size = 8, offset = 4
j: size = 4, offset = 12

On another system (SPARC), I get:

size of an int:   4
size of a double: 8
size of test:     24
i: size = 4, offset = 0
h: size = 8, offset = 8
j: size = 4, offset = 16

The compiler will insert padding bytes between struct members to ensure that each member is aligned properly. As you can see, alignment requirements vary from system to system; on one system (x86), double is 8 bytes but only requires 4-byte alignment, and on another system (SPARC), double is 8 bytes and requires 8-byte alignment.

Padding can also be added at the end of a struct to ensure that everything is aligned properly when you have an array of the struct type. On SPARC, for example, the compiler adds 4 bytes of padding at the end of the struct.

The language guarantees that the first declared member will be at an offset of 0, and that members are laid out in the order in which they're declared. (At least that's true for simple structs; C++ metadata might complicate things.)
