Very simple structure alignment dilemma C/C++

Question

Given that I understand alignment correctly, and given that we have the following struct:

struct someStruct{
    short i1;
    short i2;
    short i3;
};

Assume that short is a 16-bit integer with 2-byte alignment and that we are using a 32-bit x86 machine. I understand that the size of this structure will be 6 bytes. What I don't understand, however, is what happens if the struct starts at an odd memory address. Is padding added to the struct so that it "starts" at an even address instead? If we have an array of these structs, would only the first element have this extra padding? Furthermore, does it matter whether an even starting address falls at the beginning of a processor WORD (i.e. the smallest readable memory block) or halfway through one? Is padding added in either of those two cases? Again, would that padding be added only to the first element of an array of these structs? Would the answers to any of my questions mean that the size of the struct varies depending on where in memory it is created? Would it mean that some elements of an array of structs of the same type have different byte sizes than others?

I would also like to know whether there are any differences between C and C++ specific to this topic, and to be reminded whether an array of structures can be traversed with pointer arithmetic just as an array of primitive types can.

TO CLARIFY AND UPDATE:

As of now, I know the struct cannot be stored starting at an odd address. I still wonder, however, whether it makes any difference if the address where the struct is stored falls halfway through a WORD (i.e. the smallest readable memory block) or at the beginning of one: any difference in how it is aligned, stored, padded, or fit into arrays and other data structures.

TO MAKE IT MORE CONCISE:

Is it even possible for a data structure such as this one to start at a memory address halfway through a memory WORD (i.e. the smallest readable memory block), whether or not it is in an array?

ILLUSTRATING(THIS IS ANOTHER RECENT UPDATE PLEASE SEE):

After reading some answers, I took a screenshot of Wikipedia to show the source of my confusion:

This might help you: https://stackoverflow.com/questions/58435348/what-is-bit-padding-or-padding-bits-exactly/58436082#58436082 — NathanOliver, Jul 13 '20 at 20:40
If your computer emplaces such an object in an unaligned location, something is very wrong.. — Asteroids With Wings, Jul 13 '20 at 20:41
*I understand that the size of this structure will be of 6 bytes* - absolutely not necessary. It can be padded in the end, it can be padded at each member — Eugene Sh., Jul 13 '20 at 20:42
"what happens if the struct starts on an odd memory address." It doesn't, unless your manually managing object lifetimes yourself, which is _super_ advanced stuff. — Mooing Duck, Jul 13 '20 at 20:42
Structs have alignment requirements just like scalars. If you do not follow them, the same bad things (potentially) happen that would happen if you mis-aligned a scalar. — Ajay Brahmakshatriya, Jul 13 '20 at 20:46
@EugeneSh. It is however the general rule that a struct like the one I showed will be packed in 6 bytes without extra padding when programming in c/c++ and 32bit x86. — Matias Chara, Jul 13 '20 at 20:46
@Matias "`C/C++` and 32 bit x86" -- what compiler? What optimization flags? — Ajay Brahmakshatriya, Jul 13 '20 at 20:54

score 4 · Answer 1 · answered Jul 13 '20 at 20:42

The struct itself has the alignment of 2, so it simply cannot be conformingly created at odd addresses.

yuri kilochek


Important side note: Using placement-new, there are non-conforming ways to do this accidentally :( – Mooing Duck Jul 13 '20 at 20:44
Does it matter if it starts halfway through a WORD or at the beginning? Does that change anything? – Matias Chara Jul 13 '20 at 20:44
@MatiasChara: nothing in C++ cares about WORDs, so it doesn't matter. It could be at the start of a word, or the middle, or the end. All that matters is that the compiler will automatically put it at the right alignment for you. – Mooing Duck Jul 13 '20 at 20:44
If the processor is fetching 16-bits at a time, it will need to make 2 fetches for a 16-bit value at an odd address. – Thomas Matthews Jul 13 '20 at 20:46
@MatiasChara if by `WORD` you mean a two-byte integer, then the answer is that it cannot start halfway, because such an integer also has an alignment of 2 and can only start at even addresses. – yuri kilochek Jul 13 '20 at 20:47
No, by word I mean the other definition of word (confusing, right?), meaning the memory block load size (4 bytes on 32-bit). – Matias Chara Jul 13 '20 at 20:47
@MooingDuck I'm not specifically asking whether C/C++ cares about WORDS (**i.e. the smallest memory block readable by the CPU**); I'm asking how the structure is stored in memory. As of now, I know it cannot be stored starting at an odd address. I still wonder, however, whether it makes any difference if the address where the struct is stored falls halfway through a WORD (see definition above) or at the beginning of one: any difference in how it is aligned, stored, padded, or fit into arrays and other data structures. – Matias Chara Jul 13 '20 at 21:00
A type has constant alignment and constant size; they can't change. An alignment of 2 generally means that a variable of that type can be placed at an address divisible by 2. It's hard to follow what exactly you are asking. – KamilCuk Jul 13 '20 at 21:05
The struct's alignment requirement must be at least as large as the largest alignment requirement of any member, but it *could* be larger. This is a function of the C implementation, not (directly) the hardware. Historically, some implementations have assigned *all* structure types 4-(or 8- or 16-)byte alignment. In all cases, the implementation manages both the alignment requirement itself and the laying out of data to satisfy that requirement. If you let it do its job then it will make choices that work. – John Bollinger Jul 13 '20 at 21:17

score 2 · Accepted Answer · answered Jul 13 '20 at 22:03

Assuming that short is a 16-bit integer with 2-byte alignment and that we are using a 32-bit machine with x86. I understand that the size of this structure will be 6 bytes.

Not necessarily. The implementation is at liberty to include padding after any or all members, at its discretion. Implementations typically make such decisions based on alignment considerations, but they are not bound to that nor to any particular formula.

The alignment requirement for your structure must be at least as large as the largest alignment requirement of any member, but that does not mean a whole lot because the C (or C++) implementation makes its own choices about the alignment requirements of scalar types, and because it is free to choose larger alignment requirements for aggregate and union types than is necessary to satisfy the alignment requirements of their members. Historically, some implementations have done so under various circumstances. Thus, even if we assume that your implementation adds padding only for alignment purposes, your structure might still be larger than six bytes.

Implementations typically adhere to an established application binary interface, which will specify data alignment and layout rules, but doing so is a means to an end (binary compatibility), not a language requirement.

What I don't understand, however, is what happens if the struct starts on an odd memory address.

If the structure type has an alignment requirement of at least two then it won't start at an odd address unless you somehow force it by one or another flavor of pointer trickery. If you do so force a misalignment then the behavior of accessing the structure through the misaligned pointer is undefined. In practice, among the more likely behaviors in general are (i) it just works, (ii) it works but accesses are slowed, and (iii) accesses cause a runtime signal to be raised.

Does it add padding to the struct so that it "starts" in an even one instead?

Padding is a characteristic of the type, not of instances, and the first byte of the type is never a padding byte. Rather, supposing you let the implementation allocate the object, it will align the allocation correctly for the type. The same applies in C++ if you use the ordinary new operator (not placement new), and if you allocate memory manually with malloc() then the beginning of the allocated space is guaranteed to be suitably aligned for any type with a fundamental alignment requirement, which covers every type that is not explicitly over-aligned. This may mean that there is space preceding an instance that is not attributed to any object, but that does not constitute "padding" in the conventional sense of the term.

Since arrays are laid out as a contiguous sequence of objects without gaps, and the size of each object is a multiple of its alignment requirement, it follows that as long as the first element of an array is correctly aligned for its type, so will be all the subsequent elements.

Furthermore, does it matter whether an even starting address falls at the beginning of a processor WORD (i.e. the smallest readable memory block) or halfway through one?

It shouldn't matter to you. If it matters to the hardware or to the C (C++) implementation itself then it is the implementation's responsibility to take that properly into account.

Would the answers to any of my questions mean that the size of the struct is variable depending on where in memory it is created? Would it mean that some elements of struct arrays of uniform type would have different byte sizes than others?

No and no. The size and alignment requirement of every type are fixed characteristics of the type. They do not vary from instance to instance. The required relationship between these characteristics (that the size is a multiple of the alignment requirement) helps to ensure that neither needs to vary. That they do not vary relieves the implementation from tracking instance-level metadata, which would be wasteful.

This also means that pointer arithmetic and array indexing (which are fundamentally the same thing) work for arrays of structure type. You can use either mechanism to access array members, details of the element type notwithstanding.

I also ask if there are any differences specific to this topic between C and C++.

C++ has a richer type system than does C, but the parts that are congruent have substantially the same rules.

Is it even possible for a data structure such as this one to start at a memory address halfway through a memory WORD (i.e. the smallest readable memory block), whether or not it is in an array?

Neither C nor C++ forbids it. In fact, they are not concerned with the question at all. It is up to implementations to make that determination, and to some extent, it is possible for different implementations targeting the same operating environment to make different choices.

Mooing Duck · Answer 3 · 2020-07-13T21:56:10.127

Since the alignment of the struct is 2, the compiler will never place it at an odd address, only multiples of 2, so you (almost) never have to worry about aligning. In some cases (not your sample), it may add padding between members to make sure each member is properly aligned, and/or it may add padding at the end so that if the object were to be put in an array, all subsequent elements would automatically be aligned, but I know of no reason a compiler would ever put padding at the start of a struct. Normal compile time arrays do not need any invisible padding on top of the struct padding.

Nothing in C++ cares about WORDs, so it doesn't matter. structs and primitives can be at the start of a word, or the middle, or the end, or span multiple words. They do not affect each other in any way. All that matters is that the compiler will automatically put it at the right alignment for you.

In your array case, yes, an array of 2 structs could be offset from the word size slightly.

structs:       [i1  ][i2  ][i3  ][i1  ][i2  ][i3  ]
words:   [          ][          ][          ][          ]
bytes:   [ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ]

That's totally valid. The CPU never works on entire structs in a single operation, only on individual primitives, so if it's trying to access the i3 member of the first struct, it would merely load up the second word and use the bytes it cares about. You can absolutely use pointer math to iterate through this array of structs just like any array of primitives, 100% the same.

I am not aware of any differences between C and C++ in this area.

@MatiasChara: No. My example there was specifically an array of structs — Mooing Duck, Jul 13 '20 at 21:50
Re “The CPU never works on entire structs in a single operation”: There are various circumstances in which a CPU will work on an entire structure in a single operation. If the structure is small enough to fit in a register, the compiler may handle an assignment of the structure by loading all of it into a register and storing it to the new memory location. Some ABIs call for passing small structures in single registers. On machines with SIMD, a compiler might optimize various arithmetic operations on all of a structure’s elements to a vector arithmetic instruction. — Eric Postpischil, Jul 14 '20 at 00:08
