Assuming that short is a 16-bit integer with 2-byte alignment and that
we are using a 32-bit x86 machine, I understand that the size of
this structure will be 6 bytes.
Not necessarily. The implementation is at liberty to include padding after any or all members, at its discretion. Implementations typically make such decisions based on alignment considerations, but they are bound neither to that nor to any particular formula.
The alignment requirement of your structure must be at least as strict as the strictest alignment requirement of any member. That constrains little, however, because the C (or C++) implementation makes its own choices about the alignment requirements of scalar types, and because it is free to choose stricter alignment requirements for aggregate and union types than are needed to satisfy the alignment requirements of their members. Historically, some implementations have done so under various circumstances. Thus, even if we assume that your implementation adds padding only for alignment purposes, your structure might still be larger than six bytes.
Implementations typically adhere to an established application binary interface, which will specify data alignment and layout rules, but doing so is a means to an end (binary compatibility), not a language requirement.
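Rather than predicting the layout, you can ask the implementation what it chose. The sketch below is only illustrative: `struct sample` is a hypothetical stand-in for the structure in the question (assumed here to be three `short` members), and `print_layout` and `padding_bytes` are names invented for this example. It requires C11 for `_Alignof`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the structure in the question (an assumption). */
struct sample {
    short a;
    short b;
    short c;
};

/* Total padding (interior plus trailing) the implementation inserted; may be 0. */
size_t padding_bytes(void)
{
    return sizeof(struct sample) - 3 * sizeof(short);
}

/* Print the layout this particular implementation chose. */
void print_layout(void)
{
    /* These relationships are guaranteed by the language, whatever the ABI. */
    assert(offsetof(struct sample, a) == 0);
    assert(sizeof(struct sample) % _Alignof(struct sample) == 0);
    assert(_Alignof(struct sample) >= _Alignof(short));

    printf("sizeof   = %zu\n", sizeof(struct sample));
    printf("_Alignof = %zu\n", (size_t)_Alignof(struct sample));
    printf("offsets  = a:%zu b:%zu c:%zu\n",
           offsetof(struct sample, a),
           offsetof(struct sample, b),
           offsetof(struct sample, c));
    printf("padding  = %zu byte(s)\n", padding_bytes());
}
```

Calling `print_layout()` from `main` reports the actual numbers for your compiler and target. Only the asserted relationships are portable; the specific values printed are not.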
What I don't understand, however, is what happens if the struct starts on an odd memory address.
If the structure type has an alignment requirement of at least two then it won't start at an odd address unless you somehow force it by one or another flavor of pointer trickery. If you do so force a misalignment then the behavior of accessing the structure through the misaligned pointer is undefined. In practice, among the more likely behaviors in general are (i) it just works, (ii) it works but accesses are slowed, and (iii) accesses cause a runtime signal to be raised.
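If the bytes of a structure really do sit at an arbitrary offset (in a network buffer, say), one well-defined way to read them is to copy them into a properly aligned object instead of dereferencing a misaligned pointer. The following is a minimal sketch of that idea, not the only approach. `struct sample` and all function names are hypothetical, and the alignment check assumes the common case where converting a pointer to `uintptr_t` yields the numeric address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the structure in the question (an assumption). */
struct sample {
    short a;
    short b;
    short c;
};

/* Nonzero if p is suitably aligned for struct sample.
   Assumes the pointer-to-integer conversion yields the address,
   which is implementation-defined but usual in practice. */
int is_aligned_for_sample(const void *p)
{
    return (uintptr_t)p % _Alignof(struct sample) == 0;
}

/* Read a struct sample from any byte address, aligned or not.
   memcpy works bytewise, so it imposes no alignment requirement on bytes. */
struct sample load_sample(const unsigned char *bytes)
{
    struct sample s;
    assert(bytes != NULL);
    memcpy(&s, bytes, sizeof s);
    return s;
}

/* Store a value one byte into a buffer, then read it back safely. */
int roundtrip_at_offset_one(void)
{
    unsigned char buffer[sizeof(struct sample) + 1];
    struct sample in = { 1, 2, 3 };

    /* buffer + 1 may be misaligned for struct sample; never cast and dereference it. */
    memcpy(buffer + 1, &in, sizeof in);
    struct sample out = load_sample(buffer + 1);
    return out.a == 1 && out.b == 2 && out.c == 3;
}
```

By contrast, `((struct sample *)(buffer + 1))->a` would be the "pointer trickery" described above, with undefined behavior whenever the address is misaligned.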
Does it add padding to the struct so that it "starts" in an even one instead?
Padding is a characteristic of the type, not of instances, and a structure never has padding before its first member. Rather, supposing you let the implementation allocate the object, it will align the allocation correctly for the type. The same applies in C++ if you use the ordinary new operator (not placement new), and if you allocate memory manually with malloc(), then the beginning of the allocated space is guaranteed to be properly aligned for any object type with a fundamental alignment requirement. This may mean that there is space preceding an instance that is not attributed to any object, but that does not constitute "padding" in the conventional sense of the term.
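Both claims can be checked on a given implementation. In this sketch, `struct sample` and `malloc_result_is_aligned` are hypothetical names, and the address test again assumes that converting a pointer to `uintptr_t` yields the numeric address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the structure in the question (an assumption). */
struct sample {
    short a;
    short b;
    short c;
};

/* Returns 1 if a malloc'd block is aligned for struct sample,
   0 if it is not, and -1 if the allocation itself failed. */
int malloc_result_is_aligned(void)
{
    struct sample *p = malloc(sizeof *p);
    if (p == NULL) {
        return -1;
    }

    /* The first member lives at the very start of the object: no leading padding. */
    assert((void *)p == (void *)&p->a);

    int ok = (uintptr_t)p % _Alignof(struct sample) == 0;
    free(p);
    return ok;
}
```

On a conforming implementation this should never return 0. Any slack between the previous allocation and this one belongs to the allocator, not to the structure.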
Since arrays are laid out as a contiguous sequence of objects without gaps, and the size of each object is a multiple of its alignment requirement, it follows that as long as the first element of an array is correctly aligned for its type, so will be all the subsequent elements.
Furthermore, does it matter if the starting address is even but it starts halfway through a processor WORD (i.e. the smallest readable memory block) or at the beginning of it?
It shouldn't matter to you. If it matters to the hardware or to the C (C++) implementation itself then it is the implementation's responsibility to take that properly into account.
Would the answers to any of my questions mean that the size of the struct is variable depending on where in memory it is created? Would it mean that some elements of struct arrays of uniform type would have different byte sizes than others?
No and no. The size and alignment requirement of every type are fixed characteristics of the type. They do not vary from instance to instance. The required relationship between these characteristics (that the size is a multiple of the alignment requirement) helps to ensure that neither needs to vary. That they do not vary relieves the implementation from tracking instance-level metadata, which would be wasteful.
This also means that pointer arithmetic and array indexing (which are fundamentally the same thing) work for arrays of structure type. You can use either mechanism to access array elements, details of the element type notwithstanding.
I also ask if there are any differences specific to this topic between C and C++.
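A short sketch makes the fixed element size concrete: the distance between consecutive elements is always `sizeof` the element type, and `&arr[i]` is the same address as `arr + i`. As before, `struct sample` and the function names are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the structure in the question (an assumption). */
struct sample {
    short a;
    short b;
    short c;
};

/* Byte distance from element i to element i + 1.
   The caller must ensure i + 1 is at most the array length;
   forming a one-past-the-end address is allowed. */
size_t element_stride(const struct sample *arr, size_t i)
{
    assert(arr != NULL);
    return (size_t)((const char *)&arr[i + 1] - (const char *)&arr[i]);
}

/* Returns 1 if indexing and pointer arithmetic agree for all n elements
   and every stride equals sizeof(struct sample); 0 otherwise. */
int indexing_matches_pointer_arithmetic(struct sample *arr, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (&arr[i] != arr + i) {
            return 0;
        }
        if (element_stride(arr, i) != sizeof(struct sample)) {
            return 0;
        }
    }
    return 1;
}
```

Because any trailing padding is counted in `sizeof(struct sample)`, the array itself needs no extra gaps, and `sizeof arr` is exactly the element count times the element size.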
C++ has a richer type system than does C, but the parts that are congruent have substantially the same rules.
Is it even possible that a data structure such as this one starts at a memory address halfway through a memory WORD (i.e. the smallest readable memory block), whether in an array or not?
Neither C nor C++ forbids it. In fact, they are not concerned with the question at all. It is up to implementations to make that determination, and to some extent, it is possible for different implementations targeting the same operating environment to make different choices.