In general, the standard places some constraints on the size of a type as reported by `sizeof`. The basic constraint is that it has to be a multiple of the size of `char`, with `sizeof(char)` defined as 1.
For padding bits within a type, refer to 6.2.6.1, which leaves the representation mostly implementation-defined. 6.2.6.2p5 states that the values of padding bits are unspecified, so there is no need to preserve them, but there are two important constraints on the padding bits:
- A positive value of a signed integer type shall have the same representation as the same value of the corresponding unsigned type. This guarantees compatibility between the signed and unsigned variants of the same type for positive values within the range of the signed variant.
- If all bits are zero, this represents the value `0`. So all padding bits have to be 0 in that representation, too. However, the reverse is not true: a value of `0` does not imply that all bits are zero (thanks to MattMcNabb).
Both constraints include the padding bits, as they are part of the internal representation. From a more practical view, padding bits should be set to zero unless they are parity bits or similar that depend on the other bits (and even then the second constraint has to be met).
That is a rough interpretation. For details, refer to the rest of the cited sections.
On MSP430X, a 20-bit `int` is of little practical use. The 20-bit width is mostly meant to extend the addressing range, not for integer arithmetic (although the instruction set apparently supports it; I was wrong here in a former edit).
Pointers have a `sizeof` of 4 (32 bits with 8-bit bytes), but only use 20 bits. Some embedded compilers might support special `short`/`near`/... qualifiers, effectively providing two different pointer sizes. This is, however, actually against the standard. (I'm a bit ambivalent here: optimization versus portability.)
MSP430X is one of the platforms where using the dedicated types from `stdint.h` (`uintptr_t`) and `stddef.h` (e.g. `size_t`) is essential, as casting a pointer to/from `int` will eventually fail. Even more, the reason for the standard's only requirement for `(u)intptr_t` (temporary storage, no operations) becomes clear. Accordingly, nothing is guaranteed about the padding bits, even for the null pointer.
The reason for this large overhead (37.5% unused bits) is that the MSP430X has no instructions to read/write 20-bit or even 24-bit values to/from memory (and such odd sizes would make array indexing very costly). Only some constants can be 20 bits, as they are encoded in the instruction using an extension word that holds 4 bits, while the remaining 16 bits follow the opcode as for other instructions. This is likely one of the last (small) architectures to show how much additional effort is needed for address space expansion while maintaining compatibility.
Note that the MSP430X has some additional pitfalls for 20-bit addressing modes. For instance, interrupt handlers have to reside in the lower 64 KiB, as the vector table only contains 16-bit entries. This actually prohibits the vector table from being defined in C as an array of function pointers (as its entries cannot be freely converted to any other function pointer type and back).