In Theory
How to get the memory granularity of a CPU in C?
First, you read the instruction set architecture manual. It may specify that certain instructions require certain alignments, or even that the addressing forms in certain instructions cannot represent non-aligned addresses. It may specify other properties regarding alignment.
Second, you read the processor manual. It may specify performance characteristics (such as that unaligned loads or stores are supported but may be slower or use more resources than aligned loads or stores) and may specify various options allowed by the instructions set architecture.
Third, you read the operating system documentation. Some architectures allow the operating system to select features related to alignment, such as whether unaligned loads and stores are made to fail or are supported albeit with slower performance than aligned loads or stores. The operating system documentation should have this information.
In Practice
For many programming situations, what you need to know is not the “memory granularity” of a CPU but the alignment requirements of the C implementation you are using (or of whatever language you are using). And, for the most part, you do not need to know the alignment requirements directly but just need to follow the language rules about managing objects—use objects with declared types, do not use casts to convert pointers between incompatible types exceed where specific rules allow it, use the suitably aligned memory as provided by malloc
rather than adjusting your own pointers to bytes, and so on. Following these rules will give good alignment for the objects in your program.
In C, when you define an array, the element size will automatically be the size that C implementation needs for its alignment. For example, long double x[100];
may use 16 bytes for each array element even though the hardware uses only ten bytes for a long double
. Or, for any struct foo
that you define, the compiler will automatically include padding as needed in the structure to give the desired alignment, and any array struct foo x[100];
will already include that padding. sizeof(struct foo)
will be the same as sizeof x[0]
, because each structure object has that padding built in, even just for a single structure object, not just for elements in arrays.
When you do need to know the alignment that a C implementation requires for a type, you can use C’s _Alignof
operator. The expression _Alignof(type)
provides the alignment required for type
.
Other
… properly memory aligned.
Proper alignment is a matter of degrees:
- What the processor supports may determine whether your program works or does not work. An improper alignment is one that causes your program to trap.
- What is efficient with respect to individual loads and stores may affect how fast your program runs. An improper alignment is one that causes your program to execute more slowly.
- In certain performance-critical situations, alignment with respect to cache and memory mapping features can also affect performance.