I am sure I do. The matrix library Eigen3 is likely much faster than any of my own matrix implementations thanks to alignment.
I recently started investigating alignment in C++. I came across the alignas specifier and the alignof operator, and decided to run some tests.
My first finding was that members of a struct get aligned automatically. Let's take the following struct as an example:
struct MyStruct
{
    char m_c; // 1 byte
    double m_x; // 8 bytes
};
and compare with this one:
struct MyAlignedStruct
{
    char m_c; // 1 byte
    alignas(8) double m_x; // 8 bytes
};
where I used alignas instead of adding a padding member (char[7]), which, according to my understanding, is equivalent.
Now, the memory viewer for both structs showed the following:
62 00 00 00 8e a8 79 35 00 00 00 00 00 00 10 40 // MyStruct
62 ff ff ff 24 00 00 00 00 00 00 00 00 00 10 40 // MyAlignedStruct
The first byte corresponds to a char ('b'). With MyStruct, the next 7 bytes are filled with something, and the last 8 bytes represent the double. With MyAlignedStruct, something very similar occurs. The sizeof operator yields 16 bytes for both structs (I expected 9 bytes for MyStruct).
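The layout described above can also be checked without a memory viewer. The following is a minimal sketch, assuming a typical 64-bit platform where alignof(double) is 8; the asserted numbers are not guaranteed by the C++ standard. PaddedStruct, is_aligned and check_member_addresses are hypothetical names added only for illustration.

```cpp
#include <cassert>  // assert
#include <cstddef>  // offsetof, std::size_t
#include <cstdint>  // std::uintptr_t

struct MyStruct {
    char m_c;    // 1 byte
    double m_x;  // 8 bytes
};

struct MyAlignedStruct {
    char m_c;               // 1 byte
    alignas(8) double m_x;  // 8 bytes, alignment requested explicitly
};

// Hypothetical variant with the padding written out by hand.
struct PaddedStruct {
    char m_c;
    char m_pad[7];
    double m_x;
};

// Assumption: alignof(double) == 8, as on common 64-bit ABIs.
static_assert(alignof(double) == 8, "sketch assumes 8-byte aligned doubles");

// All three structs have the same size and the same offset for the double.
static_assert(sizeof(MyStruct) == 16, "char + 7 padding bytes + double");
static_assert(sizeof(MyAlignedStruct) == 16, "same size as MyStruct");
static_assert(sizeof(PaddedStruct) == 16, "same size as MyStruct");
static_assert(offsetof(MyStruct, m_x) == 8, "double starts at byte 8");
static_assert(offsetof(MyAlignedStruct, m_x) == 8, "double starts at byte 8");
static_assert(offsetof(PaddedStruct, m_x) == 8, "double starts at byte 8");

// True when the address p is a multiple of `alignment` bytes.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Runtime counterpart of the compile-time checks: the double member of a
// live object sits on an 8-byte boundary in both versions.
inline void check_member_addresses() {
    MyStruct a{'b', 4.0};
    MyAlignedStruct b{'b', 4.0};
    assert(is_aligned(&a.m_x, alignof(double)));
    assert(is_aligned(&b.m_x, 8));
}
```

If the translation unit compiles, the static_assert lines have already confirmed the sizes and offsets; calling check_member_addresses() from main repeats the check on actual object addresses.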
So here comes my first question: Why do I need alignas if the compiler aligns on its own?
My second finding was that alignas(..) does not speed up my program. My experiment was the following. Imagine the following simple struct:
struct Point
{
    double m_x, m_y, m_z;
};
If I fill a vector with instances of that struct, and assuming the first instance is 32-byte aligned, each struct occupies 24 bytes, so the following instances no longer all start on a 32-byte boundary. Honestly, I'm not sure how aligning could increase speed; otherwise, I most probably wouldn't be writing here. Nevertheless, I used alignas to obtain the following struct:
struct alignas(32) Point
{
    double m_x, m_y, m_z;
};
Now, contiguous instances of Point would start on a multiple of 32 bytes. I tested both versions: after filling a huge vector with instances of the structs, I summed all the doubles and recorded the time. I found no differences between the 32-byte aligned struct and the other one.
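The experiment described above can be sketched roughly as follows. This is a reconstruction under assumptions, not the exact test code: PackedPoint, AlignedPoint, sum_all and time_sum_ms are hypothetical names, and the specifier is written after the struct keyword (struct alignas(32) ...), which is the position where it applies to the class type. The size checks assume sizeof(double) is 8. Note also that the heap storage of std::vector is only guaranteed to honor the 32-byte alignment from C++17 onwards.

```cpp
#include <cassert>  // assert
#include <chrono>   // std::chrono::steady_clock
#include <cstddef>  // std::size_t
#include <vector>   // std::vector

// Three doubles with default alignment: 24 bytes per element.
struct PackedPoint {
    double m_x, m_y, m_z;
};

// Same members, but every instance is padded out to a 32-byte stride.
struct alignas(32) AlignedPoint {
    double m_x, m_y, m_z;
};

static_assert(sizeof(PackedPoint) == 24, "three 8-byte doubles");
static_assert(sizeof(AlignedPoint) == 32, "rounded up to the alignment");
static_assert(alignof(AlignedPoint) == 32, "requested alignment");

// Sums every coordinate of every point in the vector.
template <typename P>
double sum_all(const std::vector<P>& points) {
    double total = 0.0;
    for (const P& p : points) {
        total += p.m_x + p.m_y + p.m_z;
    }
    return total;
}

// Fills a vector with `count` points, sums them, and returns the elapsed
// time of the summation in milliseconds. The sum is written to total_out
// so the compiler cannot discard the loop.
template <typename P>
double time_sum_ms(std::size_t count, double& total_out) {
    std::vector<P> points(count, P{1.0, 2.0, 3.0});
    const auto start = std::chrono::steady_clock::now();
    total_out = sum_all(points);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling time_sum_ms<PackedPoint>(n, total) and time_sum_ms<AlignedPoint>(n, total) with the same large n gives the two timings to compare. The aligned version uses 32 bytes per element instead of 24, so it also touches a third more memory for the same number of points.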
So my second question is the same as my first one: why do I need alignas?