I wish to ensure SSE is used for arithmetic on my 3D (96-bit) float vectors. However, I have read conflicting views on just what is necessary.
Some articles/posts say I need to use a 4D vector and "ignore" the 4th element, some say I must decorate my class with things like __declspec(align(16)) and overload the new operator, and some say the compiler is clever enough to align things for me (I really hope this is true!).
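For concreteness, my understanding of the first two suggestions is roughly the sketch below. The names Vec3Padded and add are hypothetical, made up for illustration, and I have used the standard alignas(16) in place of the MSVC-specific __declspec(align(16)); it assumes an x86 target with SSE available.

```cpp
#include <cassert>
#include <cstdint>
#include <xmmintrin.h>  // SSE intrinsics: __m128, _mm_load_ps, _mm_add_ps, _mm_store_ps

// Hypothetical padded 3D vector: three used floats plus one ignored lane,
// forced onto a 16-byte boundary so aligned SSE loads/stores are legal.
struct alignas(16) Vec3Padded {
    float v[4];  // v[0..2] = x, y, z; v[3] is the dummy 4th element
};

static_assert(sizeof(Vec3Padded) == 16, "padded vector should fill one SSE register");
static_assert(alignof(Vec3Padded) == 16, "alignas(16) should give 16-byte alignment");

// Component-wise addition using one SSE add over all four lanes.
inline Vec3Padded add(const Vec3Padded& a, const Vec3Padded& b) {
    // _mm_load_ps requires 16-byte-aligned addresses; check that in debug builds.
    assert(reinterpret_cast<std::uintptr_t>(a.v) % 16 == 0);
    assert(reinterpret_cast<std::uintptr_t>(b.v) % 16 == 0);

    Vec3Padded r;
    __m128 va = _mm_load_ps(a.v);
    __m128 vb = _mm_load_ps(b.v);
    _mm_store_ps(r.v, _mm_add_ps(va, vb));  // the dummy lane is added too and simply ignored
    return r;
}
```

This costs 16 bytes per vector instead of 12, and the dummy lane takes part in every operation, which is exactly the kind of overhead and awkwardness I would like to avoid if it is no longer needed.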
I am using the Eigen library, but find that the "unsupported" AlignedVector3 class isn't fit for purpose (e.g. division by zero errors when doing component-wise division, lpNorm function includes the dummy 4th element).
A lot of the articles I've read are several years old now, so I hold out hope that modern compilers/SSE versions/CPUs can just align the data for me, or work with data that is not 16-byte aligned. Any up-to-date knowledge on this will be much appreciated!