I think this is a cross-language duplicate of Is `reinterpret_cast`ing between hardware vector pointer and the corresponding type an undefined behavior?.
As I explained over there, Intel defined the C/C++ intrinsics API such that loadu
/ storeu
can safely dereference an under-aligned pointer, and that it's safe to create such pointers, even though it's UB in ISO C++ even to create under-aligned pointers. (Thus implementations that provide the intrinsics API must define the behaviour).
The Rust version should work identically. Implementations that provide it must make it safe to create under-aligned __m128i*
pointers, as long as you don't dereference them "manually".
The other API-design option would be to have another version of the type that doesn't imply 16-byte alignment, like a __m128i_u
or something. GNU C does this with their native vector syntax, but that's way off topic for Rust.