In a SIMD tutorial I found the following code snippet.
#include <xmmintrin.h> // SSE intrinsics: __m128, _mm_sqrt_ps, _mm_store_ps

void simd(float* a, int N)
{
    // We assume N % 4 == 0.
    int nb_iters = N / 4;
    __m128* ptr = reinterpret_cast<__m128*>(a); // (*)
    for (int i = 0; i < nb_iters; ++i, ++ptr, a += 4)
        _mm_store_ps(a, _mm_sqrt_ps(*ptr));
}
My question: is the line marked (*) undefined behaviour? I ask because of the following rule from cppreference (https://en.cppreference.com/w/cpp/language/reinterpret_cast):
Whenever an attempt is made to read or modify the stored value of an object of type DynamicType through a glvalue of type AliasedType, the behavior is undefined unless one of the following is true:
- AliasedType and DynamicType are similar.
- AliasedType is the (possibly cv-qualified) signed or unsigned variant of DynamicType.
- AliasedType is std::byte (since C++17), char, or unsigned char: this permits examination of the object representation of any object as an array of bytes.
How can undefined behaviour be avoided in this case? I'm aware that I could use std::memcpy, but wouldn't the performance penalty make the SIMD version useless, or am I wrong about this?
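
To make the alternatives concrete, here is a minimal sketch of two variants that never read the float array through a `__m128` glvalue. The function names `simd_memcpy` and `simd_load` are hypothetical. The second variant, which uses the load intrinsic `_mm_loadu_ps`, is not from the tutorial and is added here only for comparison. Both use unaligned access so that no 16-byte alignment of `a` has to be assumed, whereas the tutorial code uses the aligned `_mm_store_ps`.

```cpp
#include <xmmintrin.h> // SSE intrinsics: __m128, _mm_loadu_ps, _mm_sqrt_ps, _mm_storeu_ps
#include <cstring>     // std::memcpy

// Hypothetical variant 1: move the bytes with std::memcpy.
// Assumes N % 4 == 0, as in the tutorial snippet.
void simd_memcpy(float* a, int N)
{
    for (int i = 0; i < N; i += 4) {
        __m128 v;
        std::memcpy(&v, a + i, sizeof v); // copy 4 floats into a real __m128 object
        v = _mm_sqrt_ps(v);               // element-wise square root
        std::memcpy(a + i, &v, sizeof v); // copy the 4 results back
    }
}

// Hypothetical variant 2: use the load/store intrinsics, which take float*.
// Assumes N % 4 == 0; makes no assumption about the alignment of a.
void simd_load(float* a, int N)
{
    for (int i = 0; i < N; i += 4)
        _mm_storeu_ps(a + i, _mm_sqrt_ps(_mm_loadu_ps(a + i)));
}
```

Optimizing compilers commonly turn a small fixed-size `std::memcpy` like this into a single vector move, so the two variants may well compile to similar code, but that is an assumption worth checking in the generated assembly for a given compiler and flags.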