
I am trying to reinterpret an __int16 array as an __m128i value. Casting an __m128i to an __int16 array works fine, but the reverse direction fails. My example code:

void example() {
   __m128i v = _mm_set_epi16(1, 2, 3, 4, 5, 6, 7, 8);
   __int16 *p_i = (__int16 *)&v; 
   for (int i = 0; i < 8; i++)
       std::cout <<p_i[i] << " "; // 8 7 6 5 4 3 2 1
   std::cout << "\n";

   __int16 i2[8] = {1, 2, 3, 4, 5, 6, 7, 8};
   __m128i *p_v2 = (__m128i *) i2;
   std::cout << __m128i_toString<__int16>(p_v2[0])<< "\n"; //error here
}

__m128i_toString<>() from this

What did I miss?

Stepan Loginov
  • What error do you get? It works fine for me (after I change `__int16` to `int16_t` and add the necessary `#include`s). – Paul R Mar 22 '16 at 17:42
  • It's a runtime error. "unhandled exception at '0x000488d9' in IntelHi.exe: 0xC000005: access violation when reading '0xfffffff'" – Stepan Loginov Mar 22 '16 at 18:22
  • Oh - probably just alignment then - try aligning your data - add `__attribute__((aligned(16)))` to your `__int16` array declaration. – Paul R Mar 22 '16 at 18:25
  • Looks like the Intel compiler doesn't support `__attribute__`. I will try to find an equivalent. – Stepan Loginov Mar 22 '16 at 18:32
  • `__declspec(align(16))` looks equivalent. I added it before the `__int16` array declaration and it works. Thanks a lot. But I don't understand what I am doing now. – Stepan Loginov Mar 22 '16 at 18:41
  • https://msdn.microsoft.com/en-us/library/83ythb65.aspx – Stepan Loginov Mar 22 '16 at 18:44
  • I think the Intel compiler supports `__attribute__` on Linux, but maybe you're using the Windows version? `__declspec(align(16))` serves much the same purpose, but it's mainly a Windows thing. – Paul R Mar 22 '16 at 23:11

1 Answer


In C++11, you can use alignas(16) int16_t i2[8] = ... to get 16-byte alignment in a portable way, without any compiler-specific extensions like __attribute__((aligned(16))) or __declspec(align(16)).
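As a rough illustration of the alignas approach, here is a minimal sketch, assuming an x86 target with SSE2 and a C++11 compiler. The helper names `is_aligned16` and `round_trip_aligned` are made up for this example and do not come from the question.

```cpp
#include <emmintrin.h>  // SSE2: __m128i, _mm_load_si128, _mm_storeu_si128
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true if p sits on a 16-byte boundary,
// which the aligned load _mm_load_si128 requires.
inline bool is_aligned16(const void *p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Hypothetical demo: load an alignas(16) array into a vector and copy it back.
inline bool round_trip_aligned() {
    alignas(16) std::int16_t i2[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    assert(is_aligned16(i2));  // guaranteed by alignas(16)

    // Aligned load: safe only because of the alignas(16) above.
    __m128i v = _mm_load_si128(reinterpret_cast<const __m128i *>(i2));

    // Unaligned store: has no alignment requirement on the destination.
    std::int16_t out[8];
    _mm_storeu_si128(reinterpret_cast<__m128i *>(out), v);

    return std::memcmp(out, i2, sizeof out) == 0;
}
```

If the alignment of the array cannot be controlled (for example, it comes from a caller), `_mm_loadu_si128` accepts any address and avoids the fault.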

See the code on godbolt compiled with alignas.

Note that you should generally avoid aliasing __m128i with short integer arrays of the same length. Getting data into vectors that way (scalar stores to the array followed by one 16-byte vector load) causes stalls from failed store-forwarding. Doing horizontal operations by storing a vector to an array and then processing it with scalar code is also slow compared to staying in SIMD.
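One way to avoid the aliasing is to move data with the load/store intrinsics and keep horizontal work in registers. The following is only a sketch, assuming SSE2; `load8` and `hsum_epi16` are hypothetical names chosen for illustration, and the sum wraps modulo 2^16 like any 16-bit addition.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cassert>
#include <cstdint>

// Hypothetical helper: load eight int16_t values from any address.
// _mm_loadu_si128 has no alignment requirement.
inline __m128i load8(const std::int16_t *p) {
    assert(p != nullptr);
    return _mm_loadu_si128(reinterpret_cast<const __m128i *>(p));
}

// Hypothetical helper: horizontal sum of the eight 16-bit lanes,
// done with byte shifts and adds instead of a store plus scalar loop.
inline std::int16_t hsum_epi16(__m128i v) {
    v = _mm_add_epi16(v, _mm_srli_si128(v, 8));  // fold upper 4 lanes onto lower 4
    v = _mm_add_epi16(v, _mm_srli_si128(v, 4));  // fold 4 partial sums into 2
    v = _mm_add_epi16(v, _mm_srli_si128(v, 2));  // fold 2 partial sums into lane 0
    return static_cast<std::int16_t>(_mm_extract_epi16(v, 0));
}
```

For example, calling `hsum_epi16(load8(a))` on an array `a` holding 1 through 8 yields 36.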

Using _mm_set_epi16() will probably lead to better code, because the compiler doesn't have to optimize away the actual array and pointer operations. In this case, it was able to (clang just does a movaps from a read-only constant, with no storing to an array first). If the initializer wasn't a compile-time constant, you might not get such good results.
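A small sketch of the lane ordering may help here, since it also explains the "8 7 6 5 4 3 2 1" output in the question: `_mm_set_epi16` takes its arguments from the highest lane down, while `_mm_setr_epi16` takes them in memory order. This assumes SSE2; `lowest_lane` and `demo_lane_order` are hypothetical names used only for this example.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: value of lane 0, the lane stored at the lowest address.
inline int lowest_lane(__m128i v) {
    return _mm_extract_epi16(v, 0);
}

// Hypothetical demo of argument order for the two set intrinsics.
inline void demo_lane_order() {
    const std::int16_t expected[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    std::int16_t got[8];

    // _mm_setr_epi16: first argument goes to lane 0, so memory order matches the array.
    _mm_storeu_si128(reinterpret_cast<__m128i *>(got),
                     _mm_setr_epi16(1, 2, 3, 4, 5, 6, 7, 8));
    assert(std::memcmp(got, expected, sizeof got) == 0);

    // _mm_set_epi16: last argument goes to lane 0, so lane 0 holds 8 here.
    assert(lowest_lane(_mm_set_epi16(1, 2, 3, 4, 5, 6, 7, 8)) == 8);
}
```

So to build a vector whose in-memory layout matches `{1, 2, 3, 4, 5, 6, 7, 8}`, `_mm_setr_epi16` is the direct replacement for the array-and-cast version.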

Peter Cordes