C++: initialization of intel SIMD intrinsics class members

Question

I don't understand why the commented and uncommented lines below don't behave the same way (Linux, GCC, with the C++11 flag enabled):

#include "immintrin.h"


typedef __m256 floatv;

struct floatv2{
public:

    //floatv2(const float f):x(_mm256_setzero_ps() + f ), y(_mm256_setzero_ps() + f ) {}; // succeeds
    floatv2(const float f):x{_mm256_setzero_ps() + f }, y{_mm256_setzero_ps() + f } {}; // fails

//private:
    floatv x, y;
};

When trying to compile the uncommented line I get the following error:

error: cannot convert ‘__m256 {aka __vector(8) float}’ to ‘float’ in initialization

which I don't understand because x and y are floatv, not float, so no conversion should be required...

Also, in some more complex code, the first version produces a memory access violation. Is there something nasty going on behind the scenes?

PS: above the definition of __m256, in avxintrin.h, there is the following comment:

/* The Intel API is flexible enough that we must allow aliasing with other
   vector types, and their scalar components.  */

I don't understand what this means, but feel like it could be related :)

Many thanks

Looks like a bug. BTW, since you're already using _mm256_setzero_ps(), there is no reason to add f; just broadcast directly with `_mm256_set1_ps()` or `_mm256_broadcast_ss()`. The only reason to do it the way you're doing it is to avoid using an intrinsic: define `__m256 zero={}` and do `zero + f`. http://stackoverflow.com/questions/21727331/implict-simd-sse-avx-broadcasts-with-gcc – Z boson, Apr 15 '14 at 06:46
Good point; I will definitely use _mm256_set1_ps(f) instead. However, the issue persists with _mm256_set1_ps. Will report to GCC bugzilla. Many thanks guys – GHL, Apr 17 '14 at 22:31

score 2 · Accepted Answer · answered Apr 18 '14 at 06:11

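This is related to DR 1467: before that defect report was resolved, list-initialization syntax could not be used to copy an aggregate. GCC treats the vector as an aggregate of floats, so x{v} tried to initialize the first float element from the whole vector, hence the "cannot convert __m256 to float" error. This was recently fixed for classes in GCC, and I extended the fix to vectors in r209449. GCC 4.10 compiles your code.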

Marc Glisse

score 0 · Answer 2 · answered Apr 16 '14 at 07:21

Probably _mm256_setzero_ps() + f returns a float and not a floatv, because f is a float. If so, you can't initialize floatv members (x and y) from it using {}, because {}-initialization doesn't allow narrowing (implicit conversion).

Maybe

x{static_cast<__m256>(_mm256_setzero_ps() + f) }

will work.

chiarfe

Hi, this unfortunately fails as well, and x{_mm256_set1_ps(f)} and x{static_cast<__m256>(_mm256_set1_ps(f))} fail too... – GHL Apr 17 '14 at 22:34
