I don't understand why the commented-out line and the uncommented line don't behave the same way (GCC on Linux, with the C++11 flag enabled):

#include <immintrin.h>


typedef __m256 floatv;

struct floatv2{
public:

    //floatv2(const float f):x(_mm256_setzero_ps() + f ), y(_mm256_setzero_ps() + f ) {}; // succeeds
    floatv2(const float f):x{_mm256_setzero_ps() + f }, y{_mm256_setzero_ps() + f } {}; // fails

//private:
    floatv x, y;
};

When trying to compile the uncommented line I get the following error:

error: cannot convert ‘__m256 {aka __vector(8) float}’ to ‘float’ in initialization

which I don't understand because x and y are floatv, not float, so no conversion should be required...

Also, in some more complex code, the first version produces a memory access violation. Is there something nasty going on behind the scenes?

PS: above the definition of __m256, in avxintrin.h, there is the following comment:

/* The Intel API is flexible enough that we must allow aliasing with other
   vector types, and their scalar components.  */

I don't understand what this means, but feel like it could be related :)

Many thanks

GHL
  • Bug, please report to gcc's bugzilla. – Marc Glisse Apr 14 '14 at 21:45
  • Looks like a bug. BTW, since you're already using _mm256_setzero_ps(), there is no reason to add f; just use a broadcast directly with `_mm256_set1_ps()` or with `_mm256_broadcast_ps()`. The only reason to do it the way you're doing it is so that you don't have to use an intrinsic: define `__m256 zero={}` and do `zero + f`. http://stackoverflow.com/questions/21727331/implict-simd-sse-avx-broadcasts-with-gcc – Z boson Apr 15 '14 at 06:46
  • Good point; I will definitely use _mm256_set1_ps(f) instead. However, the issue is still here using _mm256_set1_ps. Will report to GCC bugzilla. Many thanks guys – GHL Apr 17 '14 at 22:31

2 Answers

This is related to DR 1467: before it, the list-initialization syntax could not be used to copy an aggregate. This was recently fixed for classes in GCC, and I extended the fix to vectors in r209449. GCC 4.10 compiles your code.
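The class case of DR 1467 can be illustrated without any intrinsics. The following is only a minimal sketch: `Pair` and `Holder` are hypothetical names standing in for the vector type and the `floatv2` struct from the question, and the code assumes a compiler that already implements the DR 1467 resolution. On a compiler that predates it, `p{src}` is treated as aggregate initialization, so the compiler tries to initialize the first `float` member from a whole `Pair`, which is the same shape of error as the one reported above.

```cpp
// Hypothetical aggregate standing in for a vector type such as __m256.
struct Pair {
    float lo;
    float hi;
};

// Hypothetical wrapper mirroring the shape of floatv2 in the question.
struct Holder {
    // Brace-initializes the member from an object of the same aggregate type.
    // With the DR 1467 resolution this is a plain copy of src.
    // Without it, the braces mean aggregate initialization, i.e. an attempt
    // to initialize p.lo (a float) from src (a Pair), which does not compile.
    explicit Holder(const Pair& src) : p{src} {}

    Pair p;
};
```

Writing the initializer with parentheses, `p(src)`, avoids the issue on older compilers, which matches the behaviour of the commented-out constructor in the question.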

Marc Glisse

Probably _mm256_setzero_ps() + f returns a float and not a floatv, because f is a float. So you can't initialize floatv values (x and y) with a float using { }, because {}-initialization doesn't allow narrowing (implicit conversion).

Maybe

x{static_cast<__m256>(_mm256_setzero_ps() + f) }

will work.

chiarfe
  • Hi, this unfortunately fails as well, and x{_mm256_set1_ps(f)} and x{static_cast<__m256>(_mm256_set1_ps(f))} fail too... – GHL Apr 17 '14 at 22:34