Let's assume I have defined __m256d x and that I want to extract the lower 128 bits. I would do:

__m128d xlow = _mm256_castpd256_pd128(x);

However, I recently saw someone do:

__m128d xlow = (__m128d) x;

Is there a preferred method to use for the cast? Why use the first method?

Mysticial
Z boson
  • The second one is not standard*. It looks like a GCC extension. MSVC rejects it. (*And by "not standard" I mean Intel does not specify it in the intrinsics specification. All of this is already non-standard as far as C/C++ goes.) – Mysticial Dec 05 '13 at 17:08
  • Does that cast work with gcc? The doc explicitly says it doesn't, so it looks like a bug. – Marc Glisse Dec 05 '13 at 17:11
  • Thanks Mysticial. That's what I suspected. I saw it used at [fastest-way-to-do-horizontal-vector-sum-with-avx-instructions](http://stackoverflow.com/questions/9775538/fastest-way-to-do-horizontal-vector-sum-with-avx-instructions/20389775#20389775) – Z boson Dec 05 '13 at 19:20
  • @MarcGlisse, I have not tested the second case with any compiler. I just saw it used in the answer to the question above. – Z boson Dec 05 '13 at 19:29
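
For reference, here is a minimal sketch of the intrinsic form from the question, the one the comments describe as the specified way to do this. The helper name `low_half` and the `AVX_TARGET` macro are made up for illustration. The macro only lets the function build on GCC and Clang without passing `-mavx`; on other compilers it expands to nothing and AVX has to be enabled through the compiler's own options. Running it assumes a CPU with AVX.

```c
#include <immintrin.h>

/* AVX_TARGET is a hypothetical convenience macro, not part of any API. */
#if defined(__GNUC__)
#define AVX_TARGET __attribute__((target("avx")))
#else
#define AVX_TARGET
#endif

/* Hypothetical helper: copy the two low doubles of a 4-double vector to out. */
AVX_TARGET
static void low_half(const double in[4], double out[2])
{
    __m256d x = _mm256_loadu_pd(in);          /* unaligned load of 4 doubles */
    __m128d xlow = _mm256_castpd256_pd128(x); /* reinterpret the low 128 bits */
    _mm_storeu_pd(out, xlow);                 /* unaligned store of 2 doubles */
}
```

Calling `low_half` with the input `{1.0, 2.0, 3.0, 4.0}` should leave `1.0` and `2.0` in `out`. The C-style cast `(__m128d) x` is left out on purpose, since the comments above report that MSVC rejects it and question whether GCC accepts it.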

0 Answers