I am trying to create a constant __m256d variable with all bits set to one. I saw the post "Fastest way to set __m256 value to all ONE bits", but it only covers __m256i and __m256, not __m256d. Thank you for your help.

Antonin GAVREL
Samantha92
  • 15
  • 2
  • 1
    What exactly do you mean? Do you want 4 times `1.0` ? Because Intel generally doesn't care that much what bits mean when they're copied. Broadcasting 64 bits works the same for `long long` and `double` – MSalters Mar 23 '21 at 16:44
  • 1
    The `__m256d` case is just a slightly different cast from the `__m256` case; I didn't think it was worth mentioning in my answer there because once you understand how/why the `__m256` version works, the `__m256d` version is straightforward. `_mm256_cast*` intrinsics just reinterpret the bits of a vector, aka type-pun. – Peter Cordes Mar 23 '21 at 17:15

1 Answer

Set all the bits to one in a __m256i, as you did, and then cast the result to __m256d. The cast only reinterprets the bits; it does not convert the values:

__m256i a = _mm256_set1_epi64x(-1);
__m256d b = _mm256_castsi256_pd(a);

Or simply:

__m256d b = _mm256_castsi256_pd(_mm256_set1_epi64x(-1));
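
To check that the cast really preserves the bit pattern, here is a minimal sketch that builds the vector and copies its raw bits out for inspection. The helper name all_ones_pd_bits is hypothetical (it is not part of any library), and the target attribute is a GCC/Clang extension assumed here so the function compiles without -mavx; with MSVC or with -mavx on the command line it would be dropped.

```c
#include <immintrin.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the original post): builds the all-ones
   __m256d and copies its raw bits into out[0..3] for inspection.
   The target attribute is a GCC/Clang extension that enables AVX for this
   one function; it is an assumption about the toolchain. */
__attribute__((target("avx")))
void all_ones_pd_bits(uint64_t out[4])
{
    /* -1 as a 64-bit integer is 0xFFFFFFFFFFFFFFFF in every lane. */
    __m256i ones_i = _mm256_set1_epi64x(-1);

    /* Reinterpret the same 256 bits as four doubles; no conversion happens. */
    __m256d ones_d = _mm256_castsi256_pd(ones_i);

    /* Store the four lanes, then copy their bytes into the integer array.
       memcpy avoids strict-aliasing problems when reading double bits. */
    double lanes[4];
    _mm256_storeu_pd(lanes, ones_d);
    memcpy(out, lanes, sizeof lanes);
}
```

Calling all_ones_pd_bits on a four-element uint64_t array should leave every element equal to UINT64_MAX. Read as a double, each lane is a NaN, so this value is meant to be used as a bit mask (for example with _mm256_and_pd or _mm256_xor_pd), not as a number.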
Paul R
Antonin GAVREL