Unfortunately, AVX2 does not have mask registers. What should I do if I want to get the effect of `_mm256_mask_add_ps`? Is there a way to do it without unpacking the mask into a vector of epi8 elements or the like?

bumpbump
  • Get your mask as a vector, and `_mm256_and_ps` it with one of the inputs to the addition ([is there an inverse instruction to the movemask instruction in intel avx2?](https://stackoverflow.com/q/36488675)). If you're generating the mask with `_mm256_cmp_ps` near where you use it, don't `_mm256_movemask_ps` in the first place; just use the SIMD compare mask directly. – Peter Cordes Jun 16 '21 at 08:28
  • And no, there's no way to use an integer bitmap to control a vector operation until AVX-512. As you say, that's the whole point of mask registers, a new capability in AVX-512. – Peter Cordes Jun 16 '21 at 08:29
  • Also, found a 128-bit example of compare / AND / add: [Trying to add an \_\_m128 using an and mask in SSE programming](https://stackoverflow.com/q/15808565) – Peter Cordes Jun 16 '21 at 09:06
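For illustration, here is a minimal sketch of what the comments describe, assuming GCC or Clang (the `target("avx2")` attribute is compiler-specific). The function names `bitmap_to_mask_ps`, `masked_add_ps_avx2` and `add_where_greater_avx2` are hypothetical helpers, not part of any library. The first pair expands an 8-bit bitmap into a vector mask and blends, which mimics the merge behaviour of `_mm256_mask_add_ps`; the second skips the bitmap entirely and ANDs the compare result into one addend.

```c
#include <immintrin.h>
#include <stdint.h>

/* Expand an 8-bit bitmap into a vector mask:
 * lane i becomes all-ones if bit i of k is set, otherwise all-zeros. */
__attribute__((target("avx2")))
static inline __m256 bitmap_to_mask_ps(uint8_t k)
{
    const __m256i lane_bit = _mm256_setr_epi32(1, 2, 4, 8, 16, 32, 64, 128);
    __m256i bits = _mm256_and_si256(_mm256_set1_epi32(k), lane_bit);
    return _mm256_castsi256_ps(_mm256_cmpeq_epi32(bits, lane_bit));
}

/* dst[i] = (bit i of k) ? a[i] + b[i] : src[i]
 * All pointers refer to 8 floats; unaligned loads/stores are used. */
__attribute__((target("avx2")))
void masked_add_ps_avx2(float *dst, const float *src, uint8_t k,
                        const float *a, const float *b)
{
    __m256 sum  = _mm256_add_ps(_mm256_loadu_ps(a), _mm256_loadu_ps(b));
    __m256 mask = bitmap_to_mask_ps(k);
    /* blendv picks the second operand where the mask's sign bit is set. */
    _mm256_storeu_ps(dst, _mm256_blendv_ps(_mm256_loadu_ps(src), sum, mask));
}

/* dst[i] = (a[i] > threshold) ? a[i] + b[i] : a[i]
 * The compare result is used directly as the mask, with no movemask step. */
__attribute__((target("avx2")))
void add_where_greater_avx2(float *dst, const float *a, const float *b,
                            float threshold)
{
    __m256 va   = _mm256_loadu_ps(a);
    __m256 mask = _mm256_cmp_ps(va, _mm256_set1_ps(threshold), _CMP_GT_OQ);
    /* Zero out b in the lanes that failed the compare, then add. */
    __m256 vb   = _mm256_and_ps(_mm256_loadu_ps(b), mask);
    _mm256_storeu_ps(dst, _mm256_add_ps(va, vb));
}
```

The AND variant only reproduces the case where the pass-through value is `a` itself, and adding `+0.0f` can change the sign of a `-0.0f` input. When a separate `src` is needed, the blend version is the closer match.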

0 Answers