Unfortunately, AVX2 does not have mask registers. What should I do if I want to accomplish `_mm256_mask_add_ps`? Is there a way to do it without unpacking said mask into epi8 registers or the like?
bumpbump
- Get your mask a vector, and `_mm256_and_ps` one of the inputs to addition. ([is there an inverse instruction to the movemask instruction in intel avx2?](https://stackoverflow.com/q/36488675)). If you're generating the mask with `_mm256_cmp_ps` near where you're using it, don't `_mm256_movemask_ps` in the first place, just use the SIMD compare mask. – Peter Cordes Jun 16 '21 at 08:28
- And no, there's no way to use an integer bitmap to control a vector operation until AVX-512; as you say that's the whole point of mask registers, a new capability in AVX-512. – Peter Cordes Jun 16 '21 at 08:29
- Typo: "Get your mask *as* a vector"... Also, found a 128-bit example of compare / AND / add: [Trying to add an \_\_m128 using an and mask in SSE programming](https://stackoverflow.com/q/15808565) – Peter Cordes Jun 16 '21 at 09:06
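The suggestions in the comments can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation. The helper names (`mask_from_bits`, `add_where`, `mask_add_blend`, `masked_add8`, `masked_add8_src`) and the `TARGET_AVX2` macro are hypothetical and chosen only for this example. It assumes an 8-bit bitmap with one bit per float lane. If the mask already comes from `_mm256_cmp_ps`, the `mask_from_bits` step is unnecessary and the compare result can be used directly.

```c
#include <immintrin.h>
#include <stdint.h>

/* Assumption: on GCC/Clang this attribute lets the file build without -mavx2.
   On other compilers, enable AVX2 through the compiler options instead. */
#if defined(__GNUC__) || defined(__clang__)
#define TARGET_AVX2 __attribute__((target("avx2")))
#else
#define TARGET_AVX2
#endif

/* Expand an 8-bit bitmap into a vector mask:
   lane i becomes all-ones if bit i of k is set, all-zeros otherwise. */
TARGET_AVX2 static inline __m256 mask_from_bits(uint8_t k) {
    const __m256i bit = _mm256_setr_epi32(1, 2, 4, 8, 16, 32, 64, 128);
    __m256i bcast = _mm256_set1_epi32(k);                 /* k in every lane */
    __m256i hit = _mm256_cmpeq_epi32(_mm256_and_si256(bcast, bit), bit);
    return _mm256_castsi256_ps(hit);
}

/* The AND approach from the comments: zero the unselected lanes of b,
   then add. Unselected lanes compute a + 0.0f. */
TARGET_AVX2 static inline __m256 add_where(__m256 a, __m256 b, __m256 mask) {
    return _mm256_add_ps(a, _mm256_and_ps(b, mask));
}

/* Closer to _mm256_mask_add_ps(src, k, a, b): unselected lanes take src.
   _mm256_blendv_ps picks its second operand where the mask sign bit is set. */
TARGET_AVX2 static inline __m256 mask_add_blend(__m256 src, __m256 mask,
                                                __m256 a, __m256 b) {
    return _mm256_blendv_ps(src, _mm256_add_ps(a, b), mask);
}

/* Array wrappers so the helpers can be called from ordinary code. */
TARGET_AVX2 void masked_add8(float *dst, const float *a, const float *b,
                             uint8_t k) {
    __m256 va = _mm256_loadu_ps(a);
    __m256 vb = _mm256_loadu_ps(b);
    _mm256_storeu_ps(dst, add_where(va, vb, mask_from_bits(k)));
}

TARGET_AVX2 void masked_add8_src(float *dst, const float *src, const float *a,
                                 const float *b, uint8_t k) {
    __m256 vs = _mm256_loadu_ps(src);
    __m256 va = _mm256_loadu_ps(a);
    __m256 vb = _mm256_loadu_ps(b);
    _mm256_storeu_ps(dst, mask_add_blend(vs, mask_from_bits(k), va, vb));
}
```

For example, calling `masked_add8(out, a, b, 0x05)` adds `b` to `a` only in lanes 0 and 2 and leaves the other lanes equal to `a`. One difference between the two forms: the AND version computes `a + 0.0f` in unselected lanes, so a `-0.0f` input comes out as `+0.0f` under the default rounding mode, whereas the blend version passes `src` through unchanged.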