
After some operations I have an SSE register in which each of the 16 bytes has its lowest bit set if some condition was fulfilled, and is 0 otherwise. I would now like to extract this into a bitmask in which, for each of these 16 bytes, bit j is set iff byte j had the value one.

I searched the Intel Intrinsics Guide up and down but couldn't find how to do this. Pseudocode:

int _mm_???(__m128i a)

FOR j := 0 to 15
    i := j*8
    IF a[i]
        dst[j] := 1
    ELSE
        dst[j] := 0
    FI
ENDFOR
fschmitt
  • Left-shift by 7 bits (`_mm_slli_epi32` or whatever) to put your important bits at the top of each byte, then use `_mm_movemask_epi8`. Or use `_mm_cmpeq_epi8` instead of the shift. – Peter Cordes Nov 21 '19 at 08:30
  • `_mm_movemask_epi8` is what I have been searching for all evening. Thanks a lot. Do you want to change it from comment to answer so I can accept it? – fschmitt Nov 21 '19 at 08:48
  • Found and cleaned up some of the duplicates from previous times this has been asked. [Extract the low bit of each bool byte in a \_\_m128i? bool array to packed bitmap](//stackoverflow.com/q/49263507) had an interesting idea for reducing back-end bottlenecks if your byte values are 0 or 1. And the `__m64` MMX question also has an SSE2 answer and is a direct duplicate. Not easy to find; I found them by searching on the answer (`_mm_movemask_epi8`), so +1 for this question as a signpost for future searches. – Peter Cordes Nov 21 '19 at 09:19
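
For future readers, the two approaches suggested in the comments can be sketched roughly as follows. This is a minimal illustration, not code posted by anyone in the thread: the helper names `low_bits_to_mask`, `ones_to_mask_cmpeq` and `low_bits_to_mask_scalar` are made up for this example, and the compare variant assumes every byte is exactly 0 or 1.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: shift variant.
 * _mm_slli_epi32 shifts whole 32-bit lanes, so bits do cross byte
 * boundaries, but only upward: bit 7 of each byte ends up holding
 * bit 0 of that same byte. _mm_movemask_epi8 then collects bit 7 of
 * every byte into the low 16 bits of an int. */
static inline int low_bits_to_mask(__m128i v)
{
    return _mm_movemask_epi8(_mm_slli_epi32(v, 7));
}

/* Hypothetical helper: compare variant.
 * Assumes each byte is exactly 0 or 1. Bytes equal to 1 become 0xFF,
 * so their top bit is set for _mm_movemask_epi8 to pick up. */
static inline int ones_to_mask_cmpeq(__m128i v)
{
    return _mm_movemask_epi8(_mm_cmpeq_epi8(v, _mm_set1_epi8(1)));
}

/* Scalar reference that mirrors the pseudocode in the question. */
static inline int low_bits_to_mask_scalar(const uint8_t bytes[16])
{
    int mask = 0;
    for (int j = 0; j < 16; j++)
        mask |= (bytes[j] & 1) << j;
    return mask;
}
```

For example, with bytes 0, 3 and 15 set to 1 and all others 0, all three helpers should return 0x8009. The shift variant only looks at the low bit of each byte, whereas the compare variant treats any byte value other than 1 as false.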

0 Answers