PMOVMSKB does a really nice job of packing byte fields into bits.
However, I want to do the reverse.
I have a 16-bit bit field that I want to expand into an XMM register, one byte field per bit.
Preferably a set bit should set the MSB (0x80) of its byte field, but I can live with a set bit resulting in 0xFF in the byte field.
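To pin down the mapping I'm after, here is a minimal scalar sketch in C. The function name `expand_mask16` is made up for illustration, and I'm assuming bit i of the mask maps to byte i of the register (the mirror image of what PMOVMSKB produces). This is only a reference model of the desired result, not the SIMD code I'm looking for.

```c
#include <stdint.h>

/* Hypothetical reference model: expand a 16-bit mask into 16 bytes.
   Byte i becomes 0x80 if bit i of the mask is set, otherwise 0x00.
   Assumption: bit 0 maps to the lowest byte of the XMM register. */
static void expand_mask16(uint16_t mask, uint8_t out[16])
{
    for (int i = 0; i < 16; i++)
        out[i] = ((mask >> i) & 1) ? 0x80 : 0x00;
}
```

For example, a mask of 0x8005 (bits 0, 2 and 15 set) should leave 0x80 in bytes 0, 2 and 15 and zero everywhere else.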
I've seen the following option on https://software.intel.com/en-us/forums/intel-isa-extensions/topic/298374:
movd mm0, eax
punpcklbw mm0, mm0
pshufw mm0, mm0, 0x00
pand mm0, [mask8040201008040201h]
pcmpeqb mm0, [mask8040201008040201h]
However, this code only works with MMX registers and cannot be carried over directly to XMM registers, because pshufw only operates on MMX registers.
I know I can use PSHUFB, but that's SSSE3, and I would like SSE2-only code because it needs to work on any AMD64 system.
Is there a way to do this in pure SSE2 code?
No intrinsics please, just plain Intel x64 code.