Gather/Extract first Bit from Integer Array

Question

Problem

Does an instruction exist that gathers/extracts the first bit of an int[32] and stores it into a int?

I know about the intrinsic pext but that is not really what I want.
I do have a code for that but I thought maybe there a designated instruction.
The ints array is zero besides the first bit. Ergo, no masking needed.

Code

void ints2bits(int &bits, int *ints) {
   bits = (ints[0] << 0) + (ints[1] << 1) + ... + (ints[31] << 31);
}

UPDATE & FEEDBACK:

Just tested harold suggestions. It works very well and I can attain nice speed up.

If you want to extract first bit of all 31 integers of array and a store the bits at appropriate place in an int, then you code is wrong. — sameerkn, Aug 09 '16 at 05:48
It is not wrong. I literally use it. The question is only if there is a specific instruction for this operation. Masking of the array is not needed by the way, which is maybe the reason you claim its faultiness. — Armen Avetisyan, Aug 09 '16 at 07:39
Your code assumes that all the other bits in the ints are zero, you should probably state that explicitly in your problem specification. — samgak, Aug 10 '16 at 02:52
Why did you edit your question without accepting harold's answer? It's still the correct answer: vector left shift -> `movmskps` does exactly what you're asking for. — Peter Cordes, Aug 11 '16 at 07:21

score 2 · Accepted Answer · answered Aug 09 '16 at 12:01

There is no single instruction that can even read that much data, but groups of 4 (8 with AVX2) can be processed quickly with the use of _mm_movemask_ps. Ignore the fact that it claims to be a floating point instruction, it just gathers and appends the 4 top bits.

Of course moving the bottom bit into the top is easy with _mm_slli_epi32.

So putting it together (not tested)

int res = 0;
for (int i = 0; i < 32; i += 4) {
    __m128i x = _mm_load_si128((__m128i*)&ints[i]); // I assume it's aligned
    x = _mm_slli_epi32(x, 31);
    int bits = _mm_movemask_ps(_mm_castsi128_ps(x));
    res += bits << i;
}

The extension to AVX2 is pretty obvious.

An other possible approach is shifting each lane by a variable amount (pre-AVX2 this requires multiplication) and then summing, first vertically of course, saving the horizontal sum for last. This is likely slower, and certainly more awkward.

Gather/Extract first Bit from Integer Array

Problem

Code

UPDATE & FEEDBACK:

1 Answers1