I need to shift the top bit from each element of b into the bottom of corresponding elements of a, like AVX512VBMI2 _mm256_shldi_epi16/32/64 with a count of 1.

Does anyone know a way to do this kind of shift?

Example:

__m256i x = { 11001100, 00110011, 11001100, 00110011,... x16 }
__m256i y = { 10111100, 10001011, 11000010, 01100111,... x16 }
__m256i res = _mm256_shldi_epi16(x, y, 1);

Then res contains:

10011001, 01100111, 10011001, 01100110, ...x16

(editor's note: the question previously described this as _mm256_sllv_epi8. sllv is a variable-count shift where the count for each element comes from the corresponding element in the other source, and is nothing like a double-shift.)
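To make the distinction in the note concrete, here is a minimal scalar sketch of a single 16-bit lane for each operation. The helper names `shld_u16` and `sllv_u16` are hypothetical and exist only for illustration; the sketch assumes the double shift takes its count modulo 16 and that a variable shift with a count above 15 yields zero.

```c
#include <stdint.h>

/* Double shift (shld-style), one 16-bit lane: concatenate a:b,
   shift left by n, keep the upper 16 bits. Bits of b enter at the bottom. */
static uint16_t shld_u16(uint16_t a, uint16_t b, unsigned n) {
    n &= 15;                                    /* count taken modulo 16 */
    uint32_t concat = ((uint32_t)a << 16) | b;  /* a in the high half, b in the low half */
    return (uint16_t)((concat << n) >> 16);
}

/* Variable shift (sllv-style), one 16-bit lane: the second operand is
   only a shift count; none of its bits end up in the result. */
static uint16_t sllv_u16(uint16_t a, uint16_t count) {
    return count > 15 ? 0 : (uint16_t)(a << count);
}
```

With a count of 1, `shld_u16(0x8001, 0x8000, 1)` pulls the top bit of the second operand into bit 0, while `sllv_u16(0x8001, 1)` just shifts in a zero.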

Peter Cordes
yatsukino
    there's a workaround for the [shift by one](https://stackoverflow.com/q/35002937/995714) case – phuclv Jul 23 '18 at 10:08

1 Answer

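Apparently the task is to shift each byte of a left by 1 while shifting in the top bit of the corresponding byte of b, like a tiny funnel shift with a fixed distance of 1. AVX2 has no 8-bit shifts, but a left shift by 1 can be done with a byte addition (a + a). The top bit of each byte of b is then moved down to bit 0 with a 16-bit right shift by 7, masked with 1 to discard the bits that crossed over from the neighboring byte, and ORed into the result: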

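#include <immintrin.h>

__m256i funnel_left1_epi8(__m256i a, __m256i b) {
    // a + a == a << 1 within each byte
    __m256i a2 = _mm256_add_epi8(a, a);
    // bring the top bit of each byte of b down to bit 0; the mask drops bits from the neighboring byte
    __m256i bit_from_b = _mm256_and_si256(_mm256_srli_epi16(b, 7), _mm256_set1_epi8(1));
    return _mm256_or_si256(a2, bit_from_b);
}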
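
As a sanity check, the per-byte behaviour can be modelled in plain scalar C and compared against the sample values from the question. This is only a sketch: `funnel_left1_u8` and `check_question_example` are hypothetical helper names, and the hex constants are the question's first four binary bytes written out by hand.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of one byte lane: shift a left by 1 and
   shift in the top bit of b. */
static uint8_t funnel_left1_u8(uint8_t a, uint8_t b) {
    return (uint8_t)((a << 1) | (b >> 7));
}

/* Checks the four sample bytes from the question; returns 1 if all match. */
static int check_question_example(void) {
    const uint8_t x[4]    = {0xCC, 0x33, 0xCC, 0x33}; /* 11001100, 00110011, ... */
    const uint8_t y[4]    = {0xBC, 0x8B, 0xC2, 0x67}; /* 10111100, 10001011, 11000010, 01100111 */
    const uint8_t want[4] = {0x99, 0x67, 0x99, 0x66}; /* 10011001, 01100111, 10011001, 01100110 */
    for (size_t i = 0; i < 4; i++) {
        if (funnel_left1_u8(x[i], y[i]) != want[i]) {
            return 0;
        }
    }
    return 1;
}
```

The same reference can be run over all 32 lanes of a stored vector to compare against the output of the AVX2 version on a machine that supports it.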
harold