I have a vector int16_t beta = {1,1,0,0,0,0,0,0}.
I want to implement this equation with AVX2:
c[i] = a[i] + (-1)^beta[i] * b[i]
where a, b, c, and beta are all AVX2 vectors of int16_t.
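For reference, the formula can be written out as a plain scalar loop. This is only a sketch to pin down the intended semantics: the function name addsub_ref is hypothetical, and it assumes every beta[i] is either 0 or 1 and that overflow should wrap around, as the AVX2 16-bit integer adds do.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar reference: c[i] = a[i] + (-1)^beta[i] * b[i], with beta[i] in {0, 1}.
   The arithmetic is done in uint16_t so that overflow wraps (assumption:
   wrapping is the wanted behaviour, matching the SIMD integer adds). */
void addsub_ref(const int16_t *a, const int16_t *b, const int16_t *beta,
                int16_t *c, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        uint16_t ua = (uint16_t)a[i];
        uint16_t ub = (uint16_t)b[i];
        /* beta[i] == 1 gives (-1)^1 = -1, so subtract; otherwise add. */
        uint16_t r = beta[i] ? (uint16_t)(ua - ub) : (uint16_t)(ua + ub);
        /* The conversion back to int16_t wraps on mainstream compilers. */
        c[i] = (int16_t)r;
    }
}
```

Any vectorised version should produce the same c as this loop for the same inputs.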
I have figured out that if I can map 1 to -32768 (only the sign bit set), the multiplication can be avoided: the sign of vector b can then be flipped using the OR and NEGATE functions of the SIMD intrinsics.
I know that 1 can be mapped to -32768 with a left shift (by 15 bits); however, as far as I can tell AVX2 doesn't have any bit shift operations [1]. Is there any way to efficiently map 1 to -32768 with SIMD?
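One way the mapping idea could look, using the 16-bit left shift named in the editor's footnote, is sketched here. It is not a definitive implementation: the function name addsub_sign16 and the AVX2_FN macro are hypothetical (the macro only lets GCC or Clang build the function without -mavx2), it processes exactly 16 lanes, and it assumes beta holds only 0 or 1. The OR with 1 is needed because _mm256_sign_epi16 zeroes any lane whose control value is 0.

```c
#include <immintrin.h>
#include <stdint.h>

/* Convenience (assumption): enable AVX2 per function on GCC/Clang. */
#if defined(__GNUC__) || defined(__clang__)
#define AVX2_FN __attribute__((target("avx2")))
#else
#define AVX2_FN
#endif

/* Hypothetical helper: c = a + (-1)^beta * b for 16 int16_t lanes. */
AVX2_FN void addsub_sign16(const int16_t *a, const int16_t *b,
                           const int16_t *beta, int16_t *c)
{
    __m256i va    = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb    = _mm256_loadu_si256((const __m256i *)b);
    __m256i vbeta = _mm256_loadu_si256((const __m256i *)beta);

    /* 1 -> 0x8000 (-32768), 0 -> 0: move bit 0 into the sign bit. */
    __m256i signbit = _mm256_slli_epi16(vbeta, 15);
    /* OR in 1 so that no control lane is zero (zero would clear the lane). */
    __m256i ctrl = _mm256_or_si256(signbit, _mm256_set1_epi16(1));
    /* Negate b where ctrl is negative, keep b where ctrl is positive. */
    __m256i sb = _mm256_sign_epi16(vb, ctrl);

    _mm256_storeu_si256((__m256i *)c, _mm256_add_epi16(va, sb));
}
```

The sign flip costs a shift, an OR and one _mm256_sign_epi16, with no multiplication.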
Editor's footnote 1: _mm256_slli_epi16(x, 15) does in fact exist. But there are other ways to implement the whole formula, so the question is interesting after all.
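As one illustration of such an alternative, the formula can be computed without mapping 1 to -32768 at all, using the two's-complement identity -b = (b XOR m) - m with m = -1. This is a sketch under assumptions rather than the recommended answer: the name addsub_xor16 and the AVX2_FN macro are hypothetical, beta is assumed to hold only 0 or 1, and exactly 16 lanes are processed.

```c
#include <immintrin.h>
#include <stdint.h>

/* Convenience (assumption): enable AVX2 per function on GCC/Clang. */
#if defined(__GNUC__) || defined(__clang__)
#define AVX2_FN __attribute__((target("avx2")))
#else
#define AVX2_FN
#endif

/* Hypothetical helper: c = a + (-1)^beta * b for 16 int16_t lanes. */
AVX2_FN void addsub_xor16(const int16_t *a, const int16_t *b,
                          const int16_t *beta, int16_t *c)
{
    __m256i va    = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb    = _mm256_loadu_si256((const __m256i *)b);
    __m256i vbeta = _mm256_loadu_si256((const __m256i *)beta);

    /* mask = 0 - beta: all ones (-1) where beta == 1, zero where beta == 0. */
    __m256i mask = _mm256_sub_epi16(_mm256_setzero_si256(), vbeta);
    /* (b ^ mask) - mask is -b where mask == -1 and b where mask == 0. */
    __m256i sb = _mm256_sub_epi16(_mm256_xor_si256(vb, mask), mask);

    _mm256_storeu_si256((__m256i *)c, _mm256_add_epi16(va, sb));
}
```

Like the shift-based variant, this wraps on overflow, so negating -32768 leaves it at -32768.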