Given a __m128i that stores 16 chars, the even-index lanes are lanes 0, 2, 4, ..., 14, and the odd-index lanes are lanes 1, 3, 5, ..., 15.
In my application, the even lanes and the odd lanes must each fall within a given inclusive range. For example, suppose even_min is 1, even_max is 7, odd_min is 5, and odd_max is 10:
# valid
vec1: [1, 5, 6, 10, 2, 6, 4, 6, 2, 7, 4, 9, 2, 7, 4, 8]
# invalid because lane 0 (even) is 8, which is greater than even_max
vec2: [8, 5, 6, 10, 2, 6, 4, 6, 2, 7, 4, 9, 2, 7, 4, 8]
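For reference, the validity rule in this example can be written as a plain scalar loop. This is only a sketch for cross-checking a SIMD version against; the function name lanes_in_range and its parameter layout are illustrative assumptions, not part of the code in question.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference (name and signature are assumptions).
   Returns true when every even-index byte lies in [even_min, even_max]
   and every odd-index byte lies in [odd_min, odd_max], bounds inclusive. */
static bool lanes_in_range(const uint8_t v[16],
                           uint8_t even_min, uint8_t even_max,
                           uint8_t odd_min, uint8_t odd_max)
{
    for (size_t i = 0; i < 16; ++i) {
        /* Pick the bounds that apply to this lane's parity. */
        uint8_t lo = (i % 2 == 0) ? even_min : odd_min;
        uint8_t hi = (i % 2 == 0) ? even_max : odd_max;
        if (v[i] < lo || v[i] > hi)
            return false;
    }
    return true;
}
```

With the bounds above (1, 7, 5, 10), this accepts vec1 and rejects vec2, since lane 0 of vec2 is 8.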
How can I check whether a vector is valid more efficiently?
My current solution is straightforward: it checks the even lanes and the odd lanes separately, each with two comparisons:
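__m128i even_min = _mm_set1_epi8(xxx);
__m128i even_max = _mm_set1_epi8(xxx);
// _mm_set_epi8 takes lanes from 15 down to 0, so this selects the even lanes
__m128i even_mask =
    _mm_set_epi8(0, -1, 0, -1, 0, -1, 0, -1, 0, -1, 0, -1, 0, -1, 0, -1);
// 0xFF in every lane where even_min <= vec <= even_max (unsigned)
__m128i evenRange = _mm_and_si128(_mm_cmpge_epi8(vec, even_min),
                                  _mm_cmple_epi8(vec, even_max));
// true if every bit set in even_mask is also set in evenRange (SSE4.1)
bool isEvenOk = _mm_testc_si128(evenRange, even_mask);
// the code for checking odd bytes is similar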
Note that, to compare unsigned chars with inclusive bounds, two macros are defined as follows:
#define _mm_cmpge_epi8(a, b) _mm_cmpeq_epi8(_mm_max_epu8(a, b), a)
#define _mm_cmple_epi8(a, b) _mm_cmpge_epi8(b, a)