I'm trying to implement strlen using AVX2 SIMD intrinsics, but the call to _mm256_cmpeq_epi8 sometimes crashes with SIGSEGV (signal 11).
It works roughly 50% of the time. The call is inside a loop, but when it fails, it fails only on the first iteration.
Here is the code:
size_t simd_strlen(const char *s) {
    unsigned int i = 0;
    const __m256i *p;
    __m256i mask, zero;
    p = (__m256i *) s;
    zero = _mm256_setzero_si256();
    while (true) {
        // mask will always contain all zeros, unless \0 appears (all bits 0) => cmpeq will return 0xFF for that byte
        mask = _mm256_cmpeq_epi8(*p, zero);
        // if mask is all zeros, then each bit AND with itself == 0 => return is 1,
        // only when there is at least one 1 in mask - return is 0, which means \0 occurred
        if (!_mm256_testz_si256(mask, mask)) {
            break;
        }
        ++i;
        ++p;
    }
    int count = i * 32;
    i = 0;
    char *p_2 = (char *) p;
    // add the rest
    while (p_2[++i]) {
    }
    return count + i;
}
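For comparison, here is a minimal sketch of a variant that avoids one likely cause of the crash. Dereferencing `*p` on a `__m256i *` may compile to an aligned load, which faults if `s` is not 32-byte aligned, and that would match a failure that shows up only on the first iteration. The name `simd_strlen_aligned` is hypothetical, and the `target("avx2")` attribute and `__builtin_ctz` assume GCC or Clang on an x86-64 CPU with AVX2. The sketch walks byte by byte up to a 32-byte boundary, then uses aligned loads only. An aligned 32-byte load never crosses a page boundary, so it should not fault even when it reads past the terminator, although such reads are outside what the C standard guarantees and tools like AddressSanitizer may flag them.

```c
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch only: assumes GCC or Clang on an x86-64 CPU with AVX2.
   simd_strlen_aligned is a hypothetical name, not a library function. */
__attribute__((target("avx2")))
size_t simd_strlen_aligned(const char *s) {
    const char *p = s;

    /* Scalar prologue: advance one byte at a time until p is 32-byte aligned. */
    while (((uintptr_t)p & 31) != 0) {
        if (*p == '\0') {
            return (size_t)(p - s);
        }
        ++p;
    }

    const __m256i zero = _mm256_setzero_si256();
    for (;;) {
        /* Aligned load: all 32 bytes lie within a single page. */
        __m256i chunk = _mm256_load_si256((const __m256i *)p);
        /* One bit per byte, set where that byte equals 0. */
        unsigned int bits =
            (unsigned int)_mm256_movemask_epi8(_mm256_cmpeq_epi8(chunk, zero));
        if (bits != 0) {
            /* The lowest set bit is the offset of the first '\0' in this chunk. */
            return (size_t)(p - s) + (size_t)__builtin_ctz(bits);
        }
        p += 32;
    }
}
```

Using `_mm256_movemask_epi8` instead of `_mm256_testz_si256` gives the position of the terminator directly, so no scalar loop is needed after the vector loop. If you would rather keep the original structure, replacing `*p` with `_mm256_loadu_si256(p)` removes the alignment requirement, but an unaligned load can still cross into an unmapped page near the end of the string.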