I am trying to use SSE to accelerate the following task.
At a high level:
string a = "^&a&*";
string b = "abcdef";
bool c = a_contain_any_alphabet_in_b(a, b);
In more detail, using SSE (pseudocode):
a = _mm_set_epi8('^', .....);
b = _mm_set_epi8('a', .....);
mask = _mm_cmpestrm(a, la, b, lb, imm8); // imm8 selects _SIDD_CMP_EQUAL_ANY
// ... and then extract the mask
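For reference, the pseudocode above might look like the following when written out for the case where both strings fit in one 128-bit register. This is only a sketch: `contains_any_16`, `load_up_to_16`, and the `SSE42_TARGET` macro are hypothetical names that are not part of the question, and the `target` attribute assumes GCC or Clang (other compilers may need SSE4.2 enabled through their own options). Inputs longer than 16 bytes are truncated to their first 16 bytes.

```cpp
#include <nmmintrin.h>  // SSE4.2 string intrinsics (_mm_cmpestrm)
#include <algorithm>
#include <cstring>
#include <string>

// Assumption: GCC/Clang, so SSE4.2 can be enabled per function.
#if defined(__GNUC__) || defined(__clang__)
#define SSE42_TARGET __attribute__((target("sse4.2")))
#else
#define SSE42_TARGET
#endif

// Hypothetical helper: copy at most 16 bytes into a zero-padded register.
SSE42_TARGET static __m128i load_up_to_16(const char* p, int len) {
    char buf[16] = {0};
    std::memcpy(buf, p, static_cast<std::size_t>(len));
    return _mm_loadu_si128(reinterpret_cast<const __m128i*>(buf));
}

// Returns true if a and b share at least one character.
// Only the first 16 bytes of each string are examined.
SSE42_TARGET bool contains_any_16(const std::string& a, const std::string& b) {
    const int la = static_cast<int>(std::min<std::size_t>(a.size(), 16));
    const int lb = static_cast<int>(std::min<std::size_t>(b.size(), 16));
    const __m128i va = load_up_to_16(a.data(), la);
    const __m128i vb = load_up_to_16(b.data(), lb);
    // Same operand order as the pseudocode: with EQUAL_ANY and a bit mask,
    // bit i of the result is set when b[i] equals some character of a.
    const __m128i mask = _mm_cmpestrm(
        va, la, vb, lb,
        _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ANY | _SIDD_BIT_MASK);
    // The bit mask occupies the low 16 bits of the result register.
    return _mm_cvtsi128_si32(mask) != 0;
}
```

Calling `contains_any_16("^&a&*", "abcdef")` corresponds to the high-level example: the two strings share the character 'a', so some mask bit is set.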
My problem: what if b holds more than 128 bits? For example, I want to check whether string a contains any letter (a-z, A-Z) recorded in b, but that set of letters takes 8*52 = 416 bits, which is more than 128.
The naive approach I came up with is to split b into several __m128i chunks:
mask1 = _mm_cmpestrm(a, la, b1, lb1, imm8);
mask2 = _mm_cmpestrm(a, la, b2, lb2, imm8);
...
and then combine all the masks.
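A minimal sketch of this chunked approach, assuming a fits in a single register (at most 16 bytes, longer input is truncated) while b can be any length. The names `contains_any_chunked`, `load_chunk`, and `SSE42_TARGET` are hypothetical, and the `target` attribute assumes GCC or Clang. Since only a yes/no answer is needed here, "combining the masks" reduces to stopping at the first chunk whose mask is nonzero.

```cpp
#include <nmmintrin.h>  // SSE4.2 string intrinsics (_mm_cmpestrm)
#include <algorithm>
#include <cstring>
#include <string>

// Assumption: GCC/Clang, so SSE4.2 can be enabled per function.
#if defined(__GNUC__) || defined(__clang__)
#define SSE42_TARGET __attribute__((target("sse4.2")))
#else
#define SSE42_TARGET
#endif

// Hypothetical helper: copy at most 16 bytes into a zero-padded register.
SSE42_TARGET static __m128i load_chunk(const char* p, int len) {
    char buf[16] = {0};
    std::memcpy(buf, p, static_cast<std::size_t>(len));
    return _mm_loadu_si128(reinterpret_cast<const __m128i*>(buf));
}

// Returns true if a (first 16 bytes only) shares a character with b.
SSE42_TARGET bool contains_any_chunked(const std::string& a,
                                       const std::string& b) {
    const int la = static_cast<int>(std::min<std::size_t>(a.size(), 16));
    const __m128i va = load_chunk(a.data(), la);
    // Walk b in 16-byte chunks: b1, b2, ... in the question's notation.
    for (std::size_t off = 0; off < b.size(); off += 16) {
        const int lbk =
            static_cast<int>(std::min<std::size_t>(b.size() - off, 16));
        const __m128i vbk = load_chunk(b.data() + off, lbk);
        const __m128i mask = _mm_cmpestrm(
            va, la, vbk, lbk,
            _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ANY | _SIDD_BIT_MASK);
        // A nonzero mask means some character of this chunk occurs in a.
        if (_mm_cvtsi128_si32(mask) != 0) return true;
    }
    return false;
}
```

With the 52-letter alphabet as b, this issues up to four `_mm_cmpestrm` calls per 16 bytes of a, which is the cost the question is asking how to avoid.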
Is there a smarter way to do this?