Given a mask and a value, the mask covers the value if every bit set in the value is also set in the mask.
For example:
mask: 0b011010
value: 0b010010
true
or
mask: 0b011010
value: 0b010110
false
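The definition above can be written as a one-line predicate. This is only a sketch: the name `covers` is made up for illustration and does not appear in my program. The two examples are encoded in hex (0b011010 = 0x1A, 0b010010 = 0x12, 0b010110 = 0x16).

```cpp
// Hypothetical helper name, not part of the original code.
// True when value has no set bit outside mask.
constexpr bool covers(int mask, int value)
{
    return (value & ~mask) == 0;
}

// The two examples from the text, checked at compile time.
static_assert(covers(0x1A, 0x12), "0b010010 is covered by 0b011010");
static_assert(!covers(0x1A, 0x16), "0b010110 has bit 2 outside 0b011010");
```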
For int arr[arr_size], I need to calculate how many elements of the array are covered by the given mask.
My code:
int count = 0;
for (int index = 0; index < arr_size; index++)
{
    // parentheses are required: == binds tighter than |
    if ((arr[index] | mask) == mask)
        count++;
}
or (slower)
int count = 0;
for (int index = 0; index < arr_size; index++)
{
    // parentheses are required: == binds tighter than &
    if ((arr[index] & mask) == arr[index])
        count++;
}
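In C and C++, `==` binds more tightly than `|` and `&`, so `x | mask == mask` parses as `x | (mask == mask)`. A minimal sketch of the difference, with hypothetical function names chosen only to show the two parses side by side:

```cpp
// Hypothetical names, used only for illustration.

// What `value | mask == mask` means to the compiler: == is evaluated first.
constexpr bool covers_as_parsed(int mask, int value)
{
    return (value | (mask == mask)) != 0;   // value | 1, which is never zero
}

// What was intended: OR first, then compare.
constexpr bool covers_intended(int mask, int value)
{
    return (value | mask) == mask;
}

static_assert(covers_as_parsed(0x1A, 0x16), "accepts a value that is not covered");
static_assert(!covers_intended(0x1A, 0x16), "0b010110 is not covered by 0b011010");
static_assert(covers_intended(0x1A, 0x12), "0b010010 is covered by 0b011010");
```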
My program very often needs to calculate the number of such array elements.
Can you tell me if there is any way to speed up such calculations? For example, using SSE or AVX instructions.
P.S.
The compiler turns my code into 5 instructions (with optimizations enabled), but perhaps vector (SIMD) instructions would give an additional speed gain.
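To make the SIMD idea concrete, here is a rough sketch of what a four-elements-at-a-time version might look like with SSE2 intrinsics. It is not benchmarked, and several things are assumptions: the function name `count_covered` is hypothetical, SSE2 is assumed to be available (it is baseline on x86-64), and a scalar loop is used as a fallback and for the leftover elements. The per-lane counters are 32-bit, so this is only meant for arrays well below 2^31 elements.

```cpp
#include <cstddef>
#if defined(__SSE2__) || defined(_M_X64) || defined(_M_AMD64)
#include <emmintrin.h>  // SSE2 intrinsics
#define HAVE_SSE2 1
#endif

// Hypothetical helper: counts elements whose set bits all lie inside mask.
int count_covered(const int* arr, std::size_t n, int mask)
{
    int count = 0;
    std::size_t i = 0;
#ifdef HAVE_SSE2
    const __m128i vmask = _mm_set1_epi32(mask);   // mask in all four lanes
    const __m128i zero = _mm_setzero_si128();
    __m128i acc = _mm_setzero_si128();            // four partial counters
    for (; i + 4 <= n; i += 4)
    {
        __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(arr + i));
        __m128i stray = _mm_andnot_si128(vmask, v);  // v & ~mask
        __m128i hit = _mm_cmpeq_epi32(stray, zero);  // all ones (-1) where covered
        acc = _mm_sub_epi32(acc, hit);               // subtracting -1 adds 1
    }
    int lanes[4];
    _mm_storeu_si128(reinterpret_cast<__m128i*>(lanes), acc);
    count = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif
    // Scalar tail (or the whole array when SSE2 is not available).
    for (; i < n; ++i)
        count += ((arr[i] & ~mask) == 0);
    return count;
}
```

The comparison result is used as a counter instead of a branch, so the loop body has no data-dependent jumps. An AVX2 version would presumably follow the same pattern with 256-bit registers and eight elements per iteration.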
P.P.S. minimal code:
constexpr int block_size = 16;
int arr[block_size] = {rand(), rand(), rand(), rand(), rand(), rand(), rand(), rand(), rand(), rand(), rand(), rand(), rand(), rand(), rand(), rand()}; // random values for example
int mask = rand();
long long count = 0; // 64-bit: up to 16 * 0xFFFFFFFF increments would overflow an int
for (long long cycles = 0; cycles < 0xFFFFFFFF; cycles++)
{
    for (int index = 0; index < block_size; index++)
    {
        // parentheses are required: == binds tighter than |
        if ((arr[index] | mask) == mask)
            count++;
    }
}