4

In the following question: Why is processing a sorted array faster than processing an unsorted array? the accepted answer explains a way of avoiding branch prediction for the given example. He/she goes on to replace a conditional statement with some bitwise operators:

Replace:

if (data[c] >= 128)
    sum += data[c];

with:

int t = (data[c] - 128) >> 31;
sum += ~t & data[c];

however, in the line where he shifts to the right he seems to be doing so on a potentially signed value, which, as per this:

If the value after the shift operator is greater than the number of bits in the left-hand operand, the result is undefined. If the left-hand operand is unsigned, the right shift is a logical shift so the upper bits will be filled with zeros. If the left-hand operand is signed, the right shift may or may not be a logical shift (that is, the behavior is undefined).

is undefined behaviour. Can someone clarify if im missing something about the bitwise operations he is performing and may it also be explained what the bitwise operations performed are intended to do in layman's terms(given we interpret them as defined behaviour in some definition of shift for signed values).

Matias Chara
  • 921
  • 6
  • 21
  • What are you quoting from when you describe the behavior of the shift operator? There are mistakes in it (mainly the source uses "undefined" when it should be "implementation defined"). – 1201ProgramAlarm Jul 13 '20 at 00:25
  • it says implementation defined for sources that cite the c standard, but I couldnt find anything about the c++ standard. Anywho, I think implementation-defined and undefined behaviour imply the same thing: the behaviour is not guiaranteed to be consistent. – Matias Chara Jul 13 '20 at 00:28

1 Answers1

4

If data[c] >= 128, then data[c] - 128 will be non-negative, so it's sign bit will be 0. Otherwise, if data[c] < 128, then data[c] - 128 will be negative, so it's sign bit will be 1.

If we shift right by 31 (assuming an int is 32 bits, and right shifts are arithmetic, so yes implementation defined but pretty common), then we will have either 0x00000000 when data[c] >= 128 or 0xffffffff when data[c] < 128.

If we then bitwise invert that value, then we will have either 0xffffffff when data[c] >= 128 or 0x00000000 when data[c] < 128.

If we bitwise AND that value with data[c], then we will have either data[c] when data[c] >= 128 or 0x00000000 when data[c] < 128.

Adding 0 to sum does nothing, so we effectively only add the values that are greater or equal to 128 without branching.

pat
  • 12,587
  • 1
  • 23
  • 52
  • 1
    Great explanation, but to be clear, the bitwise shift for signed values is strictly implemenation defined and should not be relied on, right? – Matias Chara Jul 13 '20 at 00:33
  • 1
    Most platforms treat a right shift of a signed value as arithmetic and use 2's complement representation. You will have to decide whether your code will ever run on a platform where this is not the case. – pat Jul 13 '20 at 00:40
  • So we are shifting to the right but instead of prepending 0s we prepend 1s? – Matias Chara Jul 13 '20 at 00:42
  • 1
    An arithmetic right shift inserts copies of the sign bit. A logical right shift inserts 0's. For instance, an arithmetic 1 bit right shift on the value `-2` (`0xfffffffe`) yields `-1` (`0xffffffff`). The corresponding _logical_ right shift would yield `0x7fffffff`, which is `2147483647` – pat Jul 13 '20 at 01:09