5

Result of left shifting can be undefined behavior:

The value of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are zero-filled. If E1 has an unsigned type, the value of the result is E1×2^E2, reduced modulo one more than the maximum value representable in the result type. Otherwise, if E1 has a signed type and non-negative value, and E1×2^E2 is representable in the corresponding unsigned type of the result type, then that value, converted to the result type, is the resulting value; otherwise, the behavior is undefined.

Result of right shifting can be implementation-defined:

The value of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a non-negative value, the value of the result is the integral part of the quotient of E1/2^E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined.

Now, all the platforms I know, undefined behavior/implementation-defined actually do sensible things here:

  • left shifting a negative number is multiplication by 2^E2, just like if the number would be positive
  • right shifting a negative number is "arithmetic shift", which shifts the number normally, but puts the sign bit into most significant bits.

So, the question is, are there any 2-complement platforms/compilers, which doesn't behave like this?


Why the question? Most compilers don't emit optimal code (link provided by phuclv, check out the disassembly between test1 and test2) for power-of-2 division in certain circumstances (clang generates optimal code, though).

xskxzr
  • 12,442
  • 12
  • 37
  • 77
geza
  • 28,403
  • 6
  • 61
  • 135
  • https://en.wikipedia.org/wiki/Signed_number_representations#History - and no, 2's complement by definition have to behave like this (it's just how math works out), which is presumably why it ended up dominating the industry. – Amadan Aug 17 '18 at 09:58
  • 3
    Incidentally, the problem with power of 2 division not being optimized to shift for signed integers generally comes from the fact that it's not the same thing, due to the truncation semantics required by the C standard. -3/2 = -1, while -3>>1 = -2. – Matteo Italia Aug 17 '18 at 10:01
  • @MatteoItalia: for this example, they are the same, because the division is exact. – geza Aug 17 '18 at 10:03
  • The entire point of the quoted text is to point out that the first bullet is not correct. That it seems to work anyway is too common an accident. The first int32_t value where a single left shift corrupts the number is 0b10111...111, that is -1073741825. You don't need boutique hardware to see that go wrong, try it. – Hans Passant Aug 17 '18 at 11:22
  • @HansPassant: You give an example, when the number overflows. Of course, it cannot be correct. A simple `*2` wouldn't be correct for that number either. But if it doesn't overflow, it is correct on all the 2-complement platforms I know. – geza Aug 17 '18 at 11:32
  • Chicken-and-egg, it is the overflow that causes the UB. You are then seemingly asking for hardware where a left shift that doesn't overflow still produces garbage. Tough shopping. – Hans Passant Aug 17 '18 at 11:54
  • @HansPassant: No, as I read the standard. According to it, `-1<<1` is UB as well. But it is not an overflow, if we think about it as `-1*2`. `-1<<1` equals to `-2` on all HWs I know. – geza Aug 17 '18 at 12:03
  • @HansPassant: C89 defined the behavior of negative left-shift as multiplication on all platforms that use two's-complement representations without padding bits. The specified behavior would have been illogical on some other platforms, however. C99 changed it to UB but gave no rationale. The only sensible expectation I can see is that implementations for which the C89 definition made sense would continue to use it, but if any C99 implementations were ever produced for sign-magnitude machines, they could handle left shifts however the hardware did, even if they would trap. – supercat Aug 17 '18 at 16:38
  • @HansPassant: Performing math in two's-complement notation is equivalent to performing binary arithmetic on an infinitely long binary number, when all bits to the left of the sign bit have the same value as the sign bit. Overflows occur if one or more bits to the left of the sign bit would not match the sign bit. The value -1 has all bits set, including the sign bit and all bits to its left. -1<<1 also has the sign bit and all bits to its left set. INT_MAX<<1 and INT_MIN<<1 would set the sign bit to the state opposite all the bits to its left (overflow). – supercat Aug 17 '18 at 17:09

1 Answers1

1

Some compilers for processors that lack a sign-extending right-shift instruction have historically processed all right-shift operations as zero-filling.

The C89 Standard precisely defined the behavior of left-shifting a negative number on all ones'-complement or two's-complement platforms that don't use padding bits on either their signed or unsigned types. The behavior on ones'-complement machines was not equivalent to multiplication, however. Further, the proper behavior on sign-magnitude machines wasn't clear.

The C99 Standard changed a negative left-shift to Undefined Behavior. Nothing in the Rationale offers any reason for the change, nor even acknowledges it. The lack of mention in the rationale suggests that it was not seen as a breaking change, because there was no reason to expect that any implementations where the behavior was usefully defined under C89 wouldn't continue to define the behavior the same way whether or not the Standard continued to require it. The only intention that would make sense would be to allow ones'-complement or sign-magnitude implementations (in the event any were ever produced for C99) to behave in a fashion which may be more useful than what C89 had required.

When C89 or even C99 was written, the authors didn't perceive much difference between actions whose behavior is mandated by the Standard, versus actions which implementations could be expected to process in predictable and useful fashion whether the Standard required them to or not. Compiler writers seem to believe the Committee intended to eliminate from the language all actions of the latter form, but I've seen no evidence to suggest anything of the sort.

In practice, the only compilers I know of which don't presently abide that expectation with regard to left-shifts whose result would be representable as a multiplication that doesn't overflow are those which are explicitly configured to squawk on negative left-shifts purely because the Standard no longer defines their behavior. I would not be particularly surprised, however, if some "clever" person were to "optimize" a two's-complement compiler based on the fact that the Standard no longer requires implementations for such platforms to behave in the same sensible and useful way they always have. Such deviation could take two forms:

  1. A compiler could decide that if an operation that requires the carry flag be clear is preceded by a signed left shift, the "clear carry" instruction that would normally precede the latter operation could be omitted. I've used platforms where that would save an instruction, but all the compilers I've seen for such platforms clear the carry even though the Standard would not require it.

  2. A compiler could decide that it's "impossible" for the result of a left shift to be negative, and thus that any comparisons between that result and any negative values may be safely omitted. Going further, it could decide that it's likewise impossible for the operand to be negative, and remove any comparisons with negative numbers that would not prevent the left shift from being performed. Neither gcc nor clang attempts to impose such optimizations yet, but that doesn't mean they never will.

Quality implementations for two's-complement systems will process signed left-shift predictably unless or until there is a particularly compelling reason for them to do otherwise--something that seems rather unlikely to occur. As to what poor-quality implementations might do, who knows.

supercat
  • 77,689
  • 9
  • 166
  • 211
  • Nice answer, thanks! I wouldn't have thought that we need to go back to C89/C99 to answer this. Nice optimization examples! – geza Aug 17 '18 at 18:26