25

According to the answer to this questions:

The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

Which seems to imply that 1 << 31 is undefined.

However GCC doesn't issue a warning if I use 1 << 31. It does issue one for 1 << 32. link

So which is it? Am I misunderstanding the standard? Does GCC have its own interpretation?

chqrlie
  • 131,814
  • 10
  • 121
  • 189
Ilya Lesokhin
  • 359
  • 2
  • 7
  • 5
    Possible duplicate of [Meaning of ((float) rand() / (float)((1 << 31) - 1))](https://stackoverflow.com/questions/38771436/meaning-of-float-rand-float1-31-1) – Brian Cain Jul 23 '17 at 18:24
  • 2
    @BrianCain no, rather, that should be closed as a duplicate of **this** – Antti Haapala -- Слава Україні Jul 23 '17 at 20:01
  • There's two separate questions here: (a) is `1 << 31` undefined (which is covered by many existing questions already), and (b) why doesn't gcc give a a warning; which is not a language-lawyer question. I would suggest editing the question to clearly be one of those two, but not both – M.M Jul 23 '17 at 21:20
  • It's your responsibility to watch for UB, not compiler's. A compiler may or may not help you by issuing a warning. – n. m. could be an AI Jul 24 '17 at 13:50
  • For C++ [see this comment here for some interesting details](https://stackoverflow.com/questions/19593938/is-left-shifting-a-negative-integer-undefined-behavior-in-c11#comment29091986_19593938) – Shafik Yaghmour Jul 27 '17 at 06:31

3 Answers3

22

No: 1 << 31 has undefined behavior if the type int has only 31 value bits.

1U << 31 is OK and evaluates to 0x80000000 if type unsigned int has 32 value bits.

On a system where bytes have 8 bits, sizeof(int) == 4 means int has at most 31 value bits, so shifting 1 by 31 places is undefined. Conversely, on a system where CHAR_BIT > 8, it may be OK to write 1 << 31.

gcc might issue a warning if you raise the warning level. try gcc -Wall -Wextra -W -Werror. clang does issue a warning with the same options.

To address Michaël Roy's comments, 1 << 31 does not evaluate to INT_MIN reliably. It might give this value on your system, but the Standard does not guarantee it, in fact the Standard describes this as undefined behavior, so not only can you not rely on it, you should avoid it to avoid spurious bugs. The optimizers routinely take advantage of potential undefined behavior to remove code and break the programmers' assumptions.

For example, the following code might compile to a simple return 1;:

int check_shift(int i) {
   if ((1 << i) > 0)
       return 1;
   else
       return 0;
}

None of the compilers supported by Godbolt's compiler explorer do, but doing so would not break conformity.

chqrlie
  • 131,814
  • 10
  • 121
  • 189
  • 1 << 31 always gives out INT_MIN, so it's not undefined. – Michaël Roy Jul 23 '17 at 18:30
  • 13
    @MichaëlRoy - It is, by definition, undefined. – Oliver Charlesworth Jul 23 '17 at 18:33
  • Why doesn't GCC issue a warning? – Ilya Lesokhin Jul 23 '17 at 18:33
  • @user2301454: It'd be easier to answer your question if you provided the source and the compiler flags. – Bjorn A. Jul 23 '17 at 18:34
  • 5
    @IlyaLesokhin: "Undefined behavior" does not mean "all compilers must generate a warning". – Ben Voigt Jul 23 '17 at 18:36
  • @OliverCharlesworth It is defined, as a bit operation. undefined means unpredictable in our field. a bit shift is _not_ an arithmetic operation. But a bit manipulation. – Michaël Roy Jul 23 '17 at 18:36
  • 6
    @MichaëlRoy: it might give this value on your system, but the Standard does not guarantee it, in fact the Standard describes this as undefined behavior, so not only can you not rely on it, you should avoid it to prevent spurious bugs. The optimizers routinely take advantage of potential undefined behavior to remove code and break the programmers assumptions. – chqrlie Jul 23 '17 at 18:38
  • @MichaëlRoy - We're talking about C, not ASM. "Undefined" specifically means "the standard doesn't define it". It may well compile to a well-defined ASM instruction, but that doesn't mean that (for example) it won't cause nasty bugs when the optimiser runs and assumes code with well-defined behaviour. – Oliver Charlesworth Jul 23 '17 at 18:38
  • @OliverCharlesworth A bit-shift is not an arithmetic operation. – Michaël Roy Jul 23 '17 at 18:39
  • 2
    @MichaëlRoy - I'm not sure how that's relevant? – Oliver Charlesworth Jul 23 '17 at 18:40
  • 0x80000000 is a valid bit collection. That's relevant. A bit shift is _not_ a multiplication. – Michaël Roy Jul 23 '17 at 18:41
  • 6
    @MichaëlRoy: Again, we're not talking about asm/hardware here. We're talking about the C language. C compilers/optimisers work in the latter domain. If you want a well-defined way to obtain `0x80000000` **in C** then you should do `1U << 31`. – Oliver Charlesworth Jul 23 '17 at 18:42
  • 4
    @MichaëlRoy "*If E1 has a signed type and nonnegative value, and E1 x 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.*" `E1 x 2^E2 = 1 x 2^31` in this case. Which is `2147483648` and no, it is not representable in 31 bits (or 32 including sign). `(int)0x80000000` is not `2147483648` – Eugene Sh. Jul 23 '17 at 18:43
  • You are confusing bit shifting and multiplication. – Michaël Roy Jul 23 '17 at 18:43
  • 6
    @MichaëlRoy *You are confusing bit shifting and multiplication.* No, he is not. He's actually ***quoting*** [**6.5.7 Bitwise shift operators**, paragraph 4](http://port70.net/~nsz/c/c11/n1570.html#6.5.7p4) of the [C standard](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf). NB the title of section 6.5.7. – Andrew Henle Jul 23 '17 at 18:59
  • 7
    @MichaëlRoy: I'm afraid the language of the C Standard is clear. It is worded this way because not all computers are intel PCs. If on a different CPU, shifting a signed value might not overflow into the sign bit, or trigger an exception, the CPU might not use two's complement representation... the behavior of `1 << 31` would give a different result on these CPUs. This is a typical reason for the Standard to describe a border case as **undefined behavior**. The consequences of relying on behavior unwarranted by the C Standard can be hard to find bugs. – chqrlie Jul 23 '17 at 19:02
  • Note that if you really want the result to be signed, you can use `(1<<30)*-2` or similar. – R.. GitHub STOP HELPING ICE Jul 23 '17 at 19:17
  • @OliverCharlesworth `1U << 31` would not obtain `0x80000000` on a system with 16-bit int. `1UL << 31` or `(uint32_t)1 << 31` would be better – M.M Jul 23 '17 at 21:19
  • @R. Yep, assuming 2's complement representation, 31 value bits and no padding bits, `(1<<30)*-2` is a contorted but correct way to compute `INT_MIN`. – chqrlie Jul 23 '17 at 21:21
  • @MichaëlRoy A bit shift is an arithmetic operation (in the C Standard). [ref](https://stackoverflow.com/questions/24665835/what-is-the-definition-of-arithmetic-operation-in-c99) – M.M Jul 23 '17 at 21:24
  • Are you missing a "not" in "does evaluate to INT_MIN reliably"? – user2357112 Jul 23 '17 at 21:58
17

The reason GCC doesn't warn about this is because 1 << 31 was valid (but implementation-defined) in C90, and is valid (but implementation-defined) even in modern C++. C90 defines << as a bit shift and followed by saying that for unsigned types, its result was that of a multiplication, but did no such thing for signed types, which implicitly made it valid and left it covered by the general wording that bitwise operators have implementation-defined aspects for signed types. C++ nowadays defines << as multiplying to the corresponding unsigned type, with the result converted back to the signed type, which is implementation-defined as well.

C99 and C11 did make this invalid (saying the behaviour is undefined), but compilers are permitted to accept it as an extension. For compatibility with existing code, and to share code between the C and C++ frontends, GCC continues to do so, with one exception: you can use -fsanitize=undefined to get detected undefined behaviour to abort your program at run-time, and this one does handle 1 << 31, but only when compiling as C99 or C11.

  • Interesting. So that must mean that C++ now defines out-of-range conversions from unsigned to signed that would previously have been undefined. I didn't know that :) – Oliver Charlesworth Jul 23 '17 at 20:28
  • 1
    @OliverCharlesworth As far as I know, out-of-range conversions to signed integer types from another integer type, whether signed or unsigned, whether the same width or larger, have always been valid but implementation-defined in all versions of C and C++. Some versions only allow an implementation-defined value, others also allow an implementation-defined signal to be raised, but it's never undefined. –  Jul 23 '17 at 20:37
  • I would say that in C90 it was underspecified or defective for this case. As opposed to C++14 for example which specifically covers the case as being implementation-defined – M.M Jul 23 '17 at 21:16
  • @M.M C90 does appear to leave it underspecified in the description for `<<`, but has a general clause stating that the binary operators have implementation-defined aspects for signed types (directly under "Expressions"). That applies here too, that's why it's not simply undefined by omission. I'll edit my answer to cover this. –  Jul 23 '17 at 21:27
  • @M.M: In C89/90, the behavior of the left-shift operator was defined in terms of the representation (meaning that ones'-complement systems had to [somewhat illogically, IMHO] were required to evaluate -1<<1 as -3). C99 changed the rules; no rationale was given, but the most logical reason would have been to give non-two's-complement systems more flexibility. Somewhat ironic given how many non-two's-complement production-worthy C99 or C11 implementations there are (so far as I can tell, the number is and always will be zero). – supercat Jul 23 '17 at 21:34
  • @M.M: I doubt very much that the authors of the Standard intended to change the behavior of two's-complement systems, except possibly in diagnostic builds which were intended to warn of portability issues that could arise if code were ported to non-two's-complement systems. I don't think C++ bends over backward to accommodate non-two's-complement architectures, and thus wouldn't need to grant such an allowance. – supercat Jul 23 '17 at 21:35
  • @supercat Wrong, it was implementation-defined as I mentioned already. Official source: http://www.open-std.org/JTC1/SC22/WG14/www/docs/dr_081.html –  Jul 23 '17 at 21:37
  • Thanks for teaching me something new (the change in wording to consider whether the result is representable in the corresponding unsigned type). However, while it is allowed in C++17, it is definitely UB in C++11, and for C++03 it is not clear. In particular, I can find nothing that requires the sign bit to be "left" of the value bits, although clearly conversions between signed and unsigned are more efficient if this choice is made. – Ben Voigt Jul 24 '17 at 04:55
  • Well, maybe the "pure binary numeration" requirement would, although the note saying sign-magnitude is allowed and the footnote elaborating the meaning of "pure binary" are mutually incompatible. – Ben Voigt Jul 24 '17 at 04:55
  • 1
    @BenVoigt It's seen as a defect in C++11, CWG #1457, so the change is retroactively applied. As for C++03, this mostly uses the C90 wording for the left shift operation, but I'm not finding the matching wording to say that the bitwise operators have implementation-defined aspects for signed types. If that wording isn't present, I think you have a good point there. –  Jul 24 '17 at 05:21
  • If the behavior of `<<` on signed values were implied by an implementation's choice of integer representation, that would imply that the behavior of `<<` had implementation-defined aspects. To claim that the statement that `<<` has implementation-defined aspects would allow implementations to define it however they see fit is thus a total non-sequitur. – supercat Jul 24 '17 at 22:26
  • @hvd: [see above] – supercat Jul 24 '17 at 22:29
  • @hvd: The C89 wording explicitly says that vacated bits will be filled with zero. If the intention had been that vacated bits will be filled with zero except in some cases where it would make more sense to do otherwise, the Standard should either identified cases where that was allowed, or else limited the statement to cases where filling in zero was actually required behavior. I fail to see any reasonable reading of the C89 Standard under which a ones'-complement implementation would be allowed to fill vacated bits with 1s when left-shifting a negative number. – supercat Jul 24 '17 at 22:34
  • @supercat I direct you to my earlier comment where I provide you with a link where the C committee official's response is "Subclause 6.3 states that the binary operator [<<], among others, has implementation-defined aspects for signed types. Therefore, the answer to ``What does it mean to left shift a signed value?'' is that it is implementation-defined." If you have an issue with that interpretation, this is not the place to discuss it. –  Jul 24 '17 at 22:38
  • @hvd: The published draft of the C89 Standard requires that implementations define the behavior *in a fashion that meets certain constraints*, among which are that vacated bits must have the value zero. I see nothing in the published draft of the Standard that would allow ones'-complement implementations to do anything else with those bits. Is there something in the actual text of the C89 Standard that would allow for that, or are implementations required to define the behavior as vacating the bits as the text of the draft would imply? – supercat Jul 25 '17 at 00:04
  • @hvd: As a point of clarification, C89 would allow an implementation to have the sign bit of its signed type be a padding bit in its unsigned type, and would allow such an implementation considerable freedom with how left-shifting a negative number would work. That does not imply similar freedom exists on the 99% of implementations whose integer types have no padding bits. – supercat Jul 25 '17 at 14:55
9

It does invoke undefined behaviour, as explained by the other answers/comments. However, as to why GCC doesn't emit a diagnostic.

There are actually two things that can lead to undefined behaviour for a left-shift (both from [6.5.7]):

  1. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.

  2. If E1 has a signed type and nonnegative value, and E1 × 2E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

Evidently GCC detects the first one (because it's trivial to do so), but not the latter.

Oliver Charlesworth
  • 267,707
  • 33
  • 569
  • 680
  • 1
    Another explanation might be that gcc behaves "predictably" for this situation and the gcc developers don't want to bug the users with warnings for code that is not broken on gcc (for the selected target). It's not their responsibility to warn about code that might break on other platforms or compilers – M.M Jul 23 '17 at 21:34
  • The authors of the C++ Standard decided for whatever reason to define `1<<(bitsize-1)` as yielding ~INT_MAX on two's-complement machines, even though it involves overflow, but decided not to define left-shifts of negative values even on such machines, even in cases that don't involve overflow. – supercat Oct 25 '17 at 19:55