3

According to this interesting paper about undefined behavior optimization in c, the expression (x<<n)|(x>>32-n) "performs undefined behavior in C when n = 0". This stackoverflow discussion confirms that the behavior is undefined for negative integers, and discusses some other potential pitfalls with left-shifting values.

Consider the following code:

#include <stdio.h>
#include <stdint.h>

uint32_t rotl(uint32_t x, uint32_t n)
{
    return (x << n) | (x >> (32 - n));
}

int main()
{
    uint32_t y = rotl(10, 0);
    printf("%u\n", y);
    return 0;
}

Compile using the following parameters: -O3 -std=c11 -pedantic -Wall -Wextra

  • In gcc >5.1.0 the output of the program is 10.
  • In clang >3.7.0 the output is 4294967295.

Interestingly, this is still true when compiling with c++: gcc results, clang results.

Therefore, my questions are as follows:

  1. It is my understanding from the language in the standard that this should not invoke undefined / implementation defined behavior since both of the parameters are unsigned integers and none of the values are negative. Is this correct? If not, what is the relevant section of the standard for c11 and c++11?
  2. If the previous statement is true, which compiler is producing the correct output according to the c/c++ standard? Intuitively, left shifting by no digits should give you back the value, i.e. what gcc outputs.
  3. If the above is not the case, why are there no warnings that this code may invoke undefined behavior due to left-shift overflow?
ruser45381
  • 205
  • 1
  • 9

5 Answers5

7

From [expr.shift], emphasis mine:

The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand.

You are doing:

(x >> (32 - n))

with n == 0, so you're right-shifting a 32-bit number by 32. Hence, UB.

Barry
  • 286,269
  • 29
  • 621
  • 977
  • Is that directly from the standard? Why doesn't the compiler produce a warning? – ruser45381 Mar 04 '16 at 21:10
  • 2
    @ruser45381 Yes, it's a direct quote. Probably just can't figure out that it needs to warn (just typing `x >> 32` warns, for instance). Compiler doesn't *have* to warn, after all - it's a QoI issue. – Barry Mar 04 '16 at 21:13
  • It certainly doesn't **have** to warn, but then you end up in the very situation the linked paper mentions: invoking undefined behavior without realizing it and with no good way of optimizing the code to produce the correct result. Given that there are already [so many ways](https://news.ycombinator.com/item?id=11147441) of killing yourself with UB in C, I found it surprising that there wouldn't be any warning! – ruser45381 Mar 04 '16 at 21:22
  • @ruser45381 Yes, in an ideal world we would all have perfect tools? I don't know what to tell you. – Barry Mar 04 '16 at 21:26
  • I'm not attacking you! I just found the result surprising given that the compiler _could_ figure it out in this particular case. – ruser45381 Mar 04 '16 at 21:31
  • @ruser45381 I never thought you were? – Barry Mar 04 '16 at 21:33
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/105404/discussion-between-ruser45381-and-barry). – ruser45381 Mar 04 '16 at 21:36
  • What is "[expr.shift]"? – chux - Reinstate Monica Mar 04 '16 at 22:23
  • @chux It's the section of the C++ standard dealing with shift operators – Barry Mar 04 '16 at 22:30
2

Your n is 0, so performing x << 32 is an undefined behavior as shifting uint32_t 32 bits or more is undefined.

Eugene Sh.
  • 17,802
  • 8
  • 40
  • 61
1

If n is 0, 32-n is 32, and since x has 32 bits, x>>(32-n) is UB.

The issue in the linked SO post is different. This one has nothing to do with signedness.

rici
  • 234,347
  • 28
  • 237
  • 341
  • This explains the undefined behavior, but the fact that **neither** compiler gives a warning about this is a bit of a mystery. It should be able to deduce that the shift is 32 or greater, no? – ruser45381 Mar 04 '16 at 21:14
  • 2
    @ruser45381: In this case, it could figure it out, but it is under no obligation to do so. Undefined behaviour is *undefined*, not "the compiler should warn be about this." – rici Mar 04 '16 at 21:18
  • The point is that the number of shifts here is a parameter. In order to figure it out it has to perform a bit of static analysis, which is not necessarily it's function. – Eugene Sh. Mar 04 '16 at 21:31
  • @EugeneSh.: Curiously, even when it does do static analysis and inlines the function, resolving the parameter to a constant (and then eliminates the shift instruction because it can see that it would be a no-op), it *still* does not produce the warning message. I suppose that has to do with the way the warning is implemented. – rici Mar 04 '16 at 22:09
1

A part of the post not fully answered.

why are there no warnings that this code may invoke undefined behavior due to left-shift overflow?

Looking at the add() code, what should the compiler warn about? Is it UB if the sum is outside the range of INT_MIN ... INT_MAX. Because the following code does not take precautions to prevent overflow, like here, should it warn? Should you think so, then so much code would be waning about potential this and that, that programmers would quickly turn that warning off.

int add(int a, int b) {
  return a + b;
}

The situation is not much different here. If n > 0 && n < 32, there is no problem.

uint32_t rotl(uint32_t x, uint32_t n) {
  return (x << n) | (x >> (32 - n));
}

C creates fast code primarily because it lacks lots of run-time error checking and compliers are able to perform very nice optimized code. If one needs lots of run-time checks, there are other languages suitable for those programmers.

C is coding without a net.

Community
  • 1
  • 1
chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
1

When the C Standard was written, some implementations would behave weirdly when trying to perform a shift by extremely large or negative amounts, e.g. left-shifting by -1 might tie up a CPU with interrupts disabled while its microcode shifts a value four billion times, and disabling interrupts for that long might cause other system faults. Further, while few if any implementations would do anything particularly weird when shifting by exactly the word size, implementations weren't consistent about the value returned. Some would treat it as a shift by zero, while others would yield the same result as shifting by one, word-size times, and some would sometimes do one and sometimes the other.

If the authors of the Standard had specified that shifting by precisely the word size may select in Unspecified fashion between those two possible behaviors, that would have been useful, but the authors of the Standard weren't interested in specifying all the things that compilers would naturally do with or without a mandate. I don't think they considered the idea that implementations for commonplace platforms wouldn't naturally yield the commonplace behavior for expressions like the "rotate" given above, and didn't want to clutter the Standard with such details.

Today, however, some compiler writers think it's more important to exploit all forms of UB for "optimization" than to support useful natural behaviors which had previously been supported by essentially all commonplace implementations. Whether or not making the "rotate" expression malfunction when y==0 would allow a compiler to generate a useful program which is smaller than would otherwise be possible is irrelevant.

supercat
  • 77,689
  • 9
  • 166
  • 211