Shifting unsigned int more than the size of it, undefined or not?

Question

Draft 2011 says:

6.5.7 Bitwise shift operators/4 The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

and

J.2 Undefined behavior An expression is shifted by a negative number or by an amount greater than or equal to the width of the promoted expression (6.5.7).

How should the two be read together? Does J.2 refer to all shifts (unsigned or not), or only to the UB explicitly mentioned in section 6.5.7 (which covers only signed types)?

I mean, is `unsigned int i = ...; i <<= sizeof(i) * CHAR_BIT;` UB?

J.2 refers to all shiftings, the result is undefined. Some implementations choose to evaluate `(uint32_t)x << 32` to `x`, some to `0`. Maybe `42` would be nice ;) — , Aug 01 '17 at 08:31
`sizeof(i)` returns the size in bytes (or more precisely, `CHAR_BIT`s). — vgru, Aug 01 '17 at 08:35
@Jean-BaptisteYunès no, `char` can have more than 8 bits **and** `unsigned int` is allowed to have *padding bits*. — , Aug 01 '17 at 08:35
@Jean-BaptisteYunès the simplest way to get it correct is just use a fixed-width type and hardcoded number of bits. For a generic solution, you need a [macro calculating the number of value bits from the maximum value](https://stackoverflow.com/a/4589384/2371524). — , Aug 01 '17 at 08:37
@FelixPalmen I know but in the case of most machines? CHAR_BIT=8 no padding? — Jean-Baptiste Yunès, Aug 01 '17 at 08:37
@Jean-BaptisteYunès well, *most* machines, but how is this helpful when asking specifically about UB? ;) — , Aug 01 '17 at 08:39
OK, but since shifting only affects the value and we don't have to care about padding bits, maybe the number of bits could be extracted from the maximum value? — Jean-Baptiste Yunès, Aug 01 '17 at 08:44
@Jean-BaptisteYunès see the link in my comment. This macro is very handy if you really need the exact amount of *value bits* in a given type. — , Aug 01 '17 at 08:50
Even if you try to "cheat" by using a variable shift count, `val <<= shift;` will be translated to `shl eax,cl` on x86 or `shl rax,cl` on x64, and those machine instructions do not shift 32 or 64 times: the first takes only the low 5 bits of the `cl` register and the second only the low 6 bits. So `val` keeps its initial value after the shift. — 0___________, Aug 01 '17 at 08:51
@PeterJ assembly is not that relevant here, this will be *very* different on different machines. But of course, these differences are a good rationale for the C standard actually declaring it *undefined*, so a compiler doesn't have to care about it. — , Aug 01 '17 at 08:58
That was the reason for my comment. Shifting by more than the object size would require special translation into machine code (i.e. splitting it into n operations), so it was logical to make it UB in the standard. — 0___________, Aug 01 '17 at 09:01
It makes sense when looking at other machines as well. E.g. the `m68k` architecture has shift instructions that can shift by 1 to 8 bits, no matter how wide the operand is. A straight-forward compiler implementation would return `0` here for a shift by the whole width. If the whole world was `x86`, you could just define the result to be a no-op :) — , Aug 01 '17 at 09:08
x86, ARM and MIPS behave as I described; PowerPC uses 6 bits for the count (as described in the technical reference, to keep the same opcodes for future 64-bit versions). So because the direct translation gives unpredictable, hardware-dependent results, it is logical to make it UB in the standard. From a mathematical point of view, shifting something x times always gives a predictable result (leaving signed types aside). — 0___________, Aug 01 '17 at 09:17
There are different incarnations of the x86 as well. The original 8086/8088 *did* shift the full number of bits given by `CL`. So if you had a PC/XT and a PC/AT you could get different results from the same executable. Another reason for UB. — Bo Persson, Aug 01 '17 at 10:10
@BoPersson IIRC, that original shift, by maybe 255, made for a very bad worst case [Interrupt latency](https://en.wikipedia.org/wiki/Interrupt_latency) as that took 255 cycles. One of the reasons to only use the least significant bits of the shift count. — chux - Reinstate Monica, Aug 01 '17 at 11:41
@chux: The 80286 used enough bits to support shifts of up to the bit count, *inclusive*; the 80386, however, does not do so when using 32-bit operands. — supercat, Aug 01 '17 at 20:56
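
The two hardware behaviours discussed in the comments above can be modelled in well-defined C. This is only a sketch for 32-bit operands: the function names `shl_masked_count` and `shl_full_count` are hypothetical, and the masking model is based on the commenters' description of `shl eax,cl`, not on any particular compiler's output. The shift is computed in `uint64_t` so that the C expression itself never has an out-of-range count or a promoted signed operand.

```c
#include <stdint.h>

/* Models a 32-bit shifter that keeps only the low 5 bits of the count,
   as the comments describe for x86 `shl eax,cl`.
   A count of 32 therefore behaves like a count of 0. */
static uint32_t shl_masked_count(uint32_t x, unsigned count)
{
    /* Widen first, shift by 0..31, then truncate back to 32 bits. */
    return (uint32_t)((uint64_t)x << (count & 31u));
}

/* Models a shifter that honours the full count:
   once the count reaches 32, every bit has been shifted out. */
static uint32_t shl_full_count(uint32_t x, unsigned count)
{
    return count < 32u ? (uint32_t)((uint64_t)x << count) : 0u;
}
```

With these models, `shl_masked_count(42, 32)` gives `42` while `shl_full_count(42, 32)` gives `0`. Both are plausible outcomes on real hardware, which is one rationale for the standard leaving `x << 32` undefined instead of picking one.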

score 3 · Accepted Answer · answered Aug 01 '17 at 08:44

The paragraph above the one you quoted says the same thing, regardless of the signedness:

6.5.7 Bitwise shift operators / 3 The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.

So, it's UB, whether it is unsigned or signed.
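
If a shift by the full width has to be supported, the count can be checked before shifting. The following is a minimal sketch, not something taken from the standard: `uint_width` and `shl_defined` are hypothetical helper names, and returning `0` for oversized counts is an assumption chosen to match the mathematical result E1 × 2^E2 reduced modulo `UINT_MAX + 1`. The width is counted from `UINT_MAX`, so padding bits (if any) are not included.

```c
#include <limits.h>

/* Hypothetical helper: number of value bits in unsigned int,
   derived from UINT_MAX so that padding bits are not counted. */
static unsigned uint_width(void)
{
    unsigned width = 0;
    for (unsigned max = UINT_MAX; max != 0; max >>= 1)
        ++width;
    return width;
}

/* Hypothetical helper: a left shift that is defined for every count.
   Counts below the width use the ordinary operator; larger counts
   return 0 instead of invoking undefined behavior. */
static unsigned shl_defined(unsigned x, unsigned count)
{
    return count < uint_width() ? x << count : 0u;
}
```

For example, `shl_defined(42u, uint_width())` evaluates to `0` on every conforming implementation, whereas `42u << uint_width()` is undefined.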

Thanks, I missed it! — Jean-Baptiste Yunès, Aug 01 '17 at 08:45

score 0 · Answer 2 · answered Aug 01 '17 at 08:54

The standard is quite clear that this is indeed UB. In case you trust your compiler, you can also find out with this little test program:

```c
#include <stdio.h>

#define IMAX_BITS(m) ((m) /((m)%0x3fffffffL+1) /0x3fffffffL %0x3fffffffL *30 \
                  + (m)%0x3fffffffL /((m)%31+1)/31%31*5 + 4-12/((m)%31+3))

#define UINT_BITS IMAX_BITS((unsigned)-1)

int main(void)
{
    unsigned int foo = 42;
    printf("%d", foo << UINT_BITS);
    return 0;
}
```

See what happens:

```
$ gcc -std=c11 -Wall -Wextra -pedantic -oshift shift.c
shift.c: In function 'main':
shift.c:11:22: warning: left shift count >= width of type [-Wshift-count-overflow]
     printf("%d", foo << UINT_BITS);
                      ^~
$ ./shift
42
```
