There is something different with 1U << 32 and first assign 32 to a variable (for example, n) then left shift n. I tried with printf("%u\n", 1U << 32)
, compilers will optimize the result to 0. But when I try this code,
#include <stdio.h>
int main() {
int n = 32;
printf("%u\n", 1U << n);
}
Compiling and executing the code above will print the result of 1.
According to C/C++ standard,
The value of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are zero-filled. If E1 has an unsigned type, the value of the result is E1×2^E2, reduced modulo one more than the maximum value representable in the result type. Otherwise, if E1 has a signed type and non-negative value, and E1×2^E2 is representable in the corresponding unsigned type of the result type, then that value, converted to the result type, is the resulting value; otherwise, the behavior is undefined.
In my opnion, when E1 is unsigned and E1 << E2 overflows, the result will be E1 * 2 ^ E2 mod (UINT_MAX + 1), so the result of 1U << n should be 0.
But compiling with GCC 4.4 to 7.2 or Clang 3.1 to 5.0 on x86 and ARM gave the result of 1. I checked the assembly code and found both produced the following assembly,
movl $20,-0x4(%rbp)
mov -0x4(%rbp),%eax
mov $0x1,%edx
mov %eax,%ecx
shl %cl,%edx
and
orr w8, wzr, #0x1
orr w9, wzr, #0x20
stur w9, [x29, #-4]
ldur w9, [x29, #-4]
lsl w1, w8, w9
And then I checked the instruction shl
and lsl
, on c9x.me I found following description about shl
,
The destination operand can be a register or a memory location. The count operand can be an immediate value or register CL. The count is masked to 5 bits, which limits the count range to 0 to 31.
And assembler guide of ARM tells that the lsl
's allowed shifts is 0-31.
It means at least the instruction shl
works well, that is the reason why I suspect the implementation of compilers are wrong, but it seems impossible to have such a bug for so wide and long, could any one explain to me?
Thanks.