c/c++ left shift unsigned vs signed

Question

I have this code.

#include <iostream>

int main()
{
    unsigned long int i = 1U << 31;
    std::cout << i << std::endl;
    unsigned long int uwantsum = 1 << 31;
    std::cout << uwantsum << std::endl;
    return 0;
}

It prints out.

2147483648
18446744071562067968

on Arch Linux 64 bit, gcc, ivy bridge architecture.

The first result makes sense, but I don't understand where the second number came from. 1 represented as a 4byte int signed or unsigned is

00000000000000000000000000000001

When you shift it 31 times to the left, you end up with

10000000000000000000000000000000

no? I know shifting left for positive numbers is essentially 2^k where k is how many times you shift it, assuming it still fits within bounds. Why is it I get such a bizarre number?

[bizarre number?](https://www.wolframalpha.com/input/?i=2147483648) check the binary form — Bryan Chen, Apr 07 '14 at 06:06

Jerry Coffin · Accepted Answer · 2019-10-16T19:37:20.400

20

Presumably you're interested in why this: unsigned long int uwantsum = 1 << 31; produces a "strange" value.

The problem is pretty simple: 1 is a plain int, so the shift is done on a plain int, and only after it's complete is the result converted to unsigned long.

In this case, however, 1<<31 overflows the range of a 32-bit signed int, so the result is undefined¹. After conversion to unsigned, the result remains undefined.

That said, in most typical cases, what's likely to happen is that 1<<31 will give a bit pattern of 10000000000000000000000000000000. When viewed as a signed 2's complement² number, this is -2147483648. Since that's negative, when it's converted to a 64-bit type, it'll be sign extended, so the top 32 bits will be filled with copies of what's in bit 31. That gives: 1111111111111111111111111111111110000000000000000000000000000000 (33 1-bits followed by 31 0-bits).

If we then treat that as an unsigned 64-bit number, we get 18446744071562067968.

§5.8/2: ^{The value of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are zero-filled. If E1 has an unsigned type, the value of the result is E1 × 2E2, reduced modulo one more than the maximum value representable in the result type. Otherwise, if E1 has a signed type and non-negative value, and E1×2E2 is representable in the corresponding unsigned type of the result type, then that value, converted to the result type, is the resulting value; otherwise, the behavior is undefined.}
In theory, the computer could use 1's complement or signed magnitude for signed numbers--but 2's complement is currently much more common than either of those. If it did use one of those, we'd expect a different final result.

edited Oct 16 '19 at 19:37

answered Apr 07 '14 at 06:10

Jerry Coffin

476,176
80
629
1,111

This makes sense. I wasn't sure if I was hitting undefined behavior or not. Thanks! I presume under a different compiler like msvc, I'd get something completely different? – No_name Apr 07 '14 at 06:11
@No_name: No guarantee, but chances are pretty decent that you'll just get what the hardware does either way, so two compilers on the same CPU will often produce the same result. – Jerry Coffin Apr 07 '14 at 06:15
You said "both plain `int`s" - it's only the type of `1` that matters; `1 << 31ull` is the same as `1 << 31`. – M.M Apr 07 '14 at 06:38
@MattMcNabb: Good point--although technically correct as it was, this was misleading. – Jerry Coffin Apr 07 '14 at 06:41
What does "reduced modulo one more than the maximum value representable in the result type" mean? – No_name Apr 07 '14 at 06:56
@No_name: On most real machines, it means they keep the low order bits that would fit (e.g., with 32-bit ints, they keep the 32 low-order bits of the result). – Jerry Coffin Apr 07 '14 at 06:58
strictly speaking it's undefined, but most implementations just apply sign extension and convert to unsigned. The result 18446744071562067968 = 0xFFFFFFFF80000000 – phuclv May 06 '14 at 05:58
You're assuming that `unsigned long int` is not 64 bits. Is that true of the architecture OP described? – Spencer Oct 16 '19 at 17:18
@Spencer: Quite the contrary, I assumed that `unsigned long int` *was* 64 bits, but `int` was 32 bits. But based on the results he showed, that's not so much an assumption as it is a conclusion based on the evidence in the question.`unsigned long int` had to be larger than 32 bits to represent the number: "18446744071562067968", but if `int` had been more than 32 bits, he'd have gotten the result he expected instead of what he actually did. – Jerry Coffin Oct 16 '19 at 17:32
@JerryCoffin OK, I see that now. But you might want to add those deductions about type size to your answer. – Spencer Oct 16 '19 at 17:42
@Spencer: I've added a bit more explanation about how things got from point A to point B. – Jerry Coffin Oct 16 '19 at 18:41

score 7 · Answer 2 · answered Apr 07 '14 at 06:08

The literal 1 with no U is a signed int, so when you shift << 31, you get integer overflow, generating a negative number (under the umbrella of undefined behavior).

Assigning this negative number to an unsigned long causes sign extension, because long has more bits than int, and it translates the negative number into a large positive number by taking its modulus with 2⁶⁴, which is the rule for signed-to-unsigned conversion.

score 3 · Answer 3 · answered Apr 07 '14 at 06:08

It's not "bizarre".

Try printing the number in hex and see if it's any more recognizable:

std::cout << std::hex << i << std::endl;

And always remember to qualify your literals with "U", "L" and/or "LL" as appropriate:

http://en.cppreference.com/w/cpp/language/integer_literal

unsigned long long l1 = 18446744073709550592ull;
unsigned long long l2 = 18'446'744'073'709'550'592llu;
unsigned long long l3 = 1844'6744'0737'0955'0592uLL;
unsigned long long l4 = 184467'440737'0'95505'92LLU;

score 1 · Answer 4 · answered Apr 07 '14 at 06:30

I think it is compiler dependent .
It gives same value
2147483648 2147483648
on my machiene (g++) .
Proof : http://ideone.com/cvYzxN

And if overflow is there , then because uwantsum is unsigned long int and unsigned values are ALWAYS positive , conversion is done from signed to unsigned by using (uwantsum)%2^64 .

Hope this helps !

score -2 · Answer 5 · answered Apr 07 '14 at 08:44

-2

Its in the way you printed it out. using formar specifier %lu should represent a proper long int

answered Apr 07 '14 at 08:44

Sks Sapare

37
1

c/c++ left shift unsigned vs signed

5 Answers5

Linked