Is right shifting undefined behavior for negative number in cpp and in java?

Question

To optimize my cpp code, I'm trying to use Right Shifting in some case. Here is an example:

int main()
{
    int i = (1 - 2) >> 31;  // sizeof(int) == 4
    ...
    ...
}

I've printed the i and I got -1. It means that it will use 1 instead of 0 to fill in the empty position if the number is negative. In other words, -1 >> 31 works as below:

1111...1    <--- the result of (1 - 2), which is -1
1111...1    <--- -1 >> 31, 1 is used to fill in the empty position

I just want to know if this behavior is clearly defined or not?

If it is UB in cpp, how about in Java?

Possible duplicate of [Is left-shifting (<<) a negative integer undefined behavior in C++11?](https://stackoverflow.com/questions/19593938/is-left-shifting-a-negative-integer-undefined-behavior-in-c11) — user0042, Aug 01 '17 at 05:49
In Java, definitely not undefined. `>>` does sign-extension, `>>>` does not. See [The Java™ Tutorials - Bitwise and Bit Shift Operators](https://docs.oracle.com/javase/tutorial/java/nutsandbolts/op3.html), or the definitive source, the Java Language Specification, [15.19. Shift Operators](https://docs.oracle.com/javase/specs/jls/se8/html/jls-15.html#jls-15.19). — Andreas, Aug 01 '17 at 05:49
*"how about in Java?"* - Why stop at Java, ask about it in C# and Rust as well. — StoryTeller - Unslander Monica, Aug 01 '17 at 05:54

score 1 · Accepted Answer · answered Aug 01 '17 at 05:50

1

Yes. It is implementation-defined.

According to C++03 5.8/3 which defines right-shifting:

The value of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 divided by the quantity 2 raised to the power E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined.

For more information, see this link.

answered Aug 01 '17 at 05:50

msc

33,420
29
119
214

https://stackoverflow.com/questions/11227809/why-is-it-faster-to-process-a-sorted-array-than-an-unsorted-array Could you read the fifth comment of the accepted answer in this question? I don't quite understand what he meant... IB or UB ... – Yves Aug 01 '17 at 05:56
@Yves If you mean the comment that says "Right-shifting a negative signed integer is IB", there isn't much one could add to it to make it clearer. – molbdnilo Aug 01 '17 at 09:16
@molbdnilo For now, my understanding is: don't shift any bit to overwrite the sign bit because it is UB, which means that Left Shifting a big signed integer is dangerous. Whereas Right Shifting is IB... – Yves Aug 01 '17 at 09:29
@Yves - It is dangerous not because it is UB, but because the result may vary between different systems. – Bo Persson Aug 01 '17 at 09:49

score 1 · Answer 2 · answered Aug 01 '17 at 06:30

By default it is signed int. The range is -32767 to 32767, bit wise range -111111111111111 to +111111111111111 the very first bit on the left acts as negative or positive indicator. And all the arithmetic operation will be done in 2's complement method. In general negative int are represents in two complements method i.e take you example how -1 is represent.

4 Bytes = 32 bits
0000 0000 0000 0000 0000 0000 0000 0000
how represent 1
0000 0000 0000 0000 0000 0000 0000 0001
Then we invert the digits. 0 becomes 1, 1 becomes 0.
1111 1111 1111 1111 1111 1111 1111 1110
Then we add 1.
1111 1111 1111 1111 1111 1111 1111 1111
This is how -1 is represented

The right-shift of a negative number is defined to shift in 1s to the highest bit positions, then on a 2s complement representation it will behave as an arithmetic shift - the result of right-shifting by N will be the same as dividing by 2N, rounding toward negative infinity. So shifting of -1 is -1 now take an other number For example, if you have the 8-bit 2s complement binary number let represent -3

    0000 0011
    Then we invert the digits.
    1111 1100
    Then we add 1. 
    1111 1101

11111101 representing -3 in decimal, and you perform an arithmetic right shift by 1 to give 11111110 representing -2 in decimal, this is the same as dividing -3 by 2^1, giving -1.5 which rounds towards negative infinity resulting in -2.

Nice answer for a newbie ... and just for the record: don't forget about accepting answers ... even works for questions that got closed on you. And happy upvoting, now that you reached that level ;-) — GhostCat, Sep 04 '17 at 08:46

score 0 · Answer 3 · edited Jun 20 '20 at 09:12

0

In Java, behavior of >> is well-defined for negative numbers (see below).

In C++, behavior of >> is undefined for negative numbers (see answer by rsp).

Quoting Java Language Specification, §15.19. Shift Operators:

The value of n >> s is n right-shifted s bit positions with sign-extension. The resulting value is floor(n / 2^s). For non-negative values of n, this is equivalent to truncating integer division, as computed by the integer division operator /, by two to the power s.

The value of n >>> s is n right-shifted s bit positions with zero-extension, where:

If n is positive, then the result is the same as that of n >> s.

If n is negative and the type of the left-hand operand is int, then the result is equal to that of the expression (n >> s) + (2 << ~s).

If n is negative and the type of the left-hand operand is long, then the result is equal to that of the expression (n >> s) + (2L << ~s).

edited Jun 20 '20 at 09:12

Community

1
1

answered Aug 01 '17 at 05:53

Andreas

154,647
11
152
247

For C++ the behavior *is* defined, but the definition is allowed to vary between different systems. Supposedly it does what the hardware does. – Bo Persson Aug 01 '17 at 09:47
@BoPersson Which means that if you don't know where your code will run, it's undefined, but I get your distinction. – Andreas Aug 01 '17 at 13:55
@Andreas: Undefined Behavior is interpreted by modern compilers as an invitation to behave nonsensically in the name of "optimization", which is totally different from yielding an Implemnentation-Defined or Unspecified value. For example, given `unsigned mulMod65536(unsigned short x, unsigned short y) { return (x*y) & 0xFFFF;}`, one might not see any reason that anything weird should happen if `x*y` exceeds `0x7FFFFFFF`, but in gcc such a function will sometimes have weird side-effects if `x*y` doesn't fit within `int`. Implementation-Defined allows no such license. – supercat Aug 01 '17 at 14:28
@supercat "Undefined" simply means that the specification/API doesn't define/specify the behavior. It doesn't (necessarily) mean random/arbitrary, just that an implementation can do what it likes, and is free to change what it likes at any time. Which means that *we*, as users of that functionality, cannot rely on any particular outcome. Question is without implementation context, and without knowing the exact implementation, "Implementation-Defined" effectively means the same as "undefined", i.e. you have no idea what will happen, hence `>>` is *undefined* for the purpose of this question. – Andreas Aug 01 '17 at 16:43
@Andreas: While some sane compilers will recognize that it makes sense to behave "...in a documented manner characteristic of the environment" [one of the suggested typical outcomes of UB listed in the Standard] when the underlying platform has a "natural" behavior, "modern" compilers can't be relied upon to do so. Evaluation of `(i*65535) & 65535u` within `for(int i=32767; i – supercat Aug 01 '17 at 19:58

Is right shifting undefined behavior for negative number in cpp and in java?

3 Answers3