3

This question is inspired by other questions from StackOverflow. Today, while browsing StackOverflow, I've come across an issue of bitshifting a variable by a value k that is >= the width of that variable in bits. This means shifting a 32-bit int by 32 or more bit positions.

Left shift an integer by 32 bits

Unexpected C/C++ bitwise shift operators outcome

From these questions, it is obvious that if we attempt to shift a number by k bits that are >= the bit width of the variable, only the least-significant log2k bits are taken. For a 32-bit int, the least significant 5 bits are masked and taken to be the shift amount.

So in general, if w = width of the variable in bits, x >> k becomes x >> (k % w) For an int, this is x >> (k % 32).

The count is masked to five bits, which limits the count range to 0 to 31.

So I've written a small little program to observe the behavior that should theoretically be produced. I've written in comments the resulting shift amount % 32.

#include <stdio.h>
#include <stdlib.h>

#define PRINT_INT_HEX(x) printf("%s\t%#.8x\n", #x, x);

int main(void)
{
    printf("==============================\n");
    printf("Testing x << k, x >> k, where k >= w\n");

    int      lval = 0xFEDCBA98 << 32;
    //int      lval = 0xFEDCBA98 << 0;

    int      aval = 0xFEDCBA89 >> 36;
    //int      aval = 0xFEDCBA89 >> 4;

    unsigned uval = 0xFEDCBA89 >> 40;
    //unsigned uval = 0xFEDCBA89 >> 8;

    PRINT_INT_HEX(lval)
    PRINT_INT_HEX(aval)
    PRINT_INT_HEX(uval)

    putchar('\n');

    return EXIT_SUCCESS;
}

And the output does not match the expected behavior of the shift instructions!

==============================
Testing x << k, x >> k, where k >= w
lval    00000000 
aval    00000000
uval    00000000

=====================================================================

Actually I was a bit confused with Java. In C/C++, shifting an int by a number of bits greater than the bit width, could possibly be reduced k % w, but this is not guaranteed by the C standard. There is no rule which says that this kind of behavior should happen all the time. It is undefined behavior.

However, this is the case in Java. This is a rule of the Java programming language.

Bitshift operators description in Java language specification

Weird result of Java Integer left shift

java : shift distance for int restricted to 31 bits

Galaxy
  • 2,363
  • 2
  • 25
  • 59
  • 1
    Well, I can't speak to what the language specs say should happen, but I always think about shifting as removing a bit from one side and inserting a `0` at the other side. So, using this logic, shifting by a number of bits greater than or equal to the width of your variable would certainly end up producing all zeroes. The questions you link state the behavior is undefined, so all bets are off. – Stephen Docy Jul 03 '18 at 01:56
  • 3
    The question you linked to cites the standard: The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand. – Mark Plotnick Jul 03 '18 at 01:57
  • Well, if 36 % 32 == 4, then it should shift the variable by 4 bytes, coming nowhere close to zeroing out the variable. – Galaxy Jul 03 '18 at 01:57
  • This is undefined behavior, everything can happen. Now [bitshift](https://c9x.me/x86/html/file_module_x86_id_285.html) on x86 are divisions, so you will end up with 0s. – KamilCuk Jul 03 '18 at 01:58
  • 1
    Where are you getting the idea of reducing the number of positions shifted via modulo? I don't see that in either of the accepted answers you reference. – Stephen Docy Jul 03 '18 at 02:01
  • "it is obvious that if we attempt to shift a number by k bits that are >= the bit width of the variable, only the least-significant log2k bits are taken"... No, it is not "obvious". On the contrary, it is simply incorrect. Read the linked answers carefully. There's no `log2` there and there's no `36 % 32` there. The behavior is simply undefined. End of story. – AnT stands with Russia Jul 03 '18 at 02:06
  • @KamilCuk: Hardware instructions do not define C implementations. The C implementation may use other instructions and combinations thereof to achieve different effects. And the page you link to does not say that shifting by more than the width will yield zero. It says “The count is masked to 5 bits, which limits the count range to 0 to 31.” – Eric Postpischil Jul 03 '18 at 02:34
  • Possible duplicate of [What does the C standard say about bitshifting more bits than the width of type?](https://stackoverflow.com/questions/11270492/what-does-the-c-standard-say-about-bitshifting-more-bits-than-the-width-of-type) – phuclv Jul 03 '18 at 03:16
  • Using `%x` to print an `int` adds an unnecessary compilation. – chux - Reinstate Monica Jul 03 '18 at 03:22
  • "For a 32-bit int, the least significant 5 bits are masked and taken to be the shift amount." --> C does not specify that a mask must occur. That is part of the UB about this. C11 §6.5.7 3 – chux - Reinstate Monica Jul 03 '18 at 03:25
  • 1
    What does Java have to do with anything? The last part of the question just seems to derail the topic. – Lundin Jul 03 '18 at 06:44

2 Answers2

6

The linked questions specifically state that shifting by an amount greater than the bit width of the type being shifted invokes undefined behavior, which the standard defines as "behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements"

When you invoke undefined behavior, anything can happen. The program may crash, it may output strange results, or it may appear to work properly. Also, how undefined behavior manifests itself can change if you use a different compiler or different optimization settings on the same compiler.

The C standard states the following regarding the bitwise shift operators in section 6.5.7p3:

The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.

In this case it's possible that the compiler could reduce the amount to shift modulo the bit width, as you suggested, or it could treat it as mathematically shifting by that amount resulting in all bits being 0. Either is a valid result because the standard does not specify the behavior.

dbush
  • 205,898
  • 23
  • 218
  • 273
3

One reason for the undefined-ness is that the 8086, the original x86, did not mask any bits off the shift count. It instead literally performed the shifts, using one clock tick per position.

Intel then realized that allowing 255+ clock ticks for a shift instruction perhaps wasn't such a good idea. They probably considered maximum interrupt response time, for example.

From my old 80286-manual:

To reduce the maximum execution time, the iAPX 286 does not allow shift counts greater than 31. If a shift count greater than 31 is attempted, only the bottom five bits of the shift count are used. The iAPX 86 uses all 8 bits of the shift count.

That gave you different results for the exact same program on a PC/XT and a PC/AT.

So what should the language standard say?

Java solved this by not using the underlying hardware. C instead chose to say that the effect is unclear.

Bo Persson
  • 90,663
  • 31
  • 146
  • 203
  • For the 80386 to have been analogous to the 80286, it should have looked at the bottom *six* bits when shifting a 32-bit value, since shifting by precisely the word size is sometimes a useful corner case. – supercat Jul 03 '18 at 14:54
  • IIRC, The x186 did mask the shift count and the x86 did not. One way to detect the difference at run time. – chux - Reinstate Monica Jul 03 '18 at 21:03
  • Save your google time - IIRC is for "if I recall correctly", not a computer model name. – aafulei Jul 06 '21 at 00:39