
Update: I just rewrote the function in a new C source file on macOS:

#include <stdio.h>

int main() {
    int x = 0xffffffff;

    int m2 = (((((0x55 << 8) + 0x55) << 8) + 0x55) << 8) + 0x55;
    printf("m2 : 0x%x\n", m2);

    int m4 = (((((0x33 << 8) + 0x33) << 8) + 0x33) << 8) + 0x33;
    printf("m4 : 0x%x\n", m4);

    int m8 = (((((0x0f << 8) + 0x0f) << 8) + 0x0f) << 8) + 0x0f;
    printf("m8 : 0x%x\n", m8);

    int m16 = (0xff << 16) + 0xff;
    printf("m16 : 0x%x\n", m16);

    int p1 = (x & m2) + ((x >> 1) & m2);
    printf("p1 : 0x%x\n", p1);

    printf("p1 & m4 : 0x%x\n", p1 & m4);
    printf("p1 >> 2 : 0x%x\n", p1 >> 2);
    printf("(p1 >> 2) & m4 : 0x%x\n", (p1 >> 2) & m4);

    int p2 = (p1 & m4) + ((p1 >> 2) & m4);
    printf("p2 : 0x%x\n", p2);

    int p3 = (p2 & m8) + ((p2 >> 4) & m8);
    printf("p3 : 0x%x\n", p3);

    int p4 = (p3 & m16) + ((p3 >> 8) & m16);
    printf("p4 : 0x%x\n", p4);
    //int p4 = p3 + (p3 >> 8);

    int p5 = p4 + (p4 >> 16);
    printf("BigCount result is : 0x%x\n", p5 & 0xFF);
}

And the printed result is: [screenshot of the program output]

Everything is the same as on Ubuntu. This makes me even more confused.


When I run this function on macOS 10.12, it gives an unexpected answer. The input x is 0xffffffff (-1). The function is written in C:

int bitCount(int x) {

  int m2 = (((((0x55 << 8) + 0x55) << 8) + 0x55) << 8) + 0x55;
  int m4 = (((((0x33 << 8) + 0x33) << 8) + 0x33) << 8) + 0x33;
  int m8 = (((((0x0f << 8) + 0x0f) << 8) + 0x0f) << 8) + 0x0f;
  int m16 = (0xff << 16) + 0xff;

  int p1 = (x & m2) + ((x >> 1) & m2);
  int p2 = (p1 & m4) + ((p1 >> 2) & m4); //line 7
  int p3 = (p2 & m8) + ((p2 >> 4) & m8);
  int p4 = (p3 & m16) + ((p3 >> 8) & m16);
  //int p4 = p3 + (p3 >> 8);
  int p5 = p4 + (p4 >> 16);

  return p5 & 0xFF;
}

When I tracked all the local variables by printing them, I found that:

((p1 >> 2) & m4) //line7

printed the value '0x2222222' (seven '2's instead of eight). [screenshot of the macOS output]

This is unexpected: p1 prints as '0x2aaaaaaa' and m4 is '0x33333333', so the result should be 0x22222222 (eight '2's). However, when I run this on Ubuntu 16.04, everything is as expected and ((p1 >> 2) & m4) prints '0x22222222': [screenshot of the Ubuntu output]

If you run this on your Mac, do you see the same problem? Is there something different about macOS that causes it?

wuxue
  • Remember that `int` is *signed*. If you over- or underflow a signed value, you will have [*undefined behavior*](https://en.wikipedia.org/wiki/Undefined_behavior). Try using `unsigned` instead for the data type, or more explicitly `uint32_t`, or even a larger type like `uint64_t`. – Some programmer dude Aug 14 '18 at 16:02
  • Please JUST include the line that gives an unexpected result – Tim Randall Aug 14 '18 at 16:03
  • I also suggest you [learn how to debug your programs](https://ericlippert.com/2014/03/05/how-to-debug-small-programs/). Especially how to use a debugger to step through your code line by line while monitoring variables and their values. For this it also helps to break up complex operations into simpler ones, using temporary variables. For example `int p1 = (x & m2) + ((x >> 1) & m2);` could be `unsigned p1_1 = x & m2; unsigned p1_2 = x >> 1; unsigned p1_3 = p1_2 & m2; unsigned p1 = p1_1 + p1_3;`. Makes it easier to see when and where there is a wrong result. – Some programmer dude Aug 14 '18 at 16:07
  • @Someprogrammerdude I don’t think it’s overflow, it’s just right shifting – wuxue Aug 14 '18 at 16:07
  • a) Never use bit operations on signed values. b) Never use code you do not understand and just copied from somewhere. – SergeyA Aug 14 '18 at 16:09
  • @Someprogrammerdude I can't change the data types because the school homework interface restricts them – wuxue Aug 14 '18 at 16:09
  • I believe right-shifting signed values is implementation-defined. It may sign-extend or zero-extend. – Christian Gibbons Aug 14 '18 at 16:09
  • What is the point of the images? They're downright obnoxious. – Andrew Henle Aug 14 '18 at 16:09
  • @SergeyA I have to because this is school homework – wuxue Aug 14 '18 at 16:11
  • @ChristianGibbons it's not just the sign extension, but also the loss of the leftmost 4 bits – wuxue Aug 14 '18 at 16:13
  • Then you should explain to your teacher why you can't use bitwise right-shift on signed values, and also complain to the faculty about the subpar teaching. – SergeyA Aug 14 '18 at 16:14
  • Since the input is negative, the result of right shifting is implementation-defined. The `clang` compiler on macOS (even if disguised as `gcc`) may be doing things differently from `gcc` on Ubuntu — that's the compiler's prerogative. If you write non-portable code, you don't get portable results. – Jonathan Leffler Aug 14 '18 at 16:32
  • If you just want to count bits efficiently (rather than trying to fix this buggy code), see https://stackoverflow.com/questions/109023/how-to-count-the-number-of-set-bits-in-a-32-bit-integer . – Steve Summit Aug 14 '18 at 17:43 (a short sketch of that approach follows these comments)
  • "The input x is 0xffffffff (-1)." _discusses_ input. Better to post an [mcve] to insure that is true. Much about this post lacks clarity due to incomplete code. – chux - Reinstate Monica Aug 14 '18 at 19:12
  • "p1 prints '0x2aaaaaaa' ..." Code does _not_ print anything. AFAIK, the code used to print output is fouled. Posting compilable/run-able code is useful. – chux - Reinstate Monica Aug 14 '18 at 19:17
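
For reference, here is a minimal sketch of what the bit-counting link in the comments boils down to when the assignment's restrictions do not apply. It assumes GCC (Ubuntu) or Clang (macOS), both of which provide the `__builtin_popcount` intrinsic; this is an editorial illustration, not code from the thread:

#include <stdio.h>

int main(void) {
    unsigned x = 0xffffffffu;

    /* __builtin_popcount counts the set bits of an unsigned int.
       Both GCC and Clang support it and may compile it down to a
       single popcnt instruction on x86-64. */
    printf("popcount(0x%x) = %d\n", x, __builtin_popcount(x)); /* prints 32 */

    return 0;
}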

1 Answer


Per 6.5.7 Bitwise shift operators of the C standard (note the highlighted parts):

Constraints

2 Each of the operands shall have integer type.

Semantics

3 The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.

4 The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; *otherwise, the behavior is undefined*.

5 The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2^E2. *If E1 has a signed type and a negative value, the resulting value is implementation-defined.*
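
As a minimal illustration of paragraph 5 (an editorial sketch, not part of the original answer; it assumes a 32-bit int and shows the arithmetic shift that GCC and Clang both happen to use for negative signed operands):

#include <stdio.h>

int main(void) {
    int      s = -1;           /* bit pattern 0xffffffff on 32-bit two's complement */
    unsigned u = 0xffffffffu;

    /* Signed, negative left operand: the result is implementation-defined.
       GCC and Clang perform an arithmetic shift, giving 0xffffffff here. */
    printf("s >> 1 = 0x%x\n", (unsigned)(s >> 1));

    /* Unsigned left operand: always a logical shift, guaranteed to be 0x7fffffff. */
    printf("u >> 1 = 0x%x\n", u >> 1);

    return 0;
}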

You are bit-shifting signed integer values. Depending on your inputs, you are relying on both undefined and implementation-defined behavior.

Different results on different platforms are to be expected.
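
A minimal sketch of one way to sidestep the implementation-defined shifts while keeping the `int bitCount(int x)` interface: do all the intermediate arithmetic on an `unsigned` copy of `x`, as the comments also suggest. This is an illustrative rewrite under that assumption, not the code from the question or the original answer (and it ignores any operator or constant restrictions the assignment may impose):

int bitCount(int x) {
  /* Work on an unsigned copy so every right shift below is a
     well-defined logical shift and no intermediate value is negative. */
  unsigned u = (unsigned) x;

  unsigned m2  = (((((0x55u << 8) + 0x55u) << 8) + 0x55u) << 8) + 0x55u; /* 0x55555555 */
  unsigned m4  = (((((0x33u << 8) + 0x33u) << 8) + 0x33u) << 8) + 0x33u; /* 0x33333333 */
  unsigned m8  = (((((0x0fu << 8) + 0x0fu) << 8) + 0x0fu) << 8) + 0x0fu; /* 0x0f0f0f0f */
  unsigned m16 = (0xffu << 16) + 0xffu;                                  /* 0x00ff00ff */

  unsigned p1 = (u & m2)   + ((u  >> 1) & m2);   /* pairwise bit sums   */
  unsigned p2 = (p1 & m4)  + ((p1 >> 2) & m4);   /* sums per 4-bit group */
  unsigned p3 = (p2 & m8)  + ((p2 >> 4) & m8);   /* sums per byte        */
  unsigned p4 = (p3 & m16) + ((p3 >> 8) & m16);  /* sums per 16 bits     */
  unsigned p5 = p4 + (p4 >> 16);                 /* total                */

  return (int)(p5 & 0xFFu);
}

With `unsigned` operands the whole computation is fully defined by the standard, so an input of 0xffffffff should yield 32 on both macOS and Ubuntu.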

Andrew Henle
  • Do you mean the variable m4 involves undefined behavior? – wuxue Aug 14 '18 at 16:20
  • @wuxue After a quick look at the code you posted, I don't think offhand that the `m2`, `m4`, `m8`, or `m16` values involve undefined or implementation-defined behavior, because none of the inputs there have an actual negative value and I don't think you overflow. But the right-shift operations, such as `(x >> 1)`, are definitely going to be implementation-defined if you pass in an `int` value that's negative. And I suspect some of your intermediate `p1`, `p2`, etc. results may wind up as negative values, which also leads to implementation-defined behavior. A simple test would be to use `unsigned`. – Andrew Henle Aug 14 '18 at 16:30
  • Given that the input value to the function is 0xFFFFFFFF or -1, there is implementation-defined behaviour at work. – Jonathan Leffler Aug 14 '18 at 16:33
  • @JonathanLeffler *there is implementation-defined behaviour at work.* Agreed. I haven't done a thorough enough analysis of the posted code to rule out UB, though. If you have, feel free to edit or post your own analysis. – Andrew Henle Aug 14 '18 at 16:35
  • My analysis stopped at: '`x` is negative — therefore, there are portability problems'. I also note that when I run `/usr/bin/gcc --version` on a Mac, I get output like `Apple LLVM version 9.1.0 (clang-902.0.39.2)` — `Target: x86_64-apple-darwin17.7.0` indicating that it is Clang in disguise. I personally use a 'real' GCC that I've compiled (currently, GCC 8.2.0) most of the time. The values in `m2`, `m4`, `m8`, `m16` do not self-evidently overflow (though why they aren't simply written as 0x55555555 etc. eludes me). The `(x >> 1)` term invokes implementation-defined behaviour — else OK. – Jonathan Leffler Aug 14 '18 at 16:47
  • And of course it's all a tremendous waste of time, because bit counting is (or ought to be) a solved problem. – Steve Summit Aug 14 '18 at 17:40
  • Thanks. I just copied the function into a new C source file and printed everything on macOS; the results are all the same as in Ubuntu. I will update the question. @J – wuxue Aug 15 '18 at 01:25