arithmetic right shift shifts in 0s when MSB is 1

Question

As an exercise I have to write the following function:
multiply x by 2, saturating to Tmin / Tmax if overflow, using only bit-wise and bit-shift operations.
Now this is my code:

// xor MSB and 2nd MSB. if diferent, we have an overflow and SHOULD get 0xFFFFFFFF. otherwise we get 0.
int overflowmask = ((x & 0x80000000) ^ ((x & 0x40000000)<<1)) >>31;
                                                             // ^ this arithmetic bit shift seems to be wrong
// this gets you Tmin if x < 0 or Tmax if x >= 0
int overflowreplace = ((x>>31)^0x7FFFFFFF);

// if overflow, return x*2, otherwise overflowreplace
return ((x<<1) & ~overflowmask)|(overflowreplace & overflowmask);

now when overflowmask should be 0xFFFFFFFF, it is 1 instead, which means that the arithmetic bit shift >>31 shifted in 0s instead of 1s (MSB got XORed to 1, then shifted to the bottom).
x is signed and the MSB is 1, so according to C99 an arithmetic right shift should fill in 1s. What am I missing?

EDIT: I just guessed that this code isn't correct. To detect an overflow it suffices for the 2nd MSB to be 1.
However, I still wonder why the bit shift filled in 0s.

EDIT:
Example: x = 0xA0000000

x & 0x80000000 = 0x80000000 
x & 0x40000000 = 0 
XOR => 0x80000000 
>>31 => 0x00000001

EDIT:
Solution:

int msb = x & 0x80000000;
int msb2 = (x & 0x40000000) <<1;
int overflowmask = (msb2 | (msb^msb2)) >>31;
int overflowreplace = (x >>31) ^ 0x7FFFFFFF;
return ((x<<1) & ~overflowmask) | (overflowreplace & overflowmask);

Bitwise operations using *signed* values are always going to be a problem. — Some programmer dude, Jan 09 '20 at 11:17
so I just need to come up with a different solution because c doesn't work as it states? — Duke, Jan 09 '20 at 11:18
Can you please tell us some value of `x` where it doesn't work as you expect? And have you tried to divide the complex expressions you have into smaller and more easily debugable expressions to see that each small sub-expression gives the value you expect? — Some programmer dude, Jan 09 '20 at 11:29
@4386427 I didn't really mean that with c not working, I was just desperate :D But what should that link tell me? you only print x? — Duke, Jan 09 '20 at 11:46
What's happening is that the `0x80000000` in `x & 0x80000000` becomes an *unsigned* integer, making the whole expression unsigned. *Or* `0x80000000` is a `long` or `long long` value, which means it's not negative. So you're shifting an unsigned or otherwise non-negative number, which means it's filled up with zeros instead of ones. I'm trying to find a reference or quote about this behavior for an answer. — Some programmer dude, Jan 09 '20 at 11:52
@Some programmer dude Thank you! Is that because the constant 0x80000000 is compiled as an unsigned int? — Duke, Jan 09 '20 at 11:57
It's either an `unsigned int` *or* a 64-bit signed integer (`long` or `long long`, depending on platform and compiler). — Some programmer dude, Jan 09 '20 at 11:58
@Duke @Someprogrammerdude The type of a hexadecimal (or octal) integer constant with no suffix is the first of `int`, `unsigned int`, `long int`, `unsigned long int`, `long long int`, `unsigned long long int` in which its value can be represented. So, the type of `0x80000000` is `int` if that value can be represented as an `int`, otherwise `unsigned int` if that value can be represented as an `unsigned int`, and so on. — Lxer Lx, Jan 09 '20 at 15:17
regarding: *xor MSB and 2nd MSB. if different, we have an overflow* This is not true. As an example, -0 is 0x80000000 which MSB and 2nd MSB are different, but is not an overflow. the same holds for 0x04000000 and many other values — user3629249, Jan 10 '20 at 04:06
the problem your asking about is indicating a mis understanding. All bit shift operations must be unsigned values, otherwise when working with negative numbers, undefined behavior results — user3629249, Jan 10 '20 at 04:11

score 3 · Answer 1 · answered Jan 09 '20 at 11:34

Even on twos-complement machines, the behaviour of right-shift (>>) on negative operands is implementation-defined.

A safer approach is to work with unsigned types and explicitly OR-in the MSB.

While you're at it, you probably also want to use fixed-width types (e.g. uint32_t) rather than failing on platforms that don't meet your expectations.

score 1 · Accepted Answer · answered Jan 09 '20 at 11:56

0x80000000 is treated as an unsigned number which causes everything to be converted to unsigned, You can do this:

// xor MSB and 2nd MSB. if diferent, we have an overflow and SHOULD get 0xFFFFFFFF. otherwise we get 0.
int overflowmask = ((x & (0x40000000 << 1)) ^ ((x & 0x40000000)<<1)) >>31;

// this gets you Tmin if x < 0 or Tmax if x >= 0
int overflowreplace = ((x>>31)^0x7FFFFFFF);

// if overflow, return x*2, otherwise overflowreplace
return ((x<<1) & ~overflowmask)|(overflowreplace & overflowmask);

OR write the constants in negative decimals

OR I would store all the constants in const int variables to have them guaranteed signed.

Still the right shift remains implementation-defined and is not guaranteed to be arithmetic shift. — Lundin, Jan 09 '20 at 12:05
A slightly shorter solution is here: https://stackoverflow.com/a/29285223/378360 — Falk Hüffner, Jan 10 '20 at 21:57

score 1 · Answer 3 · answered Jan 09 '20 at 12:04

1

Never use bit-wise operands on signed types. In case of right shift on signed integers, it is up to the compiler if you get an arithmetic or a logical shift.

That's only one of your problems though. When you use a hex integer constant 0x80000000, it is actually of type unsigned int as explained here. This accidentally turns your whole expression (x & 0x80000000) ^ ... into unsigned type because of the integer promotion rule known as "the usual arithmetic conversions". Whereas the 0x40000000 expression is signed int and works as (the specific compiler) expected.

Solution:

All variables involved must be of type uint32_t.
All hex constants involved must be u suffixed.
To get something arithmetic shift portably, you would have to do
(x >> n) | (0xFFFFFFFFu << (32-n)) or some similar hack.

answered Jan 09 '20 at 12:04

Lundin

195,001
40
254
396

Hex literals are unsigned *only if* too big for `int`. Whilst it appears that the question assumes `int` is a 32-bit 2s-complement type, we answerers ought to be a bit more circumspect about making such assumptions! – Toby Speight Jan 09 '20 at 14:01
@TobySpeight It doesn't really matter because even if int is 16 bit, the hex literals would end up long/unsigned long respectively after the very same rules. – Lundin Jan 09 '20 at 14:26
It does matter, if `int` is 33 bits or more. – Toby Speight Jan 09 '20 at 15:02
@TobySpeight It is not, so you don't need to worry. If my code breaks on a highly fictional system that uses > 32 bit ints, I'm happy. The solution is not to fix the code, but to not use that compiler in the first place. You can also add `_Static_assert(sizeof(int) < 5, "Stop using crap.");`. – Lundin Jan 09 '20 at 15:06
Ah, there's the difference. When my code breaks, it makes me *un*happy. – Toby Speight Jan 09 '20 at 15:09
@TobySpeight You better not compile this on a 65 bit CPU with the `crapc -std=garage` compiler then... – Lundin Jan 09 '20 at 15:18

arithmetic right shift shifts in 0s when MSB is 1

3 Answers3