What do these 3 lines of C code do by themselves, and what do they do altogether?

Question

int mystery(int x) {
  int mask = x >> 31;
  return (x ^ mask)
    + ~mask + 1L;

I believe the first line creates a mask from x, such that it is all 1s if the most significant bit is 1, and all 0s if the MSB is 0.

The second line XORs the mask with the original x, which flips all the bits if the mask is 1s, and does nothing if the mask is all 0s.

Then the third line adds the complement of the mask, and also adds 1L... this is where I don't understand.

So my question is, what does the 3rd line do specifically, particularly the 1L? And what does the entire function do to x?

@op... don't delete the question. Rolled back – pmg Oct 20 '19 at 09:10 — pmg, Oct 20 '19 at 09:10

S.S. Anne · Answer 1 · 2019-10-13T22:46:36.907

3

This is returning the absolute value of a number without branching on a two's complement machine. However, there is one important exception: if x is originally INT_MIN, this will return INT_MIN.

Let's take a number -3 as an example and step through this.

int mask = x >> 31; defines a variable called mask that is -1 (or all bit if x was originally negative (not portable!) and 0 otherwise. With x as -3 mask is -1.
(x ^ mask) is the value of x with all bits flipped if x was originally negative. The result of this expression is 2 if x was originally -3.
+ ~mask + 1L; adds the result of the above to 1 if x was negative or 0 otherwise and returns it. The result would be 3 if x was originally -3.

To explain this step further, let's consider when mask is -1. The ~ will flip all the bits to 0. After that, adding 1 causes this to, well, add 1 to the result.

Considering when mask is 0, the ~ flips all the bits to 1 (which is -1) and then adds 1, so the result would be no change.

Let's also step through the INT_MIN scenario:

mask is -1.
flip all bits of INT_MIN is INT_MAX.
adding 1 to INT_MAX is INT_MIN (and undefined behavior!!!).

Now let's see what happens with a positive x. Using 5 as an example:

mask is 0.
x is left unchanged.
x is again left unchanged.

Unless you're using a compiler that defines all of this behavior, this won't work. I highly suggest against using this.

edited Oct 13 '19 at 22:46

answered Oct 13 '19 at 22:01

S.S. Anne

15,171
8
38
76

`x` is signed, so `x>>31` on a negative number propagates the sign to the right 31 bits, resulting in a mask of 0xFFFFFFFF (assuming 32 bit ints). Basically, mask will be -1 for negative x and 0 for zero/positive x. – Bob Jarvis - Слава Україні Oct 13 '19 at 22:09
@BobJarvis It does? On what compiler? – S.S. Anne Oct 13 '19 at 22:10
Online GDB shows it that way. I'm dumping each of the sub-stage calculations, and `mask` is consistently 0xFFFFFFFF for negative `x`. – Bob Jarvis - Слава Україні Oct 13 '19 at 22:12
1

Your 3rd line is incorrect. If `mask` is all 1s or all 0s, `~mask` will be either -1 or 0, not 0 or 1 like you imply. See my answer. – Kevin Montambault Oct 13 '19 at 22:12
1

"3. adding 1 to `INT_MAX` is `INT_MIN`." - karma. :-) – Bob Jarvis - Слава Україні Oct 13 '19 at 22:35

Kevin Montambault · Answer 2 · 2019-10-13T22:14:09.463

0

The function returns the absolute value of a number.

(x ^ mask) will flip the bits of the number if mask is all 1's (the number is initially negative).

~mask + 1L; will evaluate to either -1 + 1 or 0 + 1 depending on if mask is all 0s or 1s respectively.

Putting it all together, this means that when the number is negative, we flip the bits and add 1. When the number is positive, we don't flip the bits and add 0 (keep it the same). This converts a negative number to a positive number because of how computers store negative numbers. You can read further about this by looking up "two's compliment"

edited Oct 13 '19 at 22:14

answered Oct 13 '19 at 22:09

Kevin Montambault

573
2
8

2

Except, no it doesn't. Consider passing -2147483648 (0x80000000). In this case mask becomes 0xFFFFFFFF, `x ^ mask` then gives 0x7FFFFFFF, `~mask` is 0, so the calculation returns 0x7FFFFFF + 0 + 1, which gives 0x8000000, or in other words it returns its original value. So it returns the absolute value of its argument EXCEPT when the input argument is 0x80000000, in which case it returns its argument. This is probably an unintended bug. :-) Also - tested on OnlineGDB, and after thinking about this for a second, it appears to be ignoring the overflow-to-negative. Hmmm... – Bob Jarvis - Слава Україні Oct 13 '19 at 22:22
@BobJarvis hmm nice catch. – Kevin Montambault Oct 13 '19 at 22:35
I'd love to claim that I saw through the code instantly and recognized the problem, but it was pure dumb luck. :-) – Bob Jarvis - Слава Україні Oct 13 '19 at 22:37

LSerni · Answer 3 · 2019-10-13T22:24:20.023

Assuming x is a 32-bit integer with the sign in the 32th (leftmost) bit,

 int mask = x >> 31;

"mask" holds now either -1 (if x was negative) or 0. If it is -1, its binary representation is 0xffffffff (all 1's).

 return (x ^ mask)

So x^mask is x XORed with either all 0s (remaining unchanged) or all 1's, including sign, which will invert its sign and map it to the positive value one less than the original. So 42 will remain 42, but -42 will become 41.

          + ~mask

not-mask will be 0 if x is negative, or all 1's, i.e. 0xffffffff (hence -1) if x was positive. Adding it, 42 will yield 41, and -42 (which was transformed to 41) will remain 41. Note that now in both cases we have the same number.

      + 1L;

Now, 42 (which was changed to 41) will become back 42, while -42 (which was 41) will become 42. The L is superfluous, as 1 alone will be taken as an int.

All in all, the function will return a value if it's positive, its opposite (i.e. its absolute value) if not.

NOTE: this function will obviously fail if its argument is the minimum (maximally negative) number, as that number has no positive representation in the same space as integers. So, mystery(-2147483647) will yield 2147483647 as expected, but mystery(-2147483648) will yield -2147483648.

Chances are very good that you'll fare better by using abs() instead of your mystery function, which could be an abs implementation for some architectures, while abs tends to be the best implementation for each architecture it's compiled for.

score 0 · Answer 4 · answered Oct 13 '19 at 22:27

I believe it's supposed to return the absolute value of its argument, but it has a bug: if the argument passed in is 0x8000000 then it returns its argument. Test code:

#include <stdio.h>

int mystery(int x) {
  printf("x=%d (0x%X)\n", x, x);

  int mask = x >> 31;

  printf("mask=%d (0x%X)\n", mask, mask);
  printf("(x ^ mask)=0x%X\n", x ^ mask);
  printf("~mask=0x%X\n", ~mask);

  int retval = (x ^ mask) + ~mask + 1L;

  printf("retval=%d (0x%X)\n\n", retval, retval);

  return retval;
}

int main()
{
mystery(1);
mystery(0);
mystery(-1);
mystery(0x80000000);
}

All I've done is break up the calculation and print the intermediate results. For the calls shown this produces:

x=1 (0x1)
mask=0 (0x0)
(x ^ mask)=0x1
~mask=0xFFFFFFFF
retval=1 (0x1)

x=0 (0x0)
mask=0 (0x0)
(x ^ mask)=0x0
~mask=0xFFFFFFFF
retval=0 (0x0)

x=-1 (0xFFFFFFFF)
mask=-1 (0xFFFFFFFF)
(x ^ mask)=0x0
~mask=0x0
retval=1 (0x1)

x=-656 (0xFFFFFD70)
mask=-1 (0xFFFFFFFF)
(x ^ mask)=0x28F
~mask=0x0
retval=656 (0x290)

x=-2147483648 (0x80000000)
mask=-1 (0xFFFFFFFF)
(x ^ mask)=0x7FFFFFFF
~mask=0x0
retval=-2147483648 (0x80000000)

What do these 3 lines of C code do by themselves, and what do they do altogether?

4 Answers4