Absolute value abs(x) using bitwise operators and Boolean logic

Question

How does this work?

The idea is to make abs(x) use bitwise operators for integers (assuming 32 bit words):

y = x >> 31
(x + y) ^ y // This gives abs(x) (is ^ XOR)?

Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/209314/discussion-on-question-by-beta-me-me-beta-absolute-value-absx-using-bitwise-op). — Samuel Liew, Mar 09 '20 at 14:24
@cmaster-reinstatemonica actually intention was to learn, how it works — beta_me me_beta, Mar 09 '20 at 14:26
If you like this kind of bit twiddling, the book [Hacker's Delight](https://www.amazon.com/dp/0321842685) (2nd ed) is full of these kinds of bitwise manipulations. (The term "hacker" is used in the older, positive sense.) — Eljay, Mar 09 '20 at 14:28
@beta_meme_beta "intention was to learn, how it works" --> also learn, how it does not work well and how it does not work. — chux - Reinstate Monica, Mar 09 '20 at 14:33
@cmaster-reinstatemonica I'm curious, *which* ISA in common use today has an *integer* absolute value instruction in non-vector registers? Not x86, not ARM. Power only in SPE, xtensa is not exactly mainstream. Other than that, it's floating-point or vector only. Mainly because compilers can generate code that corresponds *exactly* to the OP's code, so the instruction you imagine the CPU to have is not required. — EOF, Mar 09 '20 at 22:11
[get absolute value without using abs function nor if statement](https://stackoverflow.com/q/9772348/995714), https://graphics.stanford.edu/~seander/bithacks.html#IntegerAbs — phuclv, Mar 10 '20 at 02:04
For reference, I have profiled (the similar version) `(x^y)-y` as being faster than `std::abs(x)` on x86-64. — geometrian, Mar 10 '20 at 05:19
@EOF Indeed, I was thinking about the vector instructions. But even X86 has both a negation (like virtually all ISAs) and conditional moves. My main point is: If you know something that's faster than `a >= 0 ? a : -a`, tell your compiler people about it, and they'll change its optimization. — cmaster - reinstate monica, Mar 10 '20 at 07:04
@cmaster-reinstatemonica `std::abs(...)` has native overloads [for floating-point](https://en.cppreference.com/w/cpp/numeric/math/fabs) and [for integral](https://en.cppreference.com/w/cpp/numeric/math/abs) types. — geometrian, Mar 10 '20 at 07:36
@imallett Ah, ok. I take back everything I said and claim the opposite :-) Somehow, those integer overloads escaped my notice, even though I looked for them :-( — cmaster - reinstate monica, Mar 10 '20 at 08:11

score 59 · Accepted Answer · edited Mar 16 '20 at 13:59

59

Assuming 32-bit words, as stated in the question:

For negative x, x >> 31 is implementation-defined in the C and C++ standards. The author of the code expects two’s complement integers and an arithmetic right-shift, in which x >> 31 produces all zero bits if the sign bit of x is zero and all one bits if the sign bit is one.

Thus, if x is positive or zero, y is zero, and x + y is x, so (x + y) ^ y is x, which is the absolute value of x.

If x is negative, y is all ones, which represents −1 in two’s complement. Then x + y is x - 1. Then XORing with all ones inverts all the bits. Inverting all the bits is equivalent to taking the two’s complement and subtracting one, and two’s complement is the method used to negate integers in two’s complement format. In other words, XORing q with all ones gives -q - 1. So x - 1 XORed with all ones produces -(x - 1) - 1 = -x + 1 - 1 = -x, which is the absolute value of x except when x is the minimum possible value for the format (−2,147,483,648 for 32-bit two’s complement), in which case the absolute value (2,147,483,648) is too large to represent, and the resulting bit pattern is just the original x.

edited Mar 16 '20 at 13:59

Konrad Rudolph

530,221
131
937
1,214

answered Mar 09 '20 at 14:18

Eric Postpischil

195,579
13
168
312

So what actually happens here is y becomes -1 i.e. arithmetic shift when x<0, I got it. – beta_me me_beta Mar 09 '20 at 14:32
3

@beta *IF* your implementation of C or C++ adheres to two's complement integers and an arithmetic right-shift. Pre-C++20, integers only were required to represent all values between `-(2^(N-1) - 1)` to `+(2^(N-1) - 1)`, or one's complement. Most compilers did two's complement, because they're not monsters, but it was possible. The C++20 standard requires integers to be two's complement. As Eric says above, right shift on a negative integer is implementation defined, but *likely* arithmetic right shift. – JohnFilleau Mar 09 '20 at 15:18
3

Actually, the type of Y is not specified, you missed a very important part of this algorithm, which is "if Y is 32 bit unsigned, this still works". That's because it uses creative mappings of bits, leveraging the definition of 2's complements numbers, which works without regard to the sign-ness of the system. It's an old Numberical Recipies trick, which actually (at that time) improved some aspects of portability, in the kinds of ways we now wish nobody "improved" in this manner. – Edwin Buck Mar 09 '20 at 15:49
For others who wonder how I noticed it was a numerical recipies approach, the heavy reuse of registers is a good hint. Such bit mongering was common practice in the mostly assembly only days of programming, and a lot of optimizations like these were ported into C, which then had them spill over to C++ / Java / etc. Today, these techniques are mostly wrapped under nice API calls, but occasionally you'll see someone "optimize" in a solution like this one. – Edwin Buck Mar 09 '20 at 15:54
@John: ISO C, and C++ before 20, allow 3 choices of signed integer representation: 2's complement, 1's complement, and sign/magnitude. Does C++20 also require arithmetic right shifts, or did they still leave the language woefully under-standardized? Omitting bit-scan and popcount are bad enough, but no fully standard way to get an arithmetic right shift is just ridiculous when every(?) mainstream compiler makes the sane choice. – Peter Cordes Mar 09 '20 at 18:49
3

@EdwinBuck: yup, these days optimizing compilers know this trick, and will use it when appropriate if you write `unsigned absx = x<0 ? -x : x;` (Or something better that avoids signed-overflow UB in `-x` for the most-negative 2's complement number). – Peter Cordes Mar 09 '20 at 18:51

Aykhan Hagverdili · Answer 2 · 2020-03-09T14:27:49.613

18

This approach relies on many implementation specific behavior:

It assumes that x is 32 bits wide. Though, you could fix this by x >> (sizeof(x) * CHAR_BIT - 1)
It assumes that the machine uses two's complement representation.
the right-shift operator copies the sign bit from left to right.

Example with 3 bits:

101 -> x = -3
111 -> x >> 2

101 + 111 = 100 -> x + y

100 XOR 111 -> 011 -> 3

This is not portable.

edited Mar 09 '20 at 14:27

answered Mar 09 '20 at 14:06

Aykhan Hagverdili

28,141
6
41
93

2

I am not one of them, but this doesn't answer OP questions. Only comments do – solid.py Mar 09 '20 at 14:09
Not my DV, and I haven't read the C++20 standard, but under C (also currently tagged), the right shifting of a negative signed value is implementation-defined, so there are other issues with this approach. – Andrew Henle Mar 09 '20 at 14:09
@AndrewHenle I'll mention that. – Aykhan Hagverdili Mar 09 '20 at 14:10
C++20 may mandate twos complement, but not mandate a 32-bit `int`. [And, no, I didn't downvote, but I'm unconvinced this answers the question] – Peter Mar 09 '20 at 14:10
@AndrewHenle There is no point in answer, no one seems to know if this is standard behavior or not. all I am seeing is contradicting comments. – solid.py Mar 09 '20 at 14:10
3

In the OP question, the `xor` operation is with `y` (not `x`) – BiagioF Mar 09 '20 at 14:11
@hitter: [this comment](https://stackoverflow.com/questions/60601908/absolute-value-absx-using-bitwise-operators-and-boolean-logic/60602190?noredirect=1#comment107215040_60601908) is authoritative. – Robert Harvey Mar 09 '20 at 14:11
3

The code does not assume `int` is 32 bits because there is no `int` in the question at all. And it explicitly states the words it is working with, whatever type they are, are 32 bits. – Eric Postpischil Mar 09 '20 at 14:26
1

@EricPostpischil Thank you for the feedback. Is this rephrasing better? – Aykhan Hagverdili Mar 09 '20 at 14:28
Nice answer. Maybe worth to mention that CPUS may offer [logical shift](https://en.wikipedia.org/wiki/Logical_shift)(not preserving the sign bit) and [arithmetic shifts](https://en.wikipedia.org/wiki/Arithmetic_shift) preserving sign bit and C++ standard leaves it open which one is used, which make the result implementation defined for negatives. Maybe worth to note that [MISRA 12.7](https://www.misra.org.uk/forum/viewtopic.php?t=562) (safe coding) forbids shifting signed numbers because of this uncertainty – Christophe Mar 09 '20 at 23:00

Edwin Buck · Answer 3 · 2020-03-09T19:31:39.633

This isn't portable, but I'll explain why it works anyway.

The first operation exploits a trait of 2's complement negative numbers, that the first bit if 1 if negative, and 0 if positive. This is because the numbers range from

The example below is for 8 bits, but can be extrapolated to any number of bits. In your case it's 32 bits (but 8 bits displays the ranges more easily)

10000000 (smallest negative number)
10000001 (next to smallest)
...
11111111 (negative one)
00000000 (zero)
00000001 (one)
...
01111110 (next to largest)
01111111 (largest)

Reasons for using 2's complement encoding of numbers come about by the property that adding any negative number to it's positive number yields zero.

Now, to create the negative of a 2's complement number, you would need to

Take the inverse (bitwise not) of a the input number.
Add one to it.

The reason the 1 is added to it is to force the feature of the addition zeroing the register. You see, if it was just x + ~(x), then you would get a register of all 1's. By adding one to it, you get a cascading carry which yields a register of zeros (with a 1 in the carry out of the register).

This understanding is important to know "why" the algorithm you provided (mostly) works.

y = x >> 31   // this line acts like an "if" statement.
              // Depending on if y is 32 signed or unsigned, when x is negative, 
              // it will fill y with 0xFFFFFFFF or 1.  The rest of the 
              // algorithm doesn't, care because it accommodates both inputs.
              // when x is positive, the result is zero.

We will explore (x is positive first)

(x + y) ^ y   // for positive x, first we substitute the y = 0
(x + 0) ^ 0   // reduce the addition
(x) ^ 0       // remove the parenthesis
x ^ 0         // which, by definition of xor, can only yield x
x

Now let's explore (x is negative, y is 0xFFFFFFFF (y was signed))

(x + y) ^ y   // first substitute the Y
(x + 0xFFFFFFFF) ^ 0xFFFFFFFF // note that 0xFFFFF is the same as 2's complement -1
(x - 1) ^ 0xFFFFFFFF // add in a new variable Z to hold the result
(x - 1) ^ 0xFFFFFFFF = Z  // take the ^ 0xFFFFFFFF of both sides
(x - 1) ^ 0xFFFFFFFF ^ 0xFFFFFFFF = Z ^ 0xFFFFFFFF // reduce the left side
(x - 1) = z ^ 0xFFFFFFFF // note that not is equivalent to ^ 0xFFFFFFFF
(x - 1) = ~(z) // add one to both sides
x - 1 + 1 = ~(z) + 1 //  reduce
x = ~(z) + 1  // by definition z is negative x (for 2's complement numbers)

Now let's explore (x is negative, y is 0x01 (y was unsigned))

(x + y) ^ y   // first substitute the Y
(x + 1) ^ 0x00000001 // note that x is a 2's complement negative, but is
                     // being treated as unsigned, so to make the unsigned
                     // context of x tracable, I'll add a -(x) around the X
(-(x) + 1) ^ 0x00000001 // which simplifies to
(-(x - 1)) ^ 0x00000001 // negative of a negative is positive
(-(x - 1)) ^ -(-(0x00000001)) // substituting 1 for bits of -1
(-(x - 1)) ^ -(0xFFFFFFFF) // pulling out the negative sign
-((x-1) ^ 0xFFFFFFFF) // recalling that while we added signs and negations to
                      // make the math sensible, there's actually no place to
                      // store them in an unsigned storage system, so dropping
                      // them is acceptable
x-1 ^ 0XFFFFFFFF = Z // introducing a new variable Z, take the ^ 0xFFFFFFF of both sides
x-1 ^ 0xFFFFFFFF ^ 0xFFFFFFFF = Z ^ 0xFFFFFFFF // reduce the left side
x-1 = z ^ 0xFFFFFFFF // note that not is equivalent to ^ 0xFFFFFFFF
x-1 = ~(z) // add one to both sides
x - 1 + 1 = ~(z) + 1 //  reduce
x = ~(z) + 1  // by definition z is negative x (for 2's complement numbers, even though we used only non-2's complement types)

Note that while the above proofs are passable for a general explanation, the reality is that these proofs don't cover important edge cases, like x = 0x80000000 , which represents a negative number greater in absolute value than any positive X which could be stored in the same number of bits.

Last paragraph: `x = 0xFFFFFFFF` is -1. I think you meant `x = 0x80000000`, the most-negative number special case for 2's complement. You can handle that by treating the abs result as unsigned. (And writing it carefully to avoid the possibility of C signed-overflow UB, e.g. by doing the `-` on unsigned to avoid overflow from 0x7FFFFFFF to 0x80000000) — Peter Cordes, Mar 09 '20 at 17:01
Your notation is odd; you seem to be using `!` for bitwise inverse (one's complement). But this is a C question; the C operator for that is `~` (tilde). `!` in C is logical not which can only produce 0 or 1. — Peter Cordes, Mar 09 '20 at 17:02
@PeterCordes I missed the first comment, I'm fixing that one too. Thank you for the careful reading. — Edwin Buck, Mar 09 '20 at 19:30
I held off on upvoting until the problems I noted were fixed. Looks good now. Some nice discussion of how carry-propagation flips bits to turn -1 into 0. — Peter Cordes, Mar 09 '20 at 19:52

schnedan · Answer 4 · 2020-03-11T23:13:21.833

I use this code, first the calculation of the two's complement (the guard just ensures with a compile time check, the template is an Integer)

/**
 * Zweierkomplement - Two's Complement
 */
template<typename T> constexpr auto ZQ(T const& _x) noexcept ->T{
  Compile::Guards::IsInteger<T>();
  return ((~(_x))+1);
}

and in a second step this is used to calculate the integer abs()

/**
 * if number is negative, get the same number with positiv sign
 */
template<typename T> auto INTABS(T const _x) -> typename std::make_unsigned<T>::type{
  Compile::Guards::IsInteger<T>();
  return static_cast<typename std::make_unsigned<T>::type>((_x<0)?(ZQ<T>(_x)):(_x));
}

why I use this kind of code:
* compile-time checks
* works with all Integer sizes
* portable from small µC to modern cores
* Its clear, that we need to consider the two's complement, so you need an unsigned return value, e.g for 8bit abs(-128)=128 can not be expressed in an signed integer

Absolute value abs(x) using bitwise operators and Boolean logic

4 Answers4