Computing integer absolute differences in overflow-safe ways?

Question

I would like to compute the absolute difference of two integers. Naively, this is just abs(a - b). However, this has a couple of problems, depending on whether the integers are signed or unsigned:

For unsigned integers, a - b will be a large positive number if b is larger than a, and the absolute value operation will not fix that.
For signed integers, a - b may be outside of the range of values that can be represented as a signed integer, thus leading to undefined behavior. (Obviously, this means that the result will need to be an unsigned integer.)

How can these problems be addressed in an efficient way?

I would like an algorithm (or pair of algorithms) that works for both signed and unsigned integers, and does not rely on casting the values to a larger integer size.

(Also, to clarify: my question is not how to write this in code, but what exactly I should be writing in order to guarantee overflow-safety. Computing abs(a - b) as a signed value and then casting to an unsigned value doesn't work, because signed integer overflow is typically an undefined operation, at least in C.)

Brooks Moses · Accepted Answer · 2011-11-06T01:16:10.863

(I've been working this out on my own after asking the question -- I thought it would be harder, and I'd still welcome other answers if there are better ones!)

The solution for unsigned integers is relatively straightforward (as described in Jack Toole's answer), and works by moving the (implied) conditional outside the subtraction so that we are always subtracting the smaller number from the larger one, and are not comparing a potentially-wrapped value to zero:

if (a > b)
  return a - b;
else
  return b - a;

This just leaves the question of signed integers. Consider the following variation:

if (a > b)
  return (unsigned) a - (unsigned) b;
else
  return (unsigned) b - (unsigned) a;

We can easily prove that this works by using modulo arithmetic. We know that a and (unsigned) a are congruent modulo UINT_MAX. Further, the unsigned-integer subtraction operation is congruent to actual subtraction modulo UINT_MAX, so combining these facts, we know that (unsigned) a - (unsigned) b is congruent to the actual value of a - b modulo UINT_MAX. Finally, because we know that the actual value of a - b must be between 0 and UINT_MAX-1, we know that this is an exact equality.

Aria Buckles · Answer 2 · 2011-11-06T02:33:27.873

Edited to use Brook Moses's fix for signed types and Kerrek SB's make_unsigned to allow for template use.

First, you can either have the following for make_unsigned, or you can use std::make_unsigned , tr1::make_unsigned , or BOOST::make_unsigned.

template <typename T> struct make_unsigned { };

template <> struct make_unsigned<bool              > {};
template <> struct make_unsigned<  signed short    > {typedef unsigned short     type;};
template <> struct make_unsigned<unsigned short    > {typedef unsigned short     type;};
template <> struct make_unsigned<  signed int      > {typedef unsigned int       type;};
template <> struct make_unsigned<unsigned int      > {typedef unsigned int       type;};
template <> struct make_unsigned<  signed long     > {typedef unsigned long      type;};
template <> struct make_unsigned<unsigned long     > {typedef unsigned long      type;};
template <> struct make_unsigned<  signed long long> {typedef unsigned long long type;};
template <> struct make_unsigned<unsigned long long> {typedef unsigned long long type;};

Then, the template definition becomes a simple:

template <typename T>
typename make_unsigned<T>::type absdiff(T a, T b)
{
    typedef typename make_unsigned<T>::type UT;
    if (a > b)
        return static_cast<UT>(a) - static_cast<UT>(b);
    else
        return static_cast<UT>(b) - static_cast<UT>(a);
}

As Brooks Moses explains, this wraparound is safe.

Right, but how does this address the possiblity of signed integer overflow when, say, `a` is INT_MAX and `b` is INT_MIN? — Brooks Moses, Nov 06 '11 at 00:59
It's not an absolute difference and is still vulnerable to overflow for signed integers. — Jakub Wieczorek, Nov 06 '11 at 00:59
Okay, I misunderstood the question. In that case, we'd want to always return an unsigned type? — Aria Buckles, Nov 06 '11 at 01:01
@JackToole: Yes, let me try to clarify the question. You're right that we need to always return an unsigned type. — Brooks Moses, Nov 06 '11 at 01:07

score 1 · Answer 3 · edited Nov 06 '11 at 01:59

1

Here's an idea: if we're unsigned, we just take the correct difference. If we're signed, we return the corresponding unsigned type:

#include <type_traits>

template <typename T, bool> struct absdiff_aux;

template <typename T> struct absdiff_aux<T, true>
{
  static T absdiff(T a, T b)  { return a < b ? b - a : a - b; }
};

template <typename T> struct absdiff_aux<T, false>
{
  typedef typename std::make_unsigned<T>::type UT;
  static UT absdiff(T a, T b)
  {
    if ((a > 0 && b > 0) || (a <= 0 && b <= 0))
      return { a < b ? b - a : a - b; }

    if (b > 0)
    {
      UT d = -a;
      return UT(b) + d
    }
    else
    {
      UT d = -b;
      return UT(a) + d;
    }
  }
};

template <typename T> typename std::make_unsigned<T>::type absdiff(T a, T b)
{
  return absdiff_aux<T, std::is_signed<T>::value>::absdiff(a, b);
}

Usage: int a, b; unsigned int d = absdiff(a, b);

edited Nov 06 '11 at 01:59

Brooks Moses

9,267
2
33
57

answered Nov 06 '11 at 01:12

Kerrek SB

464,522
92
875
1,084

+1, I was looking for a way to make things unsigned with templates, but unfortunately haven't been able to upgrade to C++11 yet. There might be a way to do this without as much SFINAE, by subtracting std::numeric_limits::min() and casting to UT. – Aria Buckles Nov 06 '11 at 01:21
@JackToole: Don't give up believing in yourself! Just because `make_signed` is only part of the new standard doesn't mean you can't write something just like it yourself. (Or cheat and take `boost::make_signed`.) :-) – Kerrek SB Nov 06 '11 at 01:23
Pedantically, that first `if` in the signed case needs to be checking `a>=0 && b>=0 || a<0 && b<0` to lump zeros in with the positive numbers -- otherwise, you could get overflow in the case where one is `INT_MIN` and the other is 0, because `0 - INT_MIN` is `INT_MAX + 1`. (This stuff is tricky!) Otherwise, it looks correct to me -- although it does take at least three conditionals and two branches for each value it computes, which seems somewhat inefficient. – Brooks Moses Nov 06 '11 at 01:55
@BrooksMoses: It's entirely possible that I missed the INT_MIN special case, so please do feel free to add that. Dealing with signed numbers *is* tricky, so if you have any ideas for eliminating redundancies, do say. I just made this up on the spot, aiming for the biggest problem of the difference being too large for a signed type. – Kerrek SB Nov 06 '11 at 01:57
@KerrekSB Yep :). I cheated and found an implementation, but it's not hard to write. Just hard to think of. Templates are awesome and can do all sorts of neat tricks, but I still need more practice to know how to do the advanced tricks. – Aria Buckles Nov 06 '11 at 02:39

Alexey Frunze · Answer 4 · 2011-11-06T02:19:25.663

Code:

#include <stdio.h>
#include <limits.h>

unsigned AbsDiffSigned(int a, int b)
{
  unsigned diff = (unsigned)a - (unsigned)b;
  if (a < b) diff = 1 + ~diff;
  return diff;
}

unsigned AbsDiffUnsigned(unsigned a, unsigned b)
{
  unsigned diff = a - b;
  if (a < b) diff = 1 + ~diff;
  return diff;
}

int testDataSigned[] =
{
  { INT_MIN },
  { INT_MIN + 1 },
  { -1 },
  { 0 },
  { 1 },
  { INT_MAX - 1 },
  { INT_MAX }
};

unsigned testDataUnsigned[] =
{
  { 0 },
  { 1 },
  { UINT_MAX / 2 + 1 },
  { UINT_MAX - 1 },
  { UINT_MAX }
};

int main(void)
{
  int i, j;

  printf("Absolute difference of signed integers:\n");

  for (j = 0; j < sizeof(testDataSigned)/sizeof(testDataSigned[0]); j++)
    for (i = 0; i < sizeof(testDataSigned)/sizeof(testDataSigned[0]); i++)
      printf("abs(%d - %d) = %u\n",
             testDataSigned[j],
             testDataSigned[i],
             AbsDiffSigned(testDataSigned[j], testDataSigned[i]));

  printf("Absolute difference of unsigned integers:\n");

  for (j = 0; j < sizeof(testDataUnsigned)/sizeof(testDataUnsigned[0]); j++)
    for (i = 0; i < sizeof(testDataUnsigned)/sizeof(testDataUnsigned[0]); i++)
      printf("abs(%u - %u) = %u\n",
             testDataUnsigned[j],
             testDataUnsigned[i],
             AbsDiffUnsigned(testDataUnsigned[j], testDataUnsigned[i]));
  return 0;
}

Output:

Absolute difference of signed integers:
abs(-2147483648 - -2147483648) = 0
abs(-2147483648 - -2147483647) = 1
abs(-2147483648 - -1) = 2147483647
abs(-2147483648 - 0) = 2147483648
abs(-2147483648 - 1) = 2147483649
abs(-2147483648 - 2147483646) = 4294967294
abs(-2147483648 - 2147483647) = 4294967295
abs(-2147483647 - -2147483648) = 1
abs(-2147483647 - -2147483647) = 0
abs(-2147483647 - -1) = 2147483646
abs(-2147483647 - 0) = 2147483647
abs(-2147483647 - 1) = 2147483648
abs(-2147483647 - 2147483646) = 4294967293
abs(-2147483647 - 2147483647) = 4294967294
abs(-1 - -2147483648) = 2147483647
abs(-1 - -2147483647) = 2147483646
abs(-1 - -1) = 0
abs(-1 - 0) = 1
abs(-1 - 1) = 2
abs(-1 - 2147483646) = 2147483647
abs(-1 - 2147483647) = 2147483648
abs(0 - -2147483648) = 2147483648
abs(0 - -2147483647) = 2147483647
abs(0 - -1) = 1
abs(0 - 0) = 0
abs(0 - 1) = 1
abs(0 - 2147483646) = 2147483646
abs(0 - 2147483647) = 2147483647
abs(1 - -2147483648) = 2147483649
abs(1 - -2147483647) = 2147483648
abs(1 - -1) = 2
abs(1 - 0) = 1
abs(1 - 1) = 0
abs(1 - 2147483646) = 2147483645
abs(1 - 2147483647) = 2147483646
abs(2147483646 - -2147483648) = 4294967294
abs(2147483646 - -2147483647) = 4294967293
abs(2147483646 - -1) = 2147483647
abs(2147483646 - 0) = 2147483646
abs(2147483646 - 1) = 2147483645
abs(2147483646 - 2147483646) = 0
abs(2147483646 - 2147483647) = 1
abs(2147483647 - -2147483648) = 4294967295
abs(2147483647 - -2147483647) = 4294967294
abs(2147483647 - -1) = 2147483648
abs(2147483647 - 0) = 2147483647
abs(2147483647 - 1) = 2147483646
abs(2147483647 - 2147483646) = 1
abs(2147483647 - 2147483647) = 0
Absolute difference of unsigned integers:
abs(0 - 0) = 0
abs(0 - 1) = 1
abs(0 - 2147483648) = 2147483648
abs(0 - 4294967294) = 4294967294
abs(0 - 4294967295) = 4294967295
abs(1 - 0) = 1
abs(1 - 1) = 0
abs(1 - 2147483648) = 2147483647
abs(1 - 4294967294) = 4294967293
abs(1 - 4294967295) = 4294967294
abs(2147483648 - 0) = 2147483648
abs(2147483648 - 1) = 2147483647
abs(2147483648 - 2147483648) = 0
abs(2147483648 - 4294967294) = 2147483646
abs(2147483648 - 4294967295) = 2147483647
abs(4294967294 - 0) = 4294967294
abs(4294967294 - 1) = 4294967293
abs(4294967294 - 2147483648) = 2147483646
abs(4294967294 - 4294967294) = 0
abs(4294967294 - 4294967295) = 1
abs(4294967295 - 0) = 4294967295
abs(4294967295 - 1) = 4294967294
abs(4294967295 - 2147483648) = 2147483647
abs(4294967295 - 4294967294) = 1
abs(4294967295 - 4294967295) = 0

Could you explain what the `diff = 1 + ~diff` line in the signed function is doing? That seems to be the key piece. — Brooks Moses, Nov 06 '11 at 07:51
Inverting all bits of a 2's complement integer and then adding 1 negates the number, changing positive to negative and vice versa. Examples (with 16-bit unsigned ints): 1+~0=1+0xFFFF=0, 1+~1=1+0xFFFE=0xFFFF(=-1), 1+~0xFFFF(=-1)=1+0=1, 1+~0x8000(=-0x8000)=1+0x7FFF=0x8000(=-0x8000). Basically, my functions first perform 2's complement subtraction of the numbers (just like x86's SUB). The difference is 2's complement too (but it may be missing an extra sign bit, e.g. when you get it equal 0x80...00 in AbsDiffSigned()). You get the sign bit from (a < b) and based on it you negate the difference. — Alexey Frunze, Nov 06 '11 at 08:43
It's not as if anyone seriously uses non-2's-complement.. might as well use unary minus — harold, Nov 06 '11 at 08:47
@harold: The reason why I used `1+~` instead of `-` is that some compilers don't like seeing unary minus in front of unsigned values. As I recall, MSVC++ with warning level set to 4 would do that. Also, I'm not quite sure about any undefined behavior here, although mathematically `-X` should be equivalent to `0-X`, but C's arithmetic is not your normal arithmetic... — Alexey Frunze, Nov 06 '11 at 08:57
@harold: The 2003 C++ standard says in 5.3.1c7: `The negative of an unsigned quantity is computed by subtracting its value from 2^n, where n is the number of bits in the promoted operand.` The 1999 C standard doesn't include such an explicit statement and does not clearly define the unary - behavior neither in 6.5.3.3c1,3 nor in 6.5c4 (`Some operators (the unary operator ~, and the binary operators <<, >>, &, ^, and |, ...) ... return values that depend on the internal representations of integers, and have implementation-defined and undefined aspects for signed types.`). — Alexey Frunze, Nov 06 '11 at 11:02
Good to know, thanks for looking that up - looks like it'll be safe in C++ at least — harold, Nov 06 '11 at 11:08
@harold: and in C too, see [this question](http://stackoverflow.com/questions/8026694/c-unary-minus-operator-behavior-with-unsigned-operands). — Alexey Frunze, Nov 06 '11 at 20:52

Computing integer absolute differences in overflow-safe ways?

4 Answers4