20

I've come across this code, which incidentally my profiler reports as a bottleneck:

#include <stdlib.h>

unsigned long a, b;
// Calculate values for a and b
unsigned long c;

c = abs(a - b);

Does that line do anything more interesting that c = a - b;? Do either of the options invoke undefined or implementation-defined behaviour, and are there other potential gotchas? Note that the C <stdlib.h> is included, not <cstdlib>.

Mohit Jain
  • 30,259
  • 8
  • 73
  • 100
Ken Y-N
  • 14,644
  • 21
  • 71
  • 114
  • Since the values are unsigned, they can't be negative, so the `abs()` call doesn't make sense. However, `abs()` is overloaded in C++ and could be a macro, too, so what it does is not 100% clear to me. It could even be a function taking `signed long`, which then does something depending on the MSB of the argument. What's the reason to take the absolute value of something that can't be negative? If there is none, just remove the `abs()` call. – Ulrich Eckhardt Jan 08 '15 at 07:26
  • 3
    Potential conversion of unsigned to int that can not represent the value is undefined behavior. That conversion is also the only case where that abs would make difference. – Öö Tiib Jan 08 '15 at 07:59
  • @UlrichEckhardt If `a = 10UL;` and `b = 30UL;`, the correct answer should be `20UL`, but dropping the `abs` would give me the wrong answer. – Ken Y-N Jan 08 '15 at 08:52
  • 2
    Before `abs()` is called, `10UL - 30UL` yields `4294967276UL` (assuming a 32-bit unsigned long). If `abs()` does with that what it suggests (see my remarks concerning macros/overloads), the result is also `4294967276UL`. If you want to have 20, you must use signed arithmetic, like e.g. `abs(static_cast(a) - static_cast(b))`. You have to make sure that this doesn't wrap around though, as that yields "undefined behaviour". In summary, I'd suggest you use Mohit Jain's suggestion, i.e. first check which one is bigger. – Ulrich Eckhardt Jan 08 '15 at 20:41
  • 1
    For the C++ case this may [become ill-formed in the future](http://stackoverflow.com/q/29750946/1708801). – Shafik Yaghmour Jun 03 '15 at 15:32

3 Answers3

28

No, it doesn't make sense.

If you want the difference, use

c = (a > b) ? a - b : b - a;

or

c = max(a, b) - min(a, b);

Unsigned if go below zero would wrap back (effect is similar to adding 2sizeof (unsigned long) * CHAR_BIT)

If you are looking for difference between two numbers, you can write a small template as below

namespace MyUtils {
  template<typename T>
  T diff(const T&a, const T&b) {
    return (a > b) ? (a - b) : (b - a);
  }
}

Looking at the declaration of abs inherited from C (Because you included stdlib.h)

int       abs( int n );
long      abs( long n );
long long abs( long long n ); //    (since C++11)
//Defined in header <cinttypes>
std::intmax_t abs( std::intmax_t n ); //    (since C++11)

And abs in C++ (from cmath)

float       abs( float arg );
double      abs( double arg );
long double abs( long double arg );

If you notice, both the argument and return type of each function are signed. So if you pass an unsigned type to one of these function, implicit conversion unsigned T1 -> signed T2 -> unsigned T1 would take place (where T1 and T2 may be same and T1 is long in your case). When you convert an unsigned integral to signed integral, the behavior is implementation dependendent if it can not be represented in a signed type.

From 4.7 Integral conversions [conv.integral]

  1. If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2n where n is the number of bits used to represent the unsigned type). [ Note: In a two’s complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). — end note]
  2. If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined.
Christopher Oezbek
  • 23,994
  • 6
  • 61
  • 85
Mohit Jain
  • 30,259
  • 8
  • 73
  • 100
  • Note that while it won't work for `unsigned long` or `unsigned int`, it WILL work for `unsigned char` and `unsigned short` (except on pathological systems which nobody uses). – Dietrich Epp Jan 08 '15 at 05:13
  • I don't see how a system where sizeof(short)==sizeof(int) would be much more pathological than a system where sizeof(int)==sizeof(long) (just looking for a chance to poke fun at 'the man') – dwn Jan 08 '15 at 05:32
  • `s/sizeof(unsigned long)/sizeof(unsigned long) * CHAR_BIT` – The Paramagnetic Croissant Jan 08 '15 at 07:18
9

I don't know whether you'd regard it as making sense, but abs() applied to an unsigned value can certainly return a value other than the one passed in. That's because abs() takes an int argument and returns an int value.

For example:

#include <stdlib.h>
#include <stdio.h>

int main(void)
{
    unsigned u1 = 0x98765432;
    printf("u1 = 0x%.8X; abs(u1) = 0x%.8X\n", u1, abs(u1));
    unsigned long u2 = 0x9876543201234567UL;
    printf("u2 = 0x%.16lX; abs(u2) = 0x%.16lX\n", u2, labs(u2));
    return 0;
}

When compiled as C or C++ (using GCC 4.9.1 on Mac OS X 10.10.1 Yosemite), it produces:

u1 = 0x98765432; abs(u1) = 0x6789ABCE
u2 = 0x9876543201234567; abs(u2) = 0x6789ABCDFEDCBA99

If the high bit of the unsigned value is set, then the result of abs() is not the value that was passed to the function.

The subtraction is merely a distraction; if the result has the most significant bit set, the value returned from abs() will be different from the value passed to it.


When you compile this code with C++ headers, instead of the C headers shown in the question, then it fails to compile with ambiguous call errors:

#include <cstdlib>
#include <iostream>
using namespace std;

int main(void)
{
    unsigned u1 = 0x98765432;
    cout << "u1 = 0x" << hex << u1 << "; abs(u1) = 0x" << hex << abs(u1) << "\n";
    unsigned long u2 = 0x9876543201234567UL;
    cout << "u2 = 0x" << hex << u2 << "; abs(u2) = 0x" << hex << abs(u2) << "\n";
    return 0;
}

Compilation errors:

absuns2.cpp: In function ‘int main()’:
absuns2.cpp:8:72: error: call of overloaded ‘abs(unsigned int&)’ is ambiguous
     cout << "u1 = 0x" << hex << u1 << "; abs(u1) = 0x" << hex << abs(u1) << "\n";
                                                                        ^
absuns2.cpp:8:72: note: candidates are:
In file included from /usr/gcc/v4.9.1/include/c++/4.9.1/cstdlib:72:0,
                 from absuns2.cpp:1:
/usr/include/stdlib.h:129:6: note: int abs(int)
 int  abs(int) __pure2;
      ^
In file included from absuns2.cpp:1:0:
/usr/gcc/v4.9.1/include/c++/4.9.1/cstdlib:174:3: note: long long int std::abs(long long int)
   abs(long long __x) { return __builtin_llabs (__x); }
   ^
/usr/gcc/v4.9.1/include/c++/4.9.1/cstdlib:166:3: note: long int std::abs(long int)
   abs(long __i) { return __builtin_labs(__i); }
   ^
absuns2.cpp:10:72: error: call of overloaded ‘abs(long unsigned int&)’ is ambiguous
     cout << "u2 = 0x" << hex << u2 << "; abs(u2) = 0x" << hex << abs(u2) << "\n";
                                                                        ^
absuns2.cpp:10:72: note: candidates are:
In file included from /usr/gcc/v4.9.1/include/c++/4.9.1/cstdlib:72:0,
                 from absuns2.cpp:1:
/usr/include/stdlib.h:129:6: note: int abs(int)
 int  abs(int) __pure2;
      ^
In file included from absuns2.cpp:1:0:
/usr/gcc/v4.9.1/include/c++/4.9.1/cstdlib:174:3: note: long long int std::abs(long long int)
   abs(long long __x) { return __builtin_llabs (__x); }
   ^
/usr/gcc/v4.9.1/include/c++/4.9.1/cstdlib:166:3: note: long int std::abs(long int)
   abs(long __i) { return __builtin_labs(__i); }
   ^

So, the code in the question only compiles when only the C-style headers are used; it doesn't compile when the C++ headers are used. If you add <stdlib.h> as well as <cstdlib>, there's an additional overload available to make the calls more ambiguous.

You can make the code compile if you add (in)appropriate casts to the calls to abs(), and the absolute value of a signed quantity can be different from the original signed quantity, which is hardly surprising news:

#include <cstdlib>
#include <iostream>
using namespace std;

int main(void)
{
    unsigned u1 = 0x98765432;
    cout << "u1 = 0x" << hex << u1 << "; abs(u1) = 0x" << hex << abs(static_cast<int>(u1)) << "\n";
    unsigned long u2 = 0x9876543201234567UL;
    cout << "u2 = 0x" << hex << u2 << "; abs(u2) = 0x" << hex << abs(static_cast<long>(u2)) << "\n";
    return 0;
}

Output:

u1 = 0x98765432; abs(u1) = 0x6789abce
u2 = 0x9876543201234567; abs(u2) = 0x6789abcdfedcba99

Moral: Don't use the C headers for which there are C++ equivalents in C++ code; use the C++ headers instead.

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
  • `abs` is supposed to be an overloaded function in C++ even if `` is included, though. Either the implementation isn't doing that, or `abs(int)` won't be the overload that gets picked. –  Jan 08 '15 at 07:53
  • Not if you only include ``, I think. That only provides you with the standard C functions. Offhand, I don't know which C++ header(s) define overloads for `abs()`. And I don't know if the overloads include unsigned types. – Jonathan Leffler Jan 08 '15 at 07:56
  • `` is specified as containing C's ``'s declarations, except in namespace `std`, plus `long abs(long)` and `long long abs(long long)`. C++'s `` is specified as containing C++'s ``'s declarations, except in the global namespace. (To quote C++11: "Every C header, each of which has a name of the form `name.h`, behaves as if each name placed in the standard library namespace by the corresponding `cname` header is placed within the global namespace scope.") That would include the added overloads. But implementations have bugs, and not all get this right. –  Jan 08 '15 at 08:03
  • If the code in the question had included `` instead of ``, I would not have created a dual-language test program. I quoted the environment I use to allow for bugs to be tracked. I may play around later today to see whether using `` instead of `` makes a difference; I'm not going to call it without experimentation. – Jonathan Leffler Jan 08 '15 at 08:14
-3

I think when c=a-b is negative, if c is unsigned number, c is not the accurate answer. Using abs to guarantee c is a positive number.

Roger.Kang
  • 62
  • 1
  • 5
  • 2
    "not the accurate answer" can happen if you mean (school math) calculations over natural numbers. Unsigned values will wrap around if you exceed their range instead in C++, so it will always be positive. Using `abs()` on it shouldn't change its value then. – Ulrich Eckhardt Jan 08 '15 at 07:29