
At first glance, this question may seem like a duplicate of How to detect integer overflow?; however, it is actually significantly different.

I've found that while detecting an unsigned integer overflow is pretty trivial, detecting a signed overflow in C/C++ is actually more difficult than most people think.

The most obvious, yet naive, way to do it would be something like:

int add(int lhs, int rhs)
{
 int sum = lhs + rhs;
 if ((lhs >= 0 && sum < rhs) || (lhs < 0 && sum > rhs)) {
  /* an overflow has occurred */
  abort();
 }
 return sum; 
}

The problem with this is that according to the C standard, signed integer overflow is undefined behavior. In other words, according to the standard, as soon as you even cause a signed overflow, your program is just as invalid as if you dereferenced a null pointer. So you can't cause undefined behavior, and then try to detect the overflow after the fact, as in the above post-condition check example.

Even though the above check is likely to work on many compilers, you can't count on it. In fact, because the C standard says signed integer overflow is undefined, some compilers (like GCC) will optimize away the above check when optimization flags are set, because the compiler assumes a signed overflow is impossible. This totally breaks the attempt to check for overflow.

So, another possible way to check for overflow would be:

int add(int lhs, int rhs)
{
 if (lhs >= 0 && rhs >= 0) {
  if (INT_MAX - lhs <= rhs) {
   /* overflow has occurred */
   abort();
  }
 }
 else if (lhs < 0 && rhs < 0) {
  if (lhs <= INT_MIN - rhs) {
   /* overflow has occurred */
   abort();
  }
 }

 return lhs + rhs;
}

This seems more promising, since we don't actually add the two integers together until we make sure in advance that performing such an add will not result in overflow. Thus, we don't cause any undefined behavior.

However, this solution is unfortunately a lot less efficient than the initial solution, since you have to perform a subtract operation just to test if your addition operation will work. And even if you don't care about this (small) performance hit, I'm still not entirely convinced this solution is adequate. The expression lhs <= INT_MIN - rhs seems exactly like the sort of expression the compiler might optimize away, thinking that signed overflow is impossible.

So is there a better solution here? Something that is guaranteed to 1) not cause undefined behavior, and 2) not provide the compiler with an opportunity to optimize away overflow checks? I was thinking there might be some way to do it by casting both operands to unsigned, and performing checks by rolling your own two's-complement arithmetic, but I'm not really sure how to do that.
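(To illustrate the kind of thing I mean, here's a rough sketch of the unsigned-cast idea. It assumes a two's-complement representation and that unsigned is at least as wide as int, and the helper name is made up; I'm not certain it's airtight, which is part of why I'm asking.)

```c
#include <limits.h>
#include <stdbool.h>

/* Sketch only: unsigned arithmetic wraps modulo 2^N, so the addition
 * itself is well-defined. On a two's-complement machine, signed
 * overflow happened iff the operands have the same sign bit and the
 * wrapped sum has the opposite sign bit. */
bool add_would_wrap(int lhs, int rhs)
{
    unsigned a = (unsigned)lhs, b = (unsigned)rhs;
    unsigned sum = a + b;                    /* wraps; no UB */
    unsigned sign = (unsigned)INT_MAX + 1u;  /* sign-bit mask */
    return (~(a ^ b) & (a ^ sum) & sign) != 0;
}
```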

Channel72
  • Rather than attempting to detect it, isn't it a better pursuit to write code which doesn't have the possibility of overflow? – Arun Oct 15 '10 at 17:35
  • Having the property of "Undefined Behavior" does not imply equivalency. In other words, dereferencing a NULL pointer is not equivalent to signed overflow. UB means the spec does not specify the outcome required to be compliant with the spec. So one cannot assume the system will be in an invalid and unstable state from signed overflow, just that the numeric result will not be predictable. – Amardeep AC9MF Oct 15 '10 at 17:36
  • @ArunSaha: It's really hard to take calculations and ensure that they will not overflow, and it's impossible to prove in the general case. The usual practice is to use as wide an integer type as possible and hope. – David Thornley Oct 15 '10 at 18:00
  • @Amardeep: Dereferencing a null pointer is equally undefined as signed overflow. Undefined behavior means that, as far as the Standard goes, anything can happen. One can't assume that the system won't be in an invalid and unstable state after signed overflow. The OP pointed out one consequence of this: it's perfectly legal for the optimizer to remove code that detects signed overflow once it happens. – David Thornley Oct 15 '10 at 18:02
  • @David: Yes, it is equally undefined. But that is not my point of contention. Dereferencing a NULL pointer may result in anything between nothing and a process crash. Signed overflow will result in an unpredictable value in a variable. If you know of an implementation that truly becomes invalid or unstable, I'd really be interested to know about it. – Amardeep AC9MF Oct 15 '10 at 18:23
  • @Amardeep: I mentioned such an implementation. GCC will *remove* the overflow check code when optimization flags are set. So it will basically break your program. This is arguably *worse* than a null pointer dereference, since it can result in subtle security flaws, whereas dereferencing a null will likely just bluntly clobber your program with a segfault. – Channel72 Oct 15 '10 at 18:36
  • @Amardeep: not necessarily. One possible result of the UB is that it recompiles your program in an environment where `int` is one bit larger and restarts it from the beginning. ;-) – R.. GitHub STOP HELPING ICE Oct 15 '10 at 19:19
  • @R: Nice... iterative architecture. :-) – Amardeep AC9MF Oct 15 '10 at 19:21
  • Downvoted for the abomination that is C/C++. – Puppy Oct 15 '10 at 20:07
  • @Amardeep: It would be legitimate for an implementation to trigger a hardware interrupt when a signed overflow occurs, and the standard imposes no requirement with respect to what such an interrupt might or might not do. – supercat Oct 15 '10 at 20:09
  • @supercat: I like R's hypothetical idea better. But if you know of such an implementation I'd like to study it. – Amardeep AC9MF Oct 15 '10 at 20:20
  • @Amardeep: I've certainly seen implementations where, depending upon compiler settings, overflow would cause a trap. It would be nice if languages allowed one to specify whether particular unsigned variables or quantities should (1) wrap cleanly, (2) fault, or (3) do whatever is convenient. Note that if a variable is smaller than a machine's register size, requiring that unsigned quantities wrap cleanly may prevent generation of optimal code. – supercat Oct 15 '10 at 20:52
  • @Amardeep: gcc is one such implementation, if you compile your code with `-ftrapv`. – Gilles 'SO- stop being evil' Oct 15 '10 at 21:05
  • @Gilles: I've read complaints that gcc's -ftrapv was unreliable in at least some versions; I have no idea if there are any versions where it actually works usefully. – supercat Sep 06 '16 at 20:40
  • `if (INT_MAX - lhs <= rhs) {` Shouldn't this be `if (INT_MAX - lhs < rhs) {` ? For `lhs = INT_MAX - 1` and `rhs = 1` we just add and it's `INT_MAX`, no overflow, but `INT_MAX - lhs = 1 == rhs`. – KamilCuk Jun 20 '21 at 18:10
  • This is not a question, and it isn't specific enough: there are unsigned and signed integers, and at least 4 basic arithmetic operations (addition, subtraction, multiplication, division). – Jay-Pi Dec 17 '21 at 22:57
  • Your question is about compiler_rt overflow routines, which are 5 routines in total for absolute value, negation, addition, subtraction and multiplication. – Jay-Pi Dec 27 '21 at 10:44

13 Answers


No, your second code isn't correct, but you are close: if you set

int half = INT_MAX/2;
int half1 = half + 1;

then half + half1 sums to exactly INT_MAX (INT_MAX is always an odd number), so this is valid input. But in your routine you will have INT_MAX - half == half1 and you would abort: a false positive.

This error can be repaired by putting < instead of <= in both checks.

But then also your code isn't optimal. The following would do:

int add(int lhs, int rhs)
{
 if (lhs >= 0) {
  if (INT_MAX - lhs < rhs) {
   /* would overflow */
   abort();
  }
 }
 else {
  if (rhs < INT_MIN - lhs) {
   /* would overflow */
   abort();
  }
 }
 return lhs + rhs;
}

To see that this is valid, you have to symbolically add lhs on both sides of the inequalities, and this gives you exactly the arithmetical conditions that your result is out of bounds.
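For instance, pulling the two inequalities out into a predicate makes the boundary cases easy to exercise (the function name here is just for illustration):

```c
#include <limits.h>
#include <stdbool.h>

/* true iff lhs + rhs would fall outside [INT_MIN, INT_MAX].
 * Adding lhs to both sides of each comparison recovers
 * lhs + rhs > INT_MAX and lhs + rhs < INT_MIN respectively,
 * and neither subtraction can itself overflow. */
bool would_overflow(int lhs, int rhs)
{
    if (lhs >= 0)
        return INT_MAX - lhs < rhs;
    else
        return rhs < INT_MIN - lhs;
}
```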

Note in 2023: C23 will have the <stdckdint.h> header that implements such overflow checks in the same way as the gcc builtins that are mentioned in other answers.

Jens Gustedt
  • a small optimization you could do, I suppose this would depend on your hardware, I'm not sure which is better, but if you use an if and an else if with (lhs > 0 && rhs > 0) and (lhs < 0 && rhs < 0) this would allow you to skip making a subtraction in cases where the signs do not match or either value is 0, but it would require 4 comparisons in those cases and it would require an extra comparison in the case that both values are negative. Which are faster in the hardware? comparisons or arithmetic operations such as subtractions? – Dyskord May 02 '21 at 02:06
  • @Dyskord just write it in the way that is easiest to understand and let the compiler figure that out for you. – spectras Aug 09 '23 at 01:12

Your approach with subtraction is correct and well-defined. A compiler cannot optimize it away.

Another correct approach, if you have a larger integer type available, is to perform the arithmetic in the larger type and then check that the result fits in the smaller type when converting it back:

int sum(int a, int b)
{
    long long c;
    assert(LLONG_MAX>INT_MAX);
    c = (long long)a + b;
    if (c < INT_MIN || c > INT_MAX) abort();
    return c;
}

A good compiler should convert the entire addition and if statement into an int-sized addition and a single conditional jump-on-overflow and never actually perform the larger addition.

Edit: As Stephen pointed out, I'm having trouble getting a (not-so-good) compiler, gcc, to generate the sane asm. The code it generates is not terribly slow, but certainly suboptimal. If anyone knows variants on this code that will get gcc to do the right thing, I'd love to see them.
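For reference, the widening check itself is easy to test in isolation as a predicate (same logic as above, minus the abort; it still assumes long long is strictly wider than int):

```c
#include <limits.h>
#include <stdbool.h>

/* Compute the exact sum in long long, then test whether it fits
 * in int. Assumes LLONG_MAX > INT_MAX (true everywhere except
 * exotic ILP64-style targets where int is 64-bit). */
bool sum_overflows(int a, int b)
{
    long long c = (long long)a + b;
    return c < INT_MIN || c > INT_MAX;
}
```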

R.. GitHub STOP HELPING ICE
  • To anyone wanting to use this, be sure you're looking at my edited version. In the original I stupidly omitted the cast to `long long` before the addition. – R.. GitHub STOP HELPING ICE Oct 15 '10 at 18:11
  • Then, by your definition gcc 4.4.4 is not a good compiler. I just tried this at `-O2` and `-O3` and got the same code, 12 lines of assembly (not counting assembler commands) where the actual addition is performed by `leaq (%rax,%rcx), %rcx`, or `addl %eax, %ecx`; `adcl %edx, %ebx` if the `-m32` flag is given. – Mike DeSimone Oct 15 '10 at 18:51
  • Also, gcc didn't remove the `if` statement from the generated code. – Mike DeSimone Oct 15 '10 at 18:52
  • @R, are you *certain* that the approach with subtraction is correct? Since signed-overflow is undefined, isn't it possible the compiler could see the expression `(INT_MAX - lhs <= rhs)` and optimize it away, thinking that such an expression must *always* be true? – Channel72 Oct 15 '10 at 19:04
  • Yes I am certain. There is no overflow in the expression `INT_MAX - lhs <= rhs` (assuming `lhs` is positive). Depending on the values of `lhs` and `rhs`, this expression can evaluate to 0 or 1. A compiler cannot optimize away the computation of expressions with nonconstant values. It could, of course, optimize this to an addition followed by an overflow check, if the machine supports such a thing. – R.. GitHub STOP HELPING ICE Oct 15 '10 at 19:12
  • Out of curiosity, have you been successful in getting a compiler to do this optimization? A quick test against a few compilers didn't turn up any that could do it. – Stephen Canon Oct 15 '10 at 19:40
  • @Channel72: Also importantly, none of the subexpressions can overflow. If `lhs` is an `int` and nonnegative, `INT_MAX - lhs` yields a value between 0 and `INT_MAX`, inclusive, and that's perfectly legal, so nothing about this triggers undefined behavior. – David Thornley Oct 15 '10 at 20:23
  • What I can't understand is why someone who is worried about int32 overflows on the 64bit machine will continue to use int32? And using this in 32bit mode will be quite inneficient. – ruslik Oct 15 '10 at 20:37
  • On x86_64, there's nothing inefficient about using 32-bit integers. Performance is identical to 64-bit ones. One motivation for using smaller-than-native-word-size types is that it's extremely efficient to handle overflow conditions or carry (for arbitrary-precision arithmetic) since the overflow/carry happens in a directly-accessible location. – R.. GitHub STOP HELPING ICE Oct 15 '10 at 20:57
  • @Stephen: Oddly I can't get it to work at the moment either, but I've seen plenty of cases where gcc optimizes out unnecessary 64-bit arithmetic like this. I wonder if it's too stupid to figure out how to do it for signed values; I normally only use unsigned arithmetic for things like this. – R.. GitHub STOP HELPING ICE Oct 15 '10 at 21:00
  • @R., @Steven: no the subtraction code that the OP gave is not correct, please see my answer. I also give a code there that just does it with at most two comparisons. Perhaps the compilers will do better with that. – Jens Gustedt Oct 16 '10 at 06:46
  • @R.: after looking into it more closely it is also not wonder that `gcc` doesn't get away with just one jump on overflow. In fact, the conditions for overflow and underflow are in general not symmetric: on nowadays architectures usually `INT_MIN == (-INT_MAX) - 1`. So the two bound checks *must* be different. – Jens Gustedt Oct 16 '10 at 08:24
  • @R.: after thinking it over once more, I am not even convinced that your proposal of blowing it up to a wider type is worth it. Just use the corresponding unsigned type. – Jens Gustedt Oct 16 '10 at 08:36
  • @Jens: Which unsigned approach? There are lots and I'm not sure which is most efficient for handling signed inputs. – R.. GitHub STOP HELPING ICE Oct 16 '10 at 15:47
  • @Jens: As for gcc, x86 has a perfectly good `jo` instruction (jump on overflow) it should be using. – R.. GitHub STOP HELPING ICE Oct 16 '10 at 15:51
  • @R.. so with two choices **1)** using subtration and **2)** converting to a wider type, which would be more performant? – Pacerier Sep 22 '13 at 18:00
  • This approach does not work in the uncommon platform where `sizeof(long long) == sizeof(int)`. C specifies only that `sizeof(long long) >= sizeof(int)`. – chux - Reinstate Monica Aug 17 '14 at 17:53

For the gcc case: from the gcc 5.0 release notes we can see that it now provides __builtin_add_overflow for checking overflow in addition:

A new set of built-in functions for arithmetics with overflow checking has been added: __builtin_add_overflow, __builtin_sub_overflow and __builtin_mul_overflow and for compatibility with clang also other variants. These builtins have two integral arguments (which don't need to have the same type), the arguments are extended to infinite precision signed type, +, - or * is performed on those, and the result is stored in an integer variable pointed to by the last argument. If the stored value is equal to the infinite precision result, the built-in functions return false, otherwise true. The type of the integer variable that will hold the result can be different from the types of the first two arguments.

For example:

__builtin_add_overflow( rhs, lhs, &result )

We can see from the gcc document Built-in Functions to Perform Arithmetic with Overflow Checking that:

[...]these built-in functions have fully defined behavior for all argument values.

clang also provides a set of checked arithmetic builtins:

Clang provides a set of builtins that implement checked arithmetic for security critical applications in a manner that is fast and easily expressable in C.

in this case the builtin would be:

__builtin_sadd_overflow( rhs, lhs, &result )
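Putting it together, a minimal wrapper over the gcc/clang builtin might look like this (the checked_add name is my own; the builtin stores the wrapped result and returns true on overflow):

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true and stores lhs + rhs in *sum if no overflow occurs;
 * returns false on overflow. __builtin_add_overflow has fully
 * defined behavior for all argument values, so no input causes UB. */
bool checked_add(int lhs, int rhs, int *sum)
{
    return !__builtin_add_overflow(lhs, rhs, sum);
}
```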
Shafik Yaghmour
  • This function appears to be *very* useful except for one thing: `int result; __builtin_add_overflow(INT_MAX, 1, &result);` does not explicitly say what is stored in `result` on overflow and unfortunately is quiet on specifying _undefined behavior_ does _not_ occur. Certainly that was the intent - no UB. Better if it specified that. – chux - Reinstate Monica Aug 31 '15 at 18:38
  • @chux good point, it states [here](https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html) the result is always defined, I updated my answer. It would be rather ironic if that was not the case. – Shafik Yaghmour Aug 31 '15 at 18:42
  • Interesting your new reference does not have a `(unsigned) long long *result` for `__builtin_(s/u)addll_overflow`. Certainly these are in error. Makes one wonder as to the veracity of other aspects. IAC, good to see these `__builtin_add/sub/mull_overflow()`. Hope they make it to the C spec someday. – chux - Reinstate Monica Aug 31 '15 at 19:40
  • +1 this generates _much_ better assembly than anything you could get in standard C, at least not without relying on your compiler's optimizer to figure out what you're doing. One should detect when such builtins are available and only use a standard solution when the compiler doesn't provide one. – Alex Reinking May 08 '19 at 10:41

The fastest possible way is to use the GCC builtin:

int add(int lhs, int rhs) {
    int sum;
    if (__builtin_add_overflow(lhs, rhs, &sum))
        abort();
    return sum;
}

On x86, GCC compiles this into:

    mov %edi, %eax
    add %esi, %eax
    jo call_abort 
    ret
call_abort:
    call abort

which uses the processor's built-in overflow detection.

If you're not OK with using GCC builtins, the next fastest way is to use bit operations on the sign bits. Signed overflow in addition occurs when:

  • the two operands have the same sign, and
  • the result has a different sign than the operands.

The sign bit of ~(lhs ^ rhs) is on iff the operands have the same sign, and the sign bit of lhs ^ sum is on iff the result has a different sign than the operands. So you can do the addition in unsigned form to avoid undefined behavior, and then use the sign bit of ~(lhs ^ rhs) & (lhs ^ sum):

int add(int lhs, int rhs) {
    unsigned sum = (unsigned) lhs + (unsigned) rhs;
    if ((~(lhs ^ rhs) & (lhs ^ sum)) & 0x80000000)
        abort();
    return (int) sum;
}

This compiles into:

    lea (%rsi,%rdi), %eax
    xor %edi, %esi
    not %esi
    xor %eax, %edi
    test %edi, %esi
    js call_abort
    ret
call_abort:
    call abort

which is quite a lot faster than casting to a 64-bit type on a 32-bit machine (with gcc):

    push %ebx
    mov 12(%esp), %ecx
    mov 8(%esp), %eax
    mov %ecx, %ebx
    sar $31, %ebx
    clt
    add %ecx, %eax
    adc %ebx, %edx
    mov %eax, %ecx
    add $-2147483648, %ecx
    mov %edx, %ebx
    adc $0, %ebx
    cmp $0, %ebx
    ja call_abort
    pop %ebx
    ret
call_abort:
    call abort
tbodt

IMHO, the easiest way to deal with overflow-sensitive C++ code is to use SafeInt<T>. This is a cross-platform C++ template hosted on CodePlex which provides the safety guarantees that you desire here.

I find it very intuitive to use, as it provides many of the same usage patterns as normal numerical operations and expresses overflows and underflows via exceptions.

JaredPar

If you use inline assembler you can check the overflow flag. Another possibility is that you can use a safeint datatype. I recommend reading this paper on Integer Security.

rook
  • +1 This is another way of saying "If C won't define it, then you're forced into platform-specific behavior." So many things that are easily taken care of in assembly are undefined in C, creating mountains out of molehills in the name of portability. – Mike DeSimone Oct 15 '10 at 17:39
  • Signed overflow is perfectly detectable and avoidable in plain C. -1 for suggesting asm instead of suggesting C code that will generate the correct asm on a good compiler. – R.. GitHub STOP HELPING ICE Oct 15 '10 at 18:00
  • @R I have to agree with Rook. – wheaties Oct 15 '10 at 18:17
  • @R. perfectly detectable yes, however for integer operations one needs to care about performance. Many libraries are quite intensive in their integer usage (compression / decompression, encoding / decoding), it would be great to be able to use safe integers there too. – Matthieu M. Oct 15 '10 at 18:28
  • I gave a downvote for an asm answer to a C question. As I have said, there are correct, portable ways to write the check in C which will generate the exact same asm that you would write by hand. Naturally if you use these, the performance impact will be the same, and it will be much less of an impact than the C++ safeint stuff you also recommended. – R.. GitHub STOP HELPING ICE Oct 15 '10 at 19:22
  • @Matthieu: If you are writing code that will only be used on one implementation, and that implementation guarantees that something will work, and you need good integer performance, you can certainly use implementation-specific tricks. That isn't what the OP was asking for, though. – David Thornley Oct 15 '10 at 20:25
  • Agreed with rook. I guess overflow flags are present in many platforms. – Amir Zadeh Oct 15 '10 at 20:29
  • C distinguishes implementation-defined behavior and undefined behavior for good reasons, and even if something with UB "works" in **the current version** of your implementation, that doesn't mean it will continue to work in future versions. Consider gcc and signed overflow behavior... – R.. GitHub STOP HELPING ICE Oct 15 '10 at 21:04
  • Since I based my -1 on a claim that we could get C code to generate the identical asm, I guess it's only fair to retract it when all the major compilers turn out to be junk in this regard.. – R.. GitHub STOP HELPING ICE Oct 16 '10 at 00:29
  • Defining "did last operation overflow" flag behavior can be very expensive, since optimal code generation often requires rearranging computations. If the expression `x+y` is evaluated within a loop but neither `x` nor `y` changes within that loop, it may be helpful for a compiler to evaluate that expression just once--before the loop starts. Such reordering could totally disrupt any code that tries to examine the overflow flag, however. – supercat Jun 06 '17 at 21:34

The obvious solution is to convert to unsigned, to get the well-defined unsigned overflow behavior:

int add(int lhs, int rhs) 
{ 
   int sum = (unsigned)lhs + (unsigned)rhs; 
   if ((lhs >= 0 && sum < rhs) || (lhs < 0 && sum > rhs)) { 
      /* an overflow has occurred */ 
      abort(); 
   } 
   return sum;  
} 

This replaces the undefined signed overflow behavior with the implementation-defined conversion of out-of-range values between signed and unsigned, so you need to check your compiler's documentation to know exactly what will happen, but it should at least be well defined, and should do the right thing on any twos-complement machine that doesn't raise signals on conversions, which is pretty much every machine and C compiler built in the last 20 years.
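For example, on an implementation with the usual two's-complement truncating conversion (gcc documents reduction modulo 2^N; clang and MSVC behave the same in practice), the whole check can be packaged like this (naming is mine):

```c
#include <limits.h>
#include <stdbool.h>

/* The unsigned addition wraps (well-defined); converting the result
 * back to int is implementation-defined, assumed here to wrap modulo
 * 2^N as gcc documents. Under that assumption, the sign comparisons
 * below detect exactly the out-of-range cases. */
bool add_overflows(int lhs, int rhs)
{
    int sum = (int)((unsigned)lhs + (unsigned)rhs);
    return (lhs >= 0 && sum < rhs) || (lhs < 0 && sum > rhs);
}
```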

Chris Dodd
  • You're still storing the result in `sum`, which is an `int`. That results in either an implementation-defined result or an implementation-defined signal being raised if the value of `(unsigned)lhs + (unsigned)rhs` is greater than `INT_MAX`. – R.. GitHub STOP HELPING ICE Oct 16 '10 at 04:56
  • @R: that's the whole point -- the behavior is implementation defined, rather than undefined, so the implementation must document what it does, and do it consistently. A signal can only be raised if the implementation documents it, in which case it must always be raised and you can use that behavior. – Chris Dodd Oct 16 '10 at 06:18

Your fundamental problem is that lhs + rhs doesn't do the right thing. But if you're willing to assume a two's complement machine, we can fix that. Suppose you have a function to_int_modular that converts unsigned to int in a way that is guaranteed to be the inverse of conversion from int to unsigned, and it optimizes away to nothing at run time. (See below for how to implement it.)

If you use it to fix the undefined behavior in your original attempt, and also rewrite the conditional to avoid the redundant test of lhs >= 0 and lhs < 0, then you get

int add(int lhs, int rhs)
{
 int sum = to_int_modular((unsigned)lhs + rhs);
 if (lhs >= 0) {
  if (sum < rhs)
    abort();
 } else {
  if (sum > rhs)
   abort();
 }
 return sum; 
}

which should outperform the current top-voted answer, since it has a similar structure but requires fewer arithmetic operations.

(Reorganizing the if shouldn't be necessary: in tests on godbolt, ICC and MSVC eliminate the redundant test on their own, but GCC and Clang surprisingly don't.)

If you prefer to compute the result in a wider size and then bounds check, one way to do the bounds check is

 long long sum = (long long)lhs + rhs;
 if ((int)sum != sum)
  abort();

... except that the behavior is undefined on overflow. But you can fix that with the same helper function:

 if (to_int_modular(sum) != sum)

This will probably outperform the current accepted answer on compilers that aren't smart enough to optimize it to a test of the overflow flag.

Unfortunately, testing (visual inspection on godbolt) suggests that GCC, ICC and MSVC do better with the code above than with the code in the accepted answer, but Clang does better with the code in the accepted answer. As usual, nothing is easy.


This approach can only work on architectures where the ranges of int and unsigned are equally large, and the specific implementations below also depend on its being two's complement. Machines not meeting those specs are vanishingly rare, but I'll check for them anyway:

static_assert(INT_MIN + INT_MAX == -1 && UINT_MAX + INT_MIN == INT_MAX);

One way to implement to_int_modular is

inline int to_int_modular(unsigned u) {
    int i;
    memcpy(&i, &u, sizeof(i));
    return i;
}

All major x64 compilers have no trouble optimizing that to nothing, but when optimizations are disabled, MSVC and ICC generate a call to memcpy, which may be a bit slow if you use this function a lot. This implementation also depends on details of the representation of unsigned and int that probably aren't guaranteed by the standard.

Another way is this:

inline int to_int_modular(unsigned u) {
    return u <= INT_MAX ? (int)u : (int)(u - INT_MIN) + INT_MIN;
}

All major x64 compilers optimize that to nothing except ICC, which makes an utter mess of it and every variation that I could think of. ICX does fine, and it appears that Intel is abandoning ICC and moving to ICX, so maybe this problem will fix itself.
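For what it's worth, the round-trip property of the second variant is easy to check (this is a self-contained copy of the function, for illustration only):

```c
#include <limits.h>

/* Maps u to the int with the same two's-complement bit pattern,
 * without UB: for u > INT_MAX, the unsigned subtraction u - INT_MIN
 * (INT_MIN converts to 2^(N-1)) lands in [0, INT_MAX], and adding
 * INT_MIN back yields the intended negative value. */
static int to_int_modular(unsigned u)
{
    return u <= INT_MAX ? (int)u : (int)(u - INT_MIN) + INT_MIN;
}
```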

benrg
  • You can add that with C2X making signed integer overflow defined (as all produced architectures now work on 2s complement), this can be simplified to the approach of Hacker's Delight: `var sum: ST = a +% b;` (+% being wrapping addition). `if (((sum ^ a) & (sum ^ b)) < 0) overflow(); return sum;` – Jay-Pi Dec 30 '21 at 16:42

You may have better luck converting to 64-bit integers and testing similar conditions like that. For example:

#include <stdint.h>

...

int64_t sum = (int64_t)lhs + (int64_t)rhs;
if (sum < INT_MIN || sum > INT_MAX) {
    // Overflow occurred!
}
else {
    return sum;
}

You may want to take a closer look at how sign extension will work here, but I think it is correct.

Jonathan
  • Note that this won't work in [ILP64](https://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_models) environments where `int`s are 64-bit. – rtx13 Apr 30 '20 at 17:22

How about:

int sum(int n1, int n2)
{
  int result;
  if (n1 >= 0)
  {
    result = (n1 - INT_MAX)+n2; /* Can't overflow */
    if (result > 0) return INT_MAX; else return (result + INT_MAX);
  }
  else
  {
    result = (n1 - INT_MIN)+n2; /* Can't overflow */
    if (0 > result) return INT_MIN; else return (result + INT_MIN);
  }
}

I think that should work for any legitimate INT_MIN and INT_MAX (symmetrical or not); the function as shown clips, but it should be obvious how to get other behaviors.

supercat
  • +1 for nice alternate approach that's perhaps more intuitive. – R.. GitHub STOP HELPING ICE Oct 16 '10 at 07:14
  • I think this - `result = (n1 - INT_MAX)+n2;` - could overflow, if n1 was small (say 0) and n2 was negative. – davmac Aug 05 '13 at 14:42
  • @davmac: Hmm... maybe it's necessary to break out three cases: start with one for `(n1 ^ n2) < 0`, which on a two's-complement machine would imply that the values have opposite sign and may be added directly. If the values have the same sign, then the approach given above would be safe. On the other hand, I'm curious if the authors of the Standard expected that implementations for two's-complement silent-overflow hardware would jump the rails in case of overflow in a way that didn't force an immediate Abnormal Program Termination, but caused unpredictable disruption of other computations. – supercat Sep 06 '16 at 20:51

In case of adding two long values, portable code can split the long value into low and high int parts (or into short parts in case long has the same size as int):

static_assert(sizeof(long) == 2*sizeof(int), "");
long a, b;
int ai[2] = {int(a), int(a >> (8*sizeof(int)))};
int bi[2] = {int(b), int(b >> (8*sizeof(int)))};
... use the 'long' type to add the elements of 'ai' and 'bi'

Using inline assembly is the fastest way if targeting a particular CPU:

long a, b;
bool overflow;
#ifdef __amd64__
    asm (
        "addq %2, %0; seto %1"
        : "+r" (a), "=ro" (overflow)
        : "ro" (b)
    );
#else
    #error "unsupported CPU"
#endif
if(overflow) ...
// The result is stored in variable 'a'
atomsymbol

To me, the simplest check would be checking the signs of the operands and of the result.

Let's examine sum: the overflow could occur in both directions, + or -, only when both operands have the same sign. And, obviously, the overflow has occurred when the sign of the result is not the same as the sign of the operands.

So, a check like this will be enough:

int a, b, sum;
sum = a + b;
if (((a ^ ~b) & (a ^ sum)) & 0x80000000)
    detect_overflow();

Edit: as Nils suggested, this is the correct if condition:

((((unsigned int)a ^ ~(unsigned int)b) & ((unsigned int)a ^ (unsigned int)sum)) & 0x80000000)

And since when does the instruction

add eax, ebx 

lead to undefined behavior? There is no such thing in the Intel x86 instruction set reference...

ruslik
  • You're missing the point here. Your second line of code `sum = a + b` could yield undefined behavior. – Channel72 Oct 15 '10 at 21:12
  • if you cast sum, a and b to unsigned during your test-addition your code will work btw.. – Nils Pipenbrinck Oct 15 '10 at 21:27
  • It's undefined not because the program will crash or will behave differently. It's the exact thing the processor is doing to compute the OF flag. The standard is just trying to protect itself from non-standard cases, but it doesn't mean that you are not allowed to do this. – ruslik Oct 15 '10 at 21:30
  • @Nils yeah, i wanted to do that, but I thought that four `(unsigned int)`s will make it much more unreadable. (you know, you first read it, and try it only if you liked it). – ruslik Oct 15 '10 at 21:31
  • Well you could make it more readable by taking out the useless `int` keywords. `unsigned int` is just an overly verbose way of writing `unsigned`. – R.. GitHub STOP HELPING ICE Oct 16 '10 at 04:55
  • @ruslik, it's undefined which specifically means the compiler can _assume it won't happen_. In this case the compiler can assume that the addition won't overflow, and a really smart compiler could then recognise that the condition in the 'if' statement must then always be false and so completely remove the check; you're back to the original problem. – davmac Aug 05 '13 at 14:39
  • the undefined behavior is in C, not after compiling to assembly – phuclv Jul 02 '15 at 16:39

I think that this works:

int add(int lhs, int rhs) {
   volatile int sum = lhs + rhs;
   if (lhs != (sum - rhs) ) {
       /* overflow */
       //errno = ERANGE;
       abort();
   }
   return sum;
}

Using volatile keeps the compiler from optimizing away the test because it thinks that sum may have changed between the addition and the subtraction.

Using gcc 4.4.3 for x86_64, the assembly for this code does do the addition, the subtraction, and the test, though it stores everything on the stack and performs unneeded stack operations. I even tried register volatile int sum = but the assembly was the same.

For a version with only int sum = (no volatile or register) the function did not do the test and did the addition using only one lea instruction (lea is Load Effective Address and is often used to do addition without touching the flags register).

Your version is larger code and has a lot more jumps, but I don't know which would be better.

nategoose
  • -1 for misuse of `volatile` to mask undefined behavior. If it "works", you're still just "getting lucky". – R.. GitHub STOP HELPING ICE Oct 16 '10 at 00:30
  • @R: If it doesn't work the compiler isn't implementing `volatile` correctly. All I was trying for was a simpler solution to a very common problem on an already answered question. – nategoose Oct 18 '10 at 14:39
  • Where it might fail, though, would be a system whose numeric representation wrap to lower values upon overflow for integers. – nategoose Oct 18 '10 at 14:49
  • That last comment should have a "didn't" or "doesn't" in it. – nategoose Oct 18 '10 at 21:04
  • @nategoose, your assertion that "if it doesn't work the compiler isn't implementing volatile correctly" is wrong. For one thing, in two's complement arithmetic, it will always be true that lhs = sum - rhs even if overflow occurred. Even if that were not the case, and although this particular example is a bit contrived, the compiler might for instance generate code which performs the addition, stores the result value, reads the value back into another register, compares the stored value with the read value and notices that they are the same and therefore assume no overflow has occurred. – davmac Aug 05 '13 at 13:27
  • (you're also assuming that causing the overflow won't cause the comparison afterwards to go wrong or even be skipped, which "undefined behavior" allows). – davmac Aug 05 '13 at 13:55