3

In C++, overflow of a signed type is undefined behavior. Is the following example undefined behavior as well?

#include <limits.h>

int f() {
  int a = INT_MAX;
  int b = -1;
  return a + b;
}

It is not an overflow in the mathematical sense, but a CPU will probably see it as something like add 0x7fffffff 0xffffffff.

Stargateur
Paweł Bylica
  • The code you posted is not C++, there is no ".h" in C++, tag correctly – josemartindev Dec 13 '17 at 14:10
  • This works and does not have any errors? – Jake Freeman Dec 13 '17 at 14:10
  • 14
    You can't talk about undefined behavior in the *language* sense and conflate it with a CPU's view. No, it's not undefined behavior since there is no overflow. – StoryTeller - Unslander Monica Dec 13 '17 at 14:11
  • @joemartin94 Actually, older versions of C++ utilize .h files. – klutt Dec 13 '17 at 14:13
  • @klutt Personally I have never seen one – josemartindev Dec 13 '17 at 14:14
  • 2
    @joemartin94 Ever heard of turbo? – Passer By Dec 13 '17 at 14:14
  • @joemartin94 - Then your development career must have been a very sheltered existence – StoryTeller - Unslander Monica Dec 13 '17 at 14:14
  • 1
    @joemartin94, The [deprecated C headers](http://en.cppreference.com/w/cpp/header) still use .h. They're still perfectly standard. – chris Dec 13 '17 at 14:15
  • 6
    2,147,483,647 + (-1) = 2,147,483,646, just like 5 + (-1) = 4. – Retired Ninja Dec 13 '17 at 14:16
  • 2
    @joemartin94 However, you're completely right that OP should pick one. – klutt Dec 13 '17 at 14:16
  • why do you think there is an overflow / ub ? – 463035818_is_not_an_ai Dec 13 '17 at 14:18
  • 6
    I just don't understand this question, why would this be UB. You could even ask, whether `-1 + -1` is UB? It is `0xffffffff + 0xffffffff` after all. Of course it is not UB. – geza Dec 13 '17 at 14:21
  • @tobi303 I don't think the OP _think[s] there is an overflow_; they're just checking/confirming there isn't, and kudos to them for checking rather than assuming one way or the other. – TripeHound Dec 13 '17 at 14:46
  • @TripeHound I don't want to say that the question is nonsense. I completely agree that double-checking is better than blind faith. Nevertheless, imho the question would be nicer if the OP stated more clearly why they are worried. Read my last comment as "why are you worried that there might be an overflow?" – 463035818_is_not_an_ai Dec 13 '17 at 14:50
  • C is not C++ is not C. Please don't tag both languages for no reason; if you want an answer for both C and C++, make two questions. Let experts decide whether your question needs to be tagged with more than one language. I chose to keep the C++ tag because in C `int f();` is not a correct function declaration (it should be `int f(void);`). This is of course arbitrary; feel free to tag only C if you want. – Stargateur Dec 13 '17 at 15:02
  • @tobi303 I guess because of the juxtaposition of `INT_MAX` and `+`... there's no sane way that adding a negative _should_ give UB, but then – while many rules of UB make sense – for _some_ of the things defined as UB to _not_ reliably give what "common sense" would say is the "obviously correct" answer would seem to demand deliberate perversity on the part of a compiler-writer or chip-maker. – TripeHound Dec 13 '17 at 15:11
  • @Stargateur -- In C11 [empty parenthesis are fine for function declarators that are part of the function definition](http://port70.net/~nsz/c/c11/n1570.html#6.7.6.3p14), as is the case in OP code. – ad absurdum Dec 13 '17 at 15:12
  • @Stargateur And while conflating C/C++ would be wrong/unhelpful on many occasions, for straightforward questions where the answers are simple/short (e.g. not much more than a simple _yes/no_), I'd rather have one question with answer(s) covering both C and C++ (and standards-variants of each, if applicable) rather than hunt around multiple questions. – TripeHound Dec 13 '17 at 15:16
  • @DavidBowling This is why I said "declaration" I didn't use OP code for this reason, sorry it was arbitrary like I said. – Stargateur Dec 13 '17 at 15:19
  • @TripeHound On the contrary, this kind of question can easily produce a VERY long answer, https://stackoverflow.com/a/18721336/7076153. In any case, the OP cannot know in advance that the answer will be short enough to cover both C and C++, nor whether the answer is the same for C and C++. Please do not increase the number of questions that are too broad to be answered. C and C++ are now really different in a lot of behavior, such as overflow, and in coding practice. So no, there is no good reason to tag this question with both C and C++. Make two different questions if the languages are not the same. – Stargateur Dec 13 '17 at 15:23
  • @Stargateur I think we'll have to agree to disagree on this one – to me that's almost a perfect case for combining the two tags. The top two answers give fairly succinct answers to the level most people will need; one for C++, one for C. The third answer (the one you linked) goes into all the gory details for all variants of both languages, for the language-lawyers or those just curious. _To me_, it seems preferable to have all that in one place rather than spread across two or more questions. (I **do** accept that using both tags _can_ often be wrong; just not in these two instances). – TripeHound Dec 13 '17 at 15:47
  • @TripeHound Don't worry, I accept that we disagree. But consider this example: make two questions, one tagged C and one tagged C++, and mark one as a duplicate of the other. That way we don't need to answer twice, and if the answer to the duplicate ever stops being correct for that question, we can break the duplicate link and answer the question without a problem. With your solution we will have a problem, because all the answers will become outdated and updating them will be hard, since the question will be too broad. [perfect answer](https://meta.stackoverflow.com/a/358599/7076153) – Stargateur Dec 13 '17 at 16:02
  • @Stargateur -- that is the best reason to avoid combining C and C++ tags unless explicitly comparing the two, in my opinion. Answers which are the same for C and C++ today may not be the same tomorrow. – ad absurdum Dec 13 '17 at 16:15

3 Answers

9

The example you give is not an overflow.

From Wikipedia (https://en.wikipedia.org/wiki/Integer_overflow):

... an integer overflow occurs when an arithmetic operation attempts to create a numeric value that is outside of the range that can be represented with a given number of bits – either larger than the maximum or lower than the minimum representable value.

INT_MAX + (-1) is not outside of the range representable by the int type, and the result is defined.
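The range test described above can be written out directly. This is only a minimal sketch, assuming a C++14 compiler; `add_would_overflow` and `check_add_examples` are hypothetical names made up for illustration, not standard library functions. The comparisons are arranged so that the check itself never leaves the range of int.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: returns true if a + b is not representable in int.
// The subtractions below always stay within the range of int.
constexpr bool add_would_overflow(int a, int b) {
    if (b > 0) return a > INT_MAX - b;  // sum would exceed INT_MAX
    if (b < 0) return a < INT_MIN - b;  // sum would drop below INT_MIN
    return false;                       // adding zero never overflows
}

// Small self-check of the cases discussed here.
inline void check_add_examples() {
    assert(!add_would_overflow(INT_MAX, -1));  // the question's case: in range
    assert(add_would_overflow(INT_MAX, 1));    // genuinely out of range
    assert(add_would_overflow(INT_MIN, -1));   // out of range at the low end
    assert(!add_would_overflow(-1, -1));       // -2 is representable
}
```

Calling `check_add_examples()` from `main` should pass silently; only the second and third cases would be undefined behavior if the addition were actually performed.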

payne
7

Since the result of INT_MAX + (-1) is within the range of representable values of int, it is well defined.

You should stop viewing the language as a thin layer over assembly or machine code.

As far as the undefined-ness of a program is concerned, there is no CPU; there is only the abstract machine on which the program runs.
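One way to check this at the language level rather than on a CPU, as a sketch assuming a C++14 compiler: an expression that would have undefined behavior is not a constant expression, so the compiler has to reject it where a constant is required. Marking the question's `f` as `constexpr` (my modification, not part of the original code) lets the compiler confirm that the addition is fine. `check_f` is a hypothetical helper added only for the runtime check.

```cpp
#include <cassert>
#include <climits>

// The function from the question, marked constexpr so that it can be
// evaluated at compile time.
constexpr int f() {
    int a = INT_MAX;
    int b = -1;
    return a + b;
}

// Compiles: the addition stays in range, so f() is a valid constant expression.
static_assert(f() == INT_MAX - 1, "INT_MAX + (-1) is well defined");

// By contrast, a conforming compiler must reject the following line, because
// INT_MAX + 1 overflows and is therefore not a constant expression:
// constexpr int bad = INT_MAX + 1;

// Runtime counterpart of the same check.
inline void check_f() {
    assert(f() == INT_MAX - 1);
}
```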

Passer By
1

On two's-complement systems, carry and overflow are distinct concepts. Many such systems are designed to support multi-word arithmetic. A carry will be reported if the arithmetic sum, interpreting the values as unsigned, would exceed the range of the type. An overflow will be reported after any computation where the carry into the upper bit differs from the carry out, but will only be meaningful following the computation of the upper byte.

On an 8-bit system, for example, "int" would typically be two bytes, INT_MAX would be 0x7F:0xFF and -1 would be 0xFF:0xFF. Addition is performed by adding the two lower bytes, and then adding the two upper bytes with a carry from the lower.

  • Adding 0xFF to 0xFF yields 0xFE and carry but no overflow (carry both into and out of the upper bit).

  • Adding 0x7F to 0xFF, plus the carry from the lower byte, yields 0x7F and carry but no overflow (carry both into and out of the upper bit). The two-byte result is 0x7F:0xFE, i.e. INT_MAX - 1.
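The byte-by-byte addition can be simulated to see both flags. This is a rough sketch assuming a C++14 compiler, not a model of any particular CPU; `ByteAdd`, `adc8`, and `check_two_byte_add` are hypothetical names. Once the carry from the lower byte is counted, the upper-byte step is 0x7F + 0xFF + 1, which leaves 0x7F with a carry out and no overflow, so the two-byte result is 0x7F:0xFE.

```cpp
#include <cassert>
#include <cstdint>

// Result of one 8-bit add-with-carry step (illustrative names).
struct ByteAdd {
    std::uint8_t sum;  // low 8 bits of the result
    bool carry;        // carry out of bit 7
    bool overflow;     // carry into bit 7 differs from carry out of bit 7
};

// Adds two bytes plus an incoming carry, the way an 8-bit ALU would.
constexpr ByteAdd adc8(std::uint8_t x, std::uint8_t y, bool carry_in) {
    unsigned c = carry_in ? 1u : 0u;
    unsigned full = unsigned(x) + unsigned(y) + c;  // 9-bit result
    unsigned low7 = (x & 0x7Fu) + (y & 0x7Fu) + c;  // sum of bits 0..6
    bool carry_into_top = (low7 & 0x80u) != 0;
    bool carry_out = (full & 0x100u) != 0;
    return ByteAdd{static_cast<std::uint8_t>(full & 0xFFu), carry_out,
                   carry_into_top != carry_out};
}

// 16-bit INT_MAX + (-1), computed one byte at a time.
inline void check_two_byte_add() {
    ByteAdd lo = adc8(0xFF, 0xFF, false);     // lower bytes
    ByteAdd hi = adc8(0x7F, 0xFF, lo.carry);  // upper bytes plus carry
    assert(lo.sum == 0xFE && lo.carry && !lo.overflow);
    assert(hi.sum == 0x7F && hi.carry && !hi.overflow);
}
```

For contrast, `adc8(0x7F, 0x01, false)` sets the overflow flag without a carry: there is a carry into the upper bit but none out of it.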

Such details typically shouldn't matter to a C programmer, since compilers typically provide no way to access the overflow flag, nor any way of ensuring that calculations are performed in a way that would make the flag meaningful at any particular point in the code. Nonetheless, the C89 rationale notes:

C code can be non-portable.
Although it strove to give programmers the opportunity to write truly portable programs, the C89 Committee did not want to force programmers into writing portably, to preclude the use of C as a “high-level assembler”: the ability to write machine-specific code is one of the strengths of C. It is this principle which largely motivates drawing the distinction between strictly conforming program and conforming program (§4).

Note, btw, that the rationale for making small unsigned types promote to signed int strongly implies that the authors of C89 expected that something like the following should be safe on commonplace platforms:

unsigned mul_mod_65536(unsigned short x, unsigned short y)
{ return (x*y) & 0xFFFF; }

Nonetheless, such code may malfunction on commonplace 32-bit platforms when processed by excessively "clever" optimizing compilers like gcc. On such platforms both operands are promoted to signed int before the multiplication, so a product above INT_MAX is a signed overflow. There is no reason that machine code generated for such a function should care about whether the value of x*y exceeds INT_MAX, since the upper bits of the result are going to get chopped off and the lower bits would be correct in any case. In some cases, however, if gcc knows e.g. that y will be 65535, it may decide that since x*y "can't" exceed INT_MAX, x "can't" exceed 0x8000. The authors of the Standard may not have wanted to preclude the possibility of C being used as a high-level assembler, but that doesn't mean that compiler writers share that sentiment.
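A common way to sidestep the problem is to force the multiplication to be done in unsigned arithmetic, where wraparound is defined. A minimal sketch of that idea follows; `mul_mod_65536_safe` is a hypothetical name, and the function is marked `constexpr` only for convenience.

```cpp
#include <cassert>

// Variant of mul_mod_65536 that forces the multiplication to be unsigned.
// Multiplying by 1u converts x, and then y, to unsigned int, so the product
// wraps modulo UINT_MAX + 1 instead of overflowing a signed int. The low
// 16 bits of the wrapped product are still the low 16 bits of the true one.
constexpr unsigned mul_mod_65536_safe(unsigned short x, unsigned short y) {
    return (1u * x * y) & 0xFFFFu;
}

// Small self-check, including operands whose product exceeds INT_MAX
// when int is 32 bits wide.
inline void check_mul_mod() {
    assert(mul_mod_65536_safe(2, 3) == 6u);
    assert(mul_mod_65536_safe(65535, 65535) == 1u);
    assert(mul_mod_65536_safe(0x8001, 0xFFFF) == 0x7FFFu);
}
```

Writing `static_cast<unsigned>(x) * y` would have the same effect; the `1u *` form is just a compact way to trigger the conversion.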

supercat