
I came to know through this answer that:

Signed overflow due to computation is still undefined behavior in C++20, while signed overflow due to conversion is well defined in C++20 (it was implementation-defined before C++20).

And this change for signed overflow due to conversion was made because, from C++20, compilers are required to use two's complement.

My question is:

If compilers are required to use two's complement from C++20, then why isn't signed overflow due to computation well-defined, just like signed overflow due to conversion?

That is, why (and how) is there a difference between overflow due to computation and overflow due to conversion? Essentially, why are these two kinds of overflow treated differently?
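For concreteness, here is a minimal sketch of the two cases being contrasted (not part of the original question; it assumes a 32-bit `int`):

```cpp
#include <limits>

int main() {
    int x = std::numeric_limits<int>::max();

    // Overflow due to computation: the mathematical result x + 1 is not
    // representable in int, so this is undefined behaviour (still UB in C++20).
    int a = x + 1;

    // Overflow due to conversion: 2147483648u is not representable in int,
    // so the value is converted modulo 2^32, yielding INT_MIN.
    // Well defined in C++20; implementation-defined before C++20.
    int b = static_cast<int>(2147483648u);

    (void)a; (void)b;  // silence unused-variable warnings
}
```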

Jason
  • Some CPUs generate a CPU exception on overflow - example: MIPS – Martin Rosenau Jan 21 '22 at 12:44
  • It allows some optimizations, such as `a - 1 < b - 1` <=> `a < b`. – Jarod42 Jan 21 '22 at 12:51
  • @MartinRosenau Could you post supporting references? I tried to find some, like in [this question](https://stackoverflow.com/questions/23234189/arithmetic-overflow-in-mips), but the provided link has expired. – o_oTurtle Dec 24 '22 at 06:08
  • @o_oTurtle You might use the [MIPS R4400 CPU manual](https://groups.csail.mit.edu/cag/raw/documents/R4400_Uman_book_Ed2.pdf): You may compare the `ADD` instruction (page A-11) to the `ADDU` instruction (page A-14). – Martin Rosenau Dec 29 '22 at 16:06
  • The situations are different. Signed integer arithmetic overflow is considered a programming defect by the vast majority, so we have to fix such defects by programming them differently, regardless of two's complement or whatever. Standard-defined behaviour on overflow when converting to a signed integer is useful, as conversion from unsigned char to char or from std::byte to char can be needed in any project, and extra code for it feels pointless. – Öö Tiib Feb 20 '23 at 11:00

1 Answer


If non-two's-complement support had been the only concern, then signed arithmetic overflow could have been defined as having an implementation-defined result, just as integer conversion has been. There are reasons why it is UB instead, and those reasons haven't changed; nor have the rules of signed arithmetic overflow changed.

For any piece of UB, there are essentially two primary reasons for it to exist:

  • Portability. Different systems behave in different ways, and UB allows all of them to be supported in an optimal way. In this case, as Martin Rosenau mentions in a comment, there are systems that don't simply produce a "wrong" value but trap instead.
  • Optimisation. UB allows a compiler to assume that it doesn't happen, which enables optimisations based on that assumption. Jarod42 shows an example in a comment. Another example is that, with overflow being UB, the compiler can deduce that adding two positive numbers never produces a negative number, nor a number smaller than either of the operands (see the sketch after this list).
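The sketch below is not part of the original answer; it just illustrates the kind of deduction described in the second bullet (the function names are made up):

```cpp
// Signed overflow is UB, so the compiler may assume a + b does not wrap;
// for positive a and b it can fold `a + b > a` to `true`.
bool sum_exceeds_signed(int a, int b) {
    if (a > 0 && b > 0)
        return a + b > a;   // typically optimised to `return true;`
    return false;
}

// Unsigned arithmetic wraps with well-defined behaviour, so the same
// comparison cannot be folded away: a + b may wrap to a small value.
bool sum_exceeds_unsigned(unsigned a, unsigned b) {
    return a + b > a;       // must actually be evaluated
}
```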
eerorika
  • Sort of related to your first point, a reasonably common reason is that there are platforms that have a behaviour that cannot be (easily) worked around. In the case of integer overflow, there are CPU instruction sets in which signed integer overflow results in a hardware exception/interrupt, and working around that (e.g. checking EVERY operation involving integers before doing them, trapping and clearing hardware exceptions) would be expensive (e.g. lots of additional instructions and performance hit). – Peter Jan 21 '22 at 13:14
  • **For your second point:** UB is frequently used for optimization, this I agree, but AFAIK the standard never provides UB _for_ optimization; rather, it's the other way around: optimizers use UB _because it is already there_. The first point is, I believe, the _only_ technical reason. There are plenty of cases of the C++ standard relaxing UB restrictions in newer standards, even though compilers may previously have used these for optimizations. UB is often a result of the ability to support it in the C++ abstract machine being too complicated at the time (or, sometimes, oversight). – Human-Compiler Jan 21 '22 at 14:59
  • One of the most important benefits of signed `int` overflow being UB is that indexing arrays with an `int` loop counter can still optimize that to a pointer increment or whatever, without extra checks that the loop will definitely terminate without wrapping. See [Is there some meaningful statistical data to justify keeping signed integer arithmetic overflow undefined?](https://stackoverflow.com/q/56047702) and http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html – Peter Cordes Oct 18 '22 at 10:20 (see also the sketch after these comments)
  • The Undefined Behavior coin has two faces. The one most referred to is "optimizations". However, the other face of the coin is "diagnostics", often disregarded, but equally important. Without UB, how would the compiler know when we're doing it wrong? UB is our friend. In fact, the "portability" mentioned in this answer is in fact about diagnostics; that trap is there for a good reason. – alx - recommends codidact Apr 07 '23 at 20:34
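A hedged sketch of the loop-counter case mentioned in the comments above (the functions are illustrative only; what a given optimiser actually does is compiler-dependent):

```cpp
// Signed counter: overflow is UB, so the compiler may assume the loop runs
// exactly n + 1 times and rewrite p[i] as a simple pointer increment, even
// on a 64-bit target where int is narrower than a pointer.
void scale(float* p, int n) {
    for (int i = 0; i <= n; ++i)
        p[i] *= 2.0f;
}

// Unsigned counter: wrapping is well defined, so the compiler must allow
// for n == UINT_MAX (the loop never terminates) and for i wrapping to 0,
// which can block the same transformation.
void scale_wrapping(float* p, unsigned n) {
    for (unsigned i = 0; i <= n; ++i)
        p[i] *= 2.0f;
}
```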