21

Why do C and C++ not provide a set of implementation-provided functions to perform each of the basic integer operations with overflow checking (e.g. a `bool safeAdd(int *out, int a, int b)`)?

As I understand it, most instruction sets have ways to tell whether an operation overflowed (e.g. the x86 overflow and carry flags) and also define what happens in the case of signed integers.

As such, should compilers not be capable of doing a far better job, creating simpler and faster operations, than what is possible to code in C and C++?
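For concreteness, here is a minimal sketch of what such a helper has to look like in portable C today (the name and signature are just the hypothetical ones suggested above); the comparisons must happen *before* the addition, because the overflowing addition itself would already be undefined behaviour:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: on success, stores a + b in *out and returns true;
 * on overflow, leaves *out untouched and returns false. */
bool safeAdd(int *out, int a, int b)
{
    /* Check against the limits first; doing the addition and then testing
     * the result would rely on undefined behaviour. */
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}
```

The point of the question is that a compiler could instead emit a single add followed by a branch on the CPU's overflow flag.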

Bill the Lizard
Fire Lancer
  • I don't think adding checks will get you "faster operations". – Alexander Chertov Aug 22 '12 at 10:58
  • "faster" than trying to work out before the operation whether it might overflow or not. e.g. IIRC for x86 signed addition you can just check the OF flag with a conditional jump – Fire Lancer Aug 22 '12 at 11:00
  • btw, there is `boost::numeric_cast` for safe numeric conversions. – mirt Aug 22 '12 at 11:01
  • I know "how to do it", but is an optimising compiler going to take those sort of checks and turn your `if(additionIsSafe(inta,intb))` into a `ADD a, b \n JO overflow`? – Fire Lancer Aug 22 '12 at 11:03
  • It's a philosophical question why it is not in the language, but here are answers on how to do it anyway: http://stackoverflow.com/questions/199333/best-way-to-detect-integer-overflow-in-c-c. – Lubo Antonov Aug 22 '12 at 11:05
  • Take a look at [this stuff](http://ptgmedia.pearsoncmg.com/images/0321335724/samplechapter/seacord_ch05.pdf) for some discussion about overflow handling. It does not mention _why_ it wasn't considered in C, but simply states that the standard allows for integer arithmetic to be modular, and so an "overflow" in that situation is considered to be desired and expected behavior. Bit daft, but also a bit late to do anything about it now. – Rook Aug 22 '12 at 11:06
  • @FireLancer In the case of x86, there is (or used to be) a special instruction (`INTO`) just for this. Intel's announced intention was that compilers use it. As far as I know, none ever did (including Intel's own compilers); the original PL/M compiler I used on 8086 had an option to use it, but it wasn't actually implemented. – James Kanze Aug 22 '12 at 11:08
  • @FireLancer A good compiler could probably hoist a lot of the checks up to a higher level, in the same way a good programmer does. The difference being that the good compiler would not forget to correct the hoisted checks when the expressions at the lower levels changed. – James Kanze Aug 22 '12 at 11:10
  • If there was a check for overflow, what should the code do when it happens? Throwing an exception is out: exceptions are thrown explicitly, and that's a critical notion for reasoning about correctness. Abort? Really? – Pete Becker Aug 22 '12 at 12:01
  • I wasn't thinking of a general check-everything scheme, more a set of explicit functions that return true if everything was fine, or return false if not and leave the `*out` value alone – Fire Lancer Aug 22 '12 at 12:16
  • @PeteBecker That's the nice thing about undefined behavior. The implementor can do whatever is best for his customers. (In most cases, I'd go with aborting. If it gets to that point, there's an error in the code upstream, and you don't know what all else is wrong.) – James Kanze Aug 22 '12 at 12:27

6 Answers

10

C and C++ follow a central tenet of "You don't pay for what you don't need". So the default arithmetic operations aren't going to stray from the underlying architecture's single instruction for arithmetic operations.

As to why there isn't a standard library function for adding two integers and detecting overflow, I can't say. First of all, it appears the language defines signed integer overflow as undefined behavior:

In the C programming language, signed integer overflow causes undefined behavior,

Considering that there are multiple ways to implement signed integers (one's complement, two's complement, etc.) and that, when C was created, these architectures were all prevalent, it's understandable why this is undefined. It would be hard to implement a "safe*" pure C function without lots of information about the underlying platform. It could, however, be done on a CPU-by-CPU basis.

Still, that doesn't make it impossible. I'd definitely be interested if someone could find proposals made to the C or C++ standards bodies for safer overflow helpers, and see why they were rejected.

Regardless, there are many ways in practice to detect arithmetic overflows and libraries to help.
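As an illustration of why library help is welcome here, a pre-check for signed multiplication is considerably more fiddly than the addition case; a common sketch (my assumption of the usual approach, not taken from any particular library) looks like this:

```c
#include <limits.h>
#include <stdbool.h>

/* Sketch: stores a * b in *out and returns true, or returns false if the
 * product would not fit in an int. Every test happens before multiplying,
 * so no undefined behaviour is ever triggered. */
bool safeMul(int *out, int a, int b)
{
    if (a > 0) {
        if (b > 0) { if (a > INT_MAX / b) return false; }   /* + * + */
        else       { if (b < INT_MIN / a) return false; }   /* + * - */
    } else if (a < 0) {
        if (b > 0) { if (a < INT_MIN / b) return false; }   /* - * + */
        else       { if (b < INT_MAX / a) return false; }   /* - * - */
    }
    /* a == 0: the product is 0 and can never overflow. */
    *out = a * b;
    return true;
}
```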

Doug T.
6

Probably because there is no demand for it. Arithmetic overflow is undefined behavior, expressly to allow implementations to do such checks. If compiler vendors thought that doing them would sell more compilers, they would.

In practice, it would be very, very difficult for a compiler to do them more effectively than the programmer can. It's pretty much standard procedure to validate all numeric input against ranges where you can prove that later operations cannot overflow. All good programmers do this as a matter of habit. So this means one quick if immediately after input, and no further checking.

Still, programmers have been known to make mistakes, and it's simple to forget to correct the validation when you change the calculations later. I'd like to see such a feature in a compiler. But apparently, it won't help sell compilers, or at least the vendors believe that it won't, so we don't get it.
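A minimal sketch of that idiom (the concrete limits are invented for the example): validate once at the input boundary, and the later arithmetic is then provably safe without any per-operation checks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Invented application limits used only for this example. */
#define MAX_ITEMS  100000
#define MAX_PRICE  1000000

/* One quick check immediately after input. */
bool input_is_valid(int32_t quantity, int32_t unit_price)
{
    return quantity   > 0 && quantity   <= MAX_ITEMS &&
           unit_price > 0 && unit_price <= MAX_PRICE;
}

/* After validation the product is at most 10^11, which always fits in
 * int64_t, so no further overflow checking is needed here. */
int64_t total_cost(int32_t quantity, int32_t unit_price)
{
    return (int64_t)quantity * unit_price;
}
```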

James Kanze
  • I wonder what (if any) proposals have ever been made to C or C++'s standards committees for safer arithmetic functions/libraries. – Doug T. Aug 22 '12 at 11:19
  • @DougT. - nothing like that in general, in part because it's not clear that there's something "safer". Atomics do guarantee that overflow and underflow of signed values wrap as if the representation was 2s-complement. – Pete Becker Aug 22 '12 at 11:55
  • There's a dynamic [Integer Overflow Checker](http://embed.cs.utah.edu/ioc/) for C and C++ code. – caf Aug 22 '12 at 13:19
  • @caf: I know that Richard Smith is planning to implement in Clang a mode where as many instances of undefined behavior as possible will abort the program, and that integer overflow is part of it (Clang already has some specific modes but nothing comprehensive yet). I think that a static analyzer could potentially catch such overflows... but the programs are probably not annotated enough. – Matthieu M. Aug 22 '12 at 15:09
  • @MatthieuM. That's something I've wanted for a long time. (Another thing would be a virtual machine which simulates very strange environments, to test for accidental dependencies on implementation defined things. A virtual machine which would allow 9 bit bytes, trapping representations in int, etc., etc.) – James Kanze Aug 22 '12 at 15:54
5

The question pops up regularly.

First, remember that C is defined to be portable and efficient. As such, it was designed to provide only operations that were supported by a lot of hardware (probably before x86 even saw the light of day).

Second, a number of compilers provide (or plan to provide) built-ins for such operations, so that users may write class types that use those built-ins under the hood. The quality of the implementation of the built-ins matters less (though it does matter) than the fact that a compiler aware of their meaning may optimize the checks out when they are provably useless.

Finally, there are other ways to actually check programs. For example, static analysis or special compilation modes and unit tests may detect those flaws early and avoid the need (more or less completely) to embed those overflow checks in Release builds.
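For example (a compiler extension, not standard C: the generic `__builtin_add_overflow` family is provided by reasonably recent GCC and Clang), such a built-in lets the compiler choose the best instruction sequence itself, typically an add followed by a test of the overflow flag on x86:

```c
#include <stdbool.h>
#include <stdio.h>

/* Requires a compiler that offers the __builtin_*_overflow family
 * (recent GCC and Clang). The built-in returns true on overflow. */
static bool checked_add(int a, int b, int *out)
{
    return !__builtin_add_overflow(a, b, out);
}

int main(void)
{
    int sum;
    if (checked_add(2000000000, 2000000000, &sum))
        printf("sum = %d\n", sum);
    else
        puts("overflow detected");
    return 0;
}
```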

Matthieu M.
4

A better question might be: why is integer overflow undefined behavior? In practice, 99.9% of all CPUs use two's complement and a carry/overflow bit. So in the real world, on an assembler/opcode level, integer overflows are always well-defined. In fact a whole lot of assembler, or hardware-related C, relies heavily on well-defined integer overflows (drivers for timer hardware in particular).

The original C language, before standardization, probably didn't consider things like this in detail. But when C got standardized by ANSI and ISO, they had to follow certain standardization rules. ISO standards aren't allowed to be biased towards a certain technology and thereby give a certain company advantages in competition.

So they had to consider that some CPUs may possibly implement obscure things like one's complement, "sign and magnitude" or "some implementation-defined manner". They had to allow signed zeroes, padding bits and other obscure signed integer mechanisms.

Because of it, the behavior of signed numbers turned wonderfully fuzzy. You can't tell what happens when a signed integer in C overflows, because signed integers may be expressed in two's complement, one's complement, or possibly some other implementation-defined madness. Therefore integer overflows are undefined behavior.

The sane solution to this problem wouldn't be to invent some safe range checks, but rather to state that all signed integers in the C language shall have two's complement format, end of story. Then a signed char would always range from -128 to 127 and wrap from 127 back to -128, and everything would be well-defined. But artificial standard bureaucracy prevents the standard from turning sane.

There are many issues like this in the C standard: alignment/padding, endianness, etc.
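The well-defined counterpart the answer alludes to is unsigned arithmetic, which the standard requires to wrap modulo 2^N; a hypothetical timer driver (the register name is invented for the example) can rely on exactly that:

```c
#include <stdint.h>

/* Hypothetical memory-mapped, free-running 32-bit tick counter. */
extern volatile uint32_t TIMER_TICKS;

/* Ticks elapsed since 'start'. Unsigned arithmetic is defined to wrap
 * modulo 2^32, so the result stays correct even after the counter rolls
 * over; the same pattern with a signed counter would be undefined. */
uint32_t ticks_since(uint32_t start)
{
    return TIMER_TICKS - start;
}
```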

Lundin
  • The problem I came across there is that even if the target machine is two's complement, if you write a check where you perform, say, a signed addition and then check to see if the value wrapped around, the compiler can remove it as an impossible code branch (gcc for example, IIRC); see the sketch after these comments. And regardless of implementation, a `bool addWithOverflowCheck(int*,int,int)` or whatever you want to call it seems something that can be done more optimally by the compiler on nearly any platform – Fire Lancer Aug 22 '12 at 12:24
  • The reason why integer overflow is undefined is so that implementors can define it, in whatever way is best for their customers. The intent is that some implementors do implement checks. – James Kanze Aug 22 '12 at 12:29
  • `hardware-related C, relies heavily on well-defined integer overflows` Does it mean they compile with all optimization turned off? LLVM, for example, [throws out the code that overflows](http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html) – Cubbi Aug 22 '12 at 12:48
  • @Cubbi No it most likely means that they directly access timer hardware registers declared with `volatile`, so that the optimizer can make no assumptions. – Lundin Aug 22 '12 at 13:23
  • If it was just about integer type (2s complement vs. 1s complement vs. sign-magnitude) then it would have been left implementation-defined, rather than being completely undefined behaviour. The undefined behaviour was more about implementations that trap on overflow. – caf Aug 22 '12 at 13:54
  • @caf Yes that is correct, strictly speaking which one of 2's, 1's etc that is used is implementation-defined. But using various operators on implementations with or without negative zero etc is unspecified/undefined. Section 6.2.6.2 of the standard lists all combinations of these oddities. – Lundin Aug 22 '12 at 14:03
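To make the comment above concrete, here is the kind of after-the-fact check that an optimizer is entitled to throw away, because the overflowing addition it inspects is already undefined behaviour (the function name is just illustrative):

```c
#include <stdbool.h>

/* Deliberately broken: if a + b overflows, undefined behaviour has
 * already happened on the first line, so the compiler may assume it
 * did not and delete the wrap-around tests as dead code.
 * A well-defined version must compare against INT_MAX/INT_MIN before
 * adding, as in the safeAdd sketch under the question. */
bool addWithOverflowCheck_broken(int *out, int a, int b)
{
    int sum = a + b;                      /* UB here on overflow */
    if ((b > 0 && sum < a) || (b < 0 && sum > a))
        return false;                     /* may be optimized away */
    *out = sum;
    return true;
}
```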
4

Because it is rarely ever needed. When would you actually need to detect integer overflow? In nearly all situations where you need to check some range, it is usually you who defines the actual range, because that range depends entirely on the application and algorithm.

When do you really need to know if a result has overflowed the range of int, rather than knowing if a result is inside the allowed domain for a particular algorithm, or if an index is inside the bounds of an array? It is you who gives your variables their semantics; the language specification only provides the overall ranges of the types, and if you chose a type whose range doesn't fit your needs, then it's your fault.

Integer overflow is UB, because you seldom really care about it. If my unsigned char overflows during operations, I have probably chosen the wrong type for accumulating 10 million numbers. But knowing about the overflow during runtime won't help me, since my design is broken anyway.
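A small sketch of the answer's point (the bound and names are invented for the example): the meaningful check is against the algorithm's own domain, and that check also rules out any overflow question.

```c
#include <stdbool.h>
#include <stddef.h>

/* Invented application-defined limit for this example. */
#define MAX_SAMPLES 1024

/* The useful question is not "did the index overflow its type?" but
 * "is the index inside the buffer?" - checking the latter is what the
 * application actually needs, and it makes the former irrelevant. */
bool store_sample(int samples[], size_t count, size_t index, int value)
{
    if (count > MAX_SAMPLES || index >= count)
        return false;
    samples[index] = value;
    return true;
}
```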

Christian Rau
  • There are many situations where the only upper bound on the size of numbers a program can accept in its input will be a function of integer overflow. When integer overflow occurs, there may not always be much a program can do about it, but triggering some kind of trap may be better than having the program overwrite what had been valid data with invalid data, and may ensure that someone gets notified of a problem before it leads to bigger problems down the road. – supercat Jan 10 '14 at 23:45
  • almost always when you need to operate both on numbers and gmt for instance -_- what kind of an answer is that? – Enerccio Dec 18 '22 at 10:37
3

Why? Well, because they weren't in C when C++ was started from it, and because since then nobody has proposed such functions and succeeded in convincing compiler makers and committee members that they are useful enough to be provided.

Note that compilers do provide intrinsics of this kind, so it isn't that they are against them.

Note as well that there are proposals to standardize things like Fixed-Point Arithmetic and Unbounded-Precision Integer Types.

So it is probably just that there isn't enough interest.

AProgrammer