
In another question, the topic of std::numeric_limits<int>::is_modulo came up. But the more I think about it, the more it seems like something is wrong with the spec or with GCC or both.

Let me start with some code:

#include <limits>
#include <iostream>

bool test(int x)
{
    return x+1 > x;
}

int main(int argc, char *argv[])
{
    int big = std::numeric_limits<int>::max();

    std::cout << std::numeric_limits<int>::is_modulo << " ";
    std::cout << big+1 << " ";
    std::cout << test(big) << "\n";
}

When I compile this with g++ -O3 -std=c++11 (x86_64 GCC 4.7.2), it produces the following output:

1 -2147483648 1

That is, is_modulo is true, one plus INT_MAX is negative, and one plus INT_MAX is greater than INT_MAX.

If you are the sort of person with any realistic chance of answering this question, you already know what happened here. The C++ specification says that signed integer overflow is Undefined Behavior; the compiler is allowed to assume you never invoke it; therefore the x in x+1 cannot be INT_MAX; therefore the compiler may (and will) compile the test function to return true unconditionally. So far, so good.
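
In other words, under -O3 the compiler effectively treats test as if it were written like this (a sketch of the transformation, not actual GCC output; the function name is mine):

bool test_as_optimized(int /*x*/)
{
    // Signed overflow is UB, so the compiler may assume x+1 never
    // overflows; under that assumption, x+1 > x holds for every x.
    return true;
}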

However, the C++11 spec also says (18.3.2.4 paragraphs 60-61):

static constexpr bool is_modulo;

True if the type is modulo. A type is modulo if, for any operation involving +, -, or * on values of that type whose result would fall outside the range [min(),max()], the value returned differs from the true value by an integer multiple of max() - min() + 1.

On most machines, this is false for floating types, true for unsigned integers, and true for signed integers.
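
For contrast, unsigned arithmetic genuinely satisfies this definition, so the analogous experiment is well defined. A minimal sketch (mine, not from the standard):

#include <limits>
#include <iostream>

int main()
{
    unsigned big = std::numeric_limits<unsigned>::max();

    // The true mathematical result is max()+1; the value actually
    // produced is 0, which differs from it by exactly
    // max() - min() + 1 = 2^32. That is what "modulo" requires.
    std::cout << big + 1 << "\n";  // prints 0
}

By the same definition, is_modulo being true for int would require big+1 in my program above to come out as INT_MIN, since the true value 2147483648 differs from -2147483648 by exactly 2^32.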

Note that section 5, paragraph 4 still reads, "If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined." There is no mention of is_modulo == true creating an exception.

So it looks to me like the standard is logically contradictory, because integer overflow cannot simultaneously be defined and undefined. Or at the very least, GCC is non-conforming, because it has is_modulo as true even though signed arithmetic most certainly does not wrap around.
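
(Aside: GCC does offer genuinely wrapping signed arithmetic as an opt-in via its -fwrapv flag. With that flag I would expect my program above to behave modulo; the file name here is my own:)

$ g++ -O3 -std=c++11 overflow.cpp && ./a.out
1 -2147483648 1

$ g++ -O3 -std=c++11 -fwrapv overflow.cpp && ./a.out
1 -2147483648 0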

Is the standard buggy? Is GCC non-conforming? Am I missing something?

– Nemo
  • While I feel uncomfortable about it being simultaneously defined and undefined, gcc seems consistent to me: `max()+1 == min()`. – BoBTFish Nov 07 '12 at 15:48
  • @BoBTFish And how is that consistent then? `max()+1 == min()` and `min() > max()`? – R. Martinho Fernandes Nov 07 '12 at 15:50
  • I think the defined/undefined question is a red herring. The language doesn't require overflow to give defined behavior, but *if* the implementation wants to define the behavior, it's free to do so, and when it does so in a particular way, tells you that. – Jerry Coffin Nov 07 '12 at 16:16
  • @JerryCoffin: OK, but when GCC says `is_modulo` is true, doesn't that imply the behavior is defined, thus prohibiting the optimization in this case? – Nemo Nov 07 '12 at 16:20
  • @Nemo: I guess now I'll have to read your earlier question to figure out what optimization you're talking about... – Jerry Coffin Nov 07 '12 at 16:27
  • @JerryCoffin: I mean the optimization returning `true` in the `test` function in this question. Surely if arithmetic "is_modulo", INT_MAX+1 cannot be greater than INT_MAX? – Nemo Nov 07 '12 at 16:36
  • Interestingly enough, clang optimizes this out even with -O1. – Ivan Nov 07 '12 at 16:55
  • From the point of modulo arithmetic, this can be seen as a bug, but I see how someone could argue that the undefined behaviour expands to comparison. – Ivan Nov 07 '12 at 17:05
  • @Nemo: I don't think that would necessarily follow. I would think that having `is_modulo` returns true should forbid a compiler from having integer overflow violate the laws of time and space the way some compilers like to do, but I don't think anything would forbid a compiler from having 2^33+1 values of 32-bit `int`: the 2^32 normal ones, a totally-indeterminate one where every read may behave as an arbitrary number of any size (not limited to 32 or even 128 bits), and a 2^32 semi-indeterminate ones, where every read may yield a value which differs from the correct value... – supercat Apr 16 '15 at 21:19
  • ...by any arbitrary multiple of 2^32. Such a rule would give compiler writers the latitude to assume that `i+1 > i` for all integers (allowing loop optimizations), but would still yield correct behavior from many programs written for older systems especially if a platform specified that explicit typecasts would strip off any "excess" precision. – supercat Apr 16 '15 at 21:27
  • @supercat: I do not think so. For `INT_MAX` to have any meaning at all, it must be true that if `x` is an `int` with any value whatsoever, `INT_MAX >= x` must be true. That is what "maximum" means, by definition, and it does not appear to be obeyed by your hypothetical implementation. – Nemo Apr 17 '15 at 03:40
  • @Nemo: If a 64-bit machine has only 64-bit arithmetic instructions (adding 32-bit versions would waste a bit of opcode which could better be used for other purposes) but uses 32-bit `int` to avoid wacky behavior when multiplying values of type `uint32_t`, then `INT_MAX` would be the largest integer value whose behavior is *fully defined*. If an `int` stored in memory equals `INT_MAX`, adding 1 may naturally yield `INT_MIN`, but if an `int` is stored in a register, it would more naturally yield `INT_MAX+1` unless the compiler adds additional instructions to make it wrap. – supercat Apr 17 '15 at 15:20
  • @Nemo: It's also possible that a value might be stored in memory but cached in a register, in which case its value may sometimes appear to be `INT_MAX+1` and sometimes `INT_MIN`. I would suggest that the best compromise between speed and usefulness would be to say that it must behave as a value whose bottom 32 bits are correct, but which may have additional upper bits with values that may or may not be meaningful. Note that such a rule is extremely generous by today's hyper-modern standards, given that a hyper-modern compiler that knows that `y` equals `INT_MAX` could... – supercat Apr 17 '15 at 15:23
  • ...turn `if (should_launch_missiles) launch_nuclear_missiles(); y++;` into `launch_nuclear_missiles();` even if `should_launch_missiles()` is zero, since there is nothing worse than integer overflow, so it won't matter if other functions get called first. *I'M NOT KIDDING*. – supercat Apr 17 '15 at 15:26
  • PS--If I had my druthers, I would add as I mentioned a constraint that a typecast to `(int)` would direct a compiler to convert the bottom 32 bits of a value into an `int` assuming two's-complement semantics, such that `INT_MAX >= (int)x;` would be true for all x, even though `INT_MAX >= x;` would not necessarily be true for all `x` of type `int`. Don't-care values are often very good for performance; too bad today's hyper-modern philosophy forces programmers to care about things they don't care about, as the only way to guard against compilers arbitrarily rewriting things the programmers do. – supercat Apr 17 '15 at 15:34

1 Answer


If `is_modulo` is true for a signed type (e.g. `int`) that is unchanged by the usual arithmetic conversions, then for any arithmetic operation other than division by zero there is a single correct result in the (mathematical) integers, and that result maps modulo max() - min() + 1 to a single value in the range of the type. The implementation is therefore constrained to behave as if the actual result is the true result modulo the range of the type. So if an implementation wants to keep overflow arithmetic undefined, it must set `is_modulo` to false.
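
Concretely, for a 32-bit `int` the constraint would amount to something like this sketch (the helper is illustrative only, assuming a 32-bit `int` and a wider intermediate type; nothing like it appears in the standard):

#include <cstdint>

// What is_modulo == true would require of addition on a 32-bit int:
// compute the true mathematical sum, then bring it into range by a
// multiple of max() - min() + 1 = 2^32.
int modulo_add(int a, int b)
{
    std::int64_t true_sum = static_cast<std::int64_t>(a) + b;
    const std::int64_t range = std::int64_t(1) << 32;  // max() - min() + 1
    if (true_sum > INT32_MAX) true_sum -= range;       // wrap downward
    if (true_sum < INT32_MIN) true_sum += range;       // wrap upward
    return static_cast<int>(true_sum);                 // modulo_add(INT_MAX, 1) == INT_MIN
}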

This was discussed ad nauseam on the gcc mailing list and then under PR 22200, with the eventual conclusion that the value of `is_modulo` should be false for signed types; the change was made to libstdc++ in April 2012.
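
A one-liner makes it easy to check which behaviour a given libstdc++ ships; versions containing the fix should print 0:

#include <limits>
#include <iostream>

int main()
{
    // libstdc++ before the PR 22200 fix prints 1 here;
    // releases containing the fix print 0.
    std::cout << std::numeric_limits<int>::is_modulo << "\n";
}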

Note that in C++03 the language was significantly different:

18.2.1.2 numeric_limits members [lib.numeric.limits.members]

56 - [...] A type is modulo if it is possible to add two positive numbers and have a result that wraps around to a third number that is less.

Given that with undefined behaviour anything is possible, it is arguable that the previous behaviour of libstdc++ (having is_modulo as true) was correct and consistent with the behaviour of g++; the earlier discussions on the linked PR should be read with this in mind.
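
To make the C++03 wording concrete, here is a sketch of the behaviour it describes (well defined under g++'s -fwrapv; undefined behaviour otherwise):

#include <limits>
#include <iostream>

int main()
{
    int a = std::numeric_limits<int>::max();
    int b = 1;                       // a and b are both positive

    // With wrapping arithmetic (e.g. -fwrapv) the sum is INT_MIN,
    // i.e. "a third number that is less" -- the C++03 criterion.
    int sum = a + b;
    std::cout << (sum < a) << "\n";  // prints 1 under -fwrapv
}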

– ecatmur
  • Yeah, I should have mentioned the C++03 spec as well. Thank you for this; PR 22200 is exactly the sort of thing I was looking for. (Also good to know it is fixed. I did first test on the latest GCC I could find, which is 4.7.2 on Ubuntu.) – Nemo Nov 07 '12 at 17:43
  • To clarify, we are saying that the result `1 -2147483648 1` is definitely bugged in C++11 (but not in C++03), and that `is_modulo == true` means that the behaviour on overflow is NOT undefined ? – M.M Jun 29 '14 at 05:54
  • @MattMcNabb: Yes, the result is arguably bugged in C++03 (although that is debatable) and definitely bugged in C++11, according to the GCC developers. Newer GCCs set `is_modulo` to `false` for this reason. In their opinion (and mine), `is_modulo == true` implies that overflow is defined and wraps around a la `-fwrapv`. I do not know what other compilers do. – Nemo Jun 29 '14 at 21:56