Why is this an undefined behavior?

Question

My answer to this question was this function:

inline bool divisible15(unsigned int x) 
{
    //286331153 = (2^32 - 1) / 15
    //4008636143 = (2^32) - 286331153
    return x * 4008636143 <= 286331153;
}

It perfectly worked on my machine with VS2008 compiler, however here it doesn't work at all.

Does anyone has an idea, why it I get different results on different compilers? unsigned overflow isn't undefined behavior.

Important note: after some test it was confirmed it is faster than taking the remainder of the division by 15. (However not on all compilers)

It doesn't show as undefined behavior to me? It probably integer overflows though. — PherricOxide, Sep 09 '13 at 20:58
Nearly the maximum value of an `int` multiplied by an `int` clearly overflows an `int`. It seems that whatever compiler that is doesn't attempt to prevent overflows by using `long` for that step of the calculation. — millimoose, Sep 09 '13 at 20:59
I'd say it's undefined behaviour because the result of multiplying an `int` with an `int` should be again an `int`. Having the calculation be done using `long`s doesn't have much point if the result would still overflow the type of the expression itself. It seems that VS is more optimistic here and tries to preserve information just in case. — millimoose, Sep 09 '13 at 21:01
@millimoose Well... these are *unsigned* ints. The overflow behavior is specified. — Mysticial, Sep 09 '13 at 21:04
@millimoose: I jumped to that conclusion too, but overflow doesn't occur for unsigned ints... — Oliver Charlesworth, Sep 09 '13 at 21:04
"so it should be undefined behaviour.", or unspecified or implementation-defined behavior. — avakar, Sep 09 '13 at 21:04
@LoganMurphy now I see that, but that is wrong with unsigned int overflow in that compiler? — ST3, Sep 09 '13 at 21:04
Change it to `return x * (unsigned int)(4008636143) <= (unsigned int)(286331153);` — lapk, Sep 09 '13 at 21:14
For readability, I think this is better written as `return x * -(UINT_MAX / 15) <= (UINT_MAX / 15);`, which nicely avoids the whole problem. I got confused trying to understand it, realised it could be rewritten like this, and then understood why it worked :) (Edit: this, like your code, obviously assumes a specific value for `UINT_MAX`.) — , Sep 09 '13 at 22:38
(I can post a demonstration of why it actually checks if **`x`** is **`divisible by 15`**, if someone is interested) — Mmmh mmh, Sep 10 '13 at 06:52

Adam Rosenfield · Accepted Answer · 2013-09-09T22:21:32.280

It's not Undefined Behavior, it's just a breaking change in the C language standard between C89 and C99.

In C89, integer constants like 4008636143 that don't fit in an int or long int but do fit in an unsigned int are unsigned, but in C99, they are either long int or long long int (depending on whichever one is the smallest one that can hold the value). As a result, the expressions all get evaluated using 64 bits, which results in the incorrect answer.

Visual Studio is a C89 compiler and so results in the C89 behavior, but that Ideone link is compiling in C99 mode.

This becomes more evident if you compile with GCC using -Wall:

test.c: In function ‘divisible15’:
test.c:8:3: warning: this decimal constant is unsigned only in ISO C90

From C89 §3.1.3.2:

The type of an integer constant is the first of the corresponding list in which its value can be represented. Unsuffixed decimal: int, long int, unsigned long int; unsuffixed octal or hexadecimal: int, unsigned int, long int, unsigned long int; [...]

C99 §6.4.4.1/5-6:

5) The type of an integer constant is the first of the corresponding list in which its value can be represented.
Suffix | Decimal Constant | Octal or Hexadecimal Constant
-------+------------------+------------------------------
none   | int              | int
       | long int         | unsigned int
       | long long int    | long int
       |                  | unsigned long int
       |                  | long long int
       |                  | unsigned long long int
-------+------------------+------------------------------
[...]
6) If an integer constant cannot be represented by any type in its list, it may have an extended integer type, if the extended integer type can represent its value. If all of the types in the list for the constant are signed, the extended integer type shall be signed. [...]

For completeness, C++03 actually does have Undefined Behavior for when the integer constant is too big to fit in a long int. From C++03 §2.13.1/2:

The type of an integer literal depends on its form, value, and suffix. If it is decimal and has no suffix, it has the first of these types in which its value can be represented: int, long int; if the value cannot be represented as a long int, the behavior is undefined. If it is octal or hexadecimal and has no suffix, it has the first of these types in which its value can be represented: int, unsigned int, long int, unsigned long int. [...]

The C++11 behavior is identical to C99, see C++11 §2.14.2/3.

To ensure that the code behaves consistently when compiled as either C89, C99, C++03, and C++11, the simple fix is to make the constant 4008636143 unsigned by suffixing it with u as 4008636143u.

+1; as an addendum, in C++, decimal constants without a suffix were always signed (even in C++03). — avakar, Sep 09 '13 at 21:20

Mats Petersson · Answer 2 · 2013-09-09T21:21:56.107

Since you are using int constants, the compiler can "use" the overflow is undefined to shortcut the code. If you modify with U as below, it "works".

inline bool divisible15(unsigned int x) 
{
    //286331153 = (2^32 - 1) / 15
    //4008636143 = (2^32) - 286331153
    return x * 4008636143u <= 286331153u;
}

testing with:

#include <iostream>


inline bool divisible15a(unsigned int x) 
{
    //286331153 = (2^32 - 1) / 15
    //4008636143 = (2^32) - 286331153
//    return x * 4008636143 <= 286331153;
    return x * 4008636143u <= 286331153;
}

inline bool divisible15b(unsigned int x) 
{
    //286331153 = (2^32 - 1) / 15
    //4008636143 = (2^32) - 286331153
//    return x * 4008636143 <= 286331153;
    return x * 4008636143 <= 286331153;
}


int main()
{
    for(unsigned int i = 0; i < 100; i++)
    {
    if (divisible15a(i))
    {
        std::cout << "a:" << i << std::endl;
    }
    if (divisible15b(i))
    {
        std::cout << "b:" << i << std::endl;
    }
    }
}

Output:

a:0
b:0
a:15
a:30
a:45
a:60
a:75
a:90

Code:

_Z12divisible15aj:
.LFB1192:
    pushq   %rbp
    movq    %rsp, %rbp
    movl    %edi, -4(%rbp)
    movl    -4(%rbp), %eax
    imull   $-286331153, %eax, %eax
    cmpl    $286331153, %eax
    setbe   %al
    popq    %rbp
    ret

_Z12divisible15bj:
.LFB1193:
    pushq   %rbp
    movq    %rsp, %rbp
    movl    %edi, -4(%rbp)
    movl    -4(%rbp), %edx
    movl    $4008636143, %eax
    imulq   %rdx, %rax
    cmpq    $286331153, %rax
    setle   %al
    popq    %rbp
    ret

So, yes, I agree with Carl/Adam's analysis that it doesn't fit in a 32-bit int, so therefore promoted to long or long long.

Default integer promotions already make that behave correctly, right? And the one constant that's too large gets 'upsized' on its own, IIRC? — Carl Norum, Sep 09 '13 at 21:07
It would seem at least g++ 4.6.3 behaves differently between using 'u' an not having it. May of course be a bug. — Mats Petersson, Sep 09 '13 at 21:09
@CarlNorum: Aha, is the issue that without the `u`, the literal gets interpreted as a `signed long`, and thus `x` gets promoted to `signed long` too? — Oliver Charlesworth, Sep 09 '13 at 21:12

Why is this an undefined behavior?

2 Answers2

Linked