
Consider the following C program:

#include <stdio.h>

int main(int argc, char *argv[]) {
        int a = -5000000000;
        int b = -3000000000;
        int c = -1000000000;
        int d =  1000000000;
        int e =  3000000000;
        int f =  5000000000;

        printf("a = %d\n", a);
        printf("b = %d\n", b);
        printf("c = %d\n", c);
        printf("d = %d\n", d);
        printf("e = %d\n", e);
        printf("f = %d\n", f);

        return 0;
}

Consider also this environment output:

pvz@DESKTOP-OTTHA70:~$ uname -a
Linux DESKTOP-OTTHA70 4.19.104-microsoft-standard #1 SMP Wed Feb 19 06:37:35 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
pvz@DESKTOP-OTTHA70:~$ gcc --version
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

And this compiler output:

pvz@DESKTOP-OTTHA70:~$ gcc -Wall -o overflow overflow.c
overflow.c: In function ‘main’:
overflow.c:4:10: warning: overflow in implicit constant conversion [-Woverflow]
  int a = -5000000000;
          ^
overflow.c:5:10: warning: overflow in implicit constant conversion [-Woverflow]
  int b = -3000000000;
          ^
overflow.c:9:11: warning: overflow in implicit constant conversion [-Woverflow]
  int f =  5000000000;
           ^~~~~~~~~~

And this program output:

pvz@DESKTOP-OTTHA70:~$ ./overflow
a = -705032704
b = 1294967296
c = -1000000000
d = 1000000000
e = -1294967296
f = 705032704

On this machine, an int is 32 bits wide, giving it a range of -2^31 (-2147483648) through 2^31 - 1 (2147483647) inclusive.

As expected, trying to store a value that doesn't fit in that range produces a warning, and the most significant bits of the value are truncated, leading to unexpected values in the integer variables a, b, e and f.
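
For reference, the printed values can be reproduced by reducing each constant modulo 2^32 and reinterpreting the low 32 bits as a signed value. Below is a rough sketch of that arithmetic, separate from the program above; note that converting an out-of-range value to a signed integer type is implementation-defined, although gcc documents it as modulo-2^N wraparound.

#include <stdio.h>
#include <stdint.h>
#include <limits.h>

int main(void) {
        /* On this machine int is 32 bits, so these are the bounds quoted above. */
        printf("int range: %d .. %d\n", INT_MIN, INT_MAX);

        long long vals[] = { -5000000000LL, -3000000000LL, -1000000000LL,
                              1000000000LL,  3000000000LL,  5000000000LL };

        for (size_t i = 0; i < sizeof vals / sizeof vals[0]; i++) {
                /* Keep only the low 32 bits (well-defined for unsigned types),
                   then reinterpret them as a signed 32-bit value. */
                uint32_t low = (uint32_t)vals[i];
                int32_t wrapped = (int32_t)low;
                printf("%lld -> %d\n", vals[i], (int)wrapped);
        }
        return 0;
}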

However, the compiler only warns us for a, b and f, and not for e.

The question therefore is, why doesn't gcc warn about the following line?

int e = 3000000000;

As far as I can tell, values from 2^31 to 2^32 - 1, surprisingly enough, don't actually trigger a warning if you try to cram them into an int. I'm also doubtful this is a compiler bug, since something like that would likely have been found long ago. That means it's probably intentional behavior. But why?

  • With `-Wall -Wextra -pedantic-errors`, gcc warns for all: https://godbolt.org/z/n-FLKk – P.P Jun 28 '20 at 22:05
  • Turn warnings on. – Tony Tannous Jun 28 '20 at 22:08
  • Oops, I did mean to add -Wall as a compile time flag, but it must have disappeared while I was editing the post. That's been fixed in an edit of the post. Adding -Wpedantic does however seem to add the warnings I'm missing. – Per von Zweigbergk Jun 28 '20 at 22:11
  • My guess is that it only warns for *literals* if they couldn't be used for an int *or unsigned int*, without actually checking where it's assigned. – Karl Knechtel Jun 28 '20 at 22:15
  • Can confirm: for gcc v10.1.0, -Wall makes gcc warn for variable f, but only with -pedantic-errors does it also warn for variable e. – Teharez Jun 28 '20 at 22:16
  • In code designed for two's complement platforms, it's not too uncommon to intentionally write a positive value that wraps around and becomes negative, e.g. `int mask = 0xffff0000`. The gcc developers probably figured that warning about such cases would give too many false positives. – Nate Eldredge Jun 28 '20 at 22:45
  • Ok, checked the source code. Man they have a weird indentation convention. [This](https://github.com/gcc-mirror/gcc/blob/releases/gcc-10.1.0/gcc/c-family/c-warn.c#L1356-L1456) is the function which prints the overflow warnings. I won't even try to explain it. Too many macros and too little code documentation for my liking^^ – Teharez Jun 28 '20 at 23:24
  • @Teharez That looks like it's the GNU formatting convention, as used in some GNU projects, such as Emacs. https://www.gnu.org/prep/standards/html_node/Formatting.html - I agree though that it would have been nice if the code commented why it was doing what it was doing. Still, that does seem to confirm it's intentional, but the "why" still eludes me. – Per von Zweigbergk Jun 29 '20 at 06:08
  • @NateEldredge That does seem plausible, although you could argue that if you're storing a bitmask you probably should be storing it in an unsigned type. – Per von Zweigbergk Jun 29 '20 at 06:10
  • @Teharez I've made some progress doing some digging, and it seems that none other than Richard Stallman made this commit back in 1993, with no reasoning behind it, as seems to be common for commit messages of the time. Not sure if that gets us any closer. https://github.com/gcc-mirror/gcc/commit/22ba338b8f7f2b198f14978125571c2d8a7211b6 – Per von Zweigbergk Jun 29 '20 at 06:30
  • @PervonZweigbergk: Certainly you *should*, but lots of older code exists that uses signed types. The programmers knew that their implementations would do the "right" thing, and so would have seen no reason to use one over the other, and `int` is less typing than `unsigned`... Deciding which warnings go into `-Wall` is always a judgment call, so I think that's just a compromise they chose to make. As you noted, warnings for your case are available if you choose to turn them on. – Nate Eldredge Jun 29 '20 at 14:20
  • @PervonZweigbergk: The commit remark mentioning `char` is a good hint, though, since this situation is likely even more common with `char` than with `int`. For one thing, there's the issue that `char` is signed in some implementations and unsigned in others. But most programmers tend to think of character values above `0x7f` as positive rather than negative, so `char c = 0xff;` would indeed be pretty common. – Nate Eldredge Jun 29 '20 at 14:26
  • @NateEldredge Oh, that does make sense now that you mention that. If the author's intent was to check specifically for sticking >0x7F values into chars (which can be signed or unsigned depending on the implementation) and not warn for those cases, the "fix" turned out to be more general. While this likely would have been considered a bug in 1993 (or at least a poor judgement call imo), by now it's probably a feature. :-) (Another option would have been to not warn for hex values, but to do warn for decimal values.) I'll write up an answer based on this discussion. – Per von Zweigbergk Jun 29 '20 at 14:38

1 Answer


From some investigation done in the question comments, we found that this behaviour dates back at least to October 30, 1993, to this commit by Richard Stallman (https://github.com/gcc-mirror/gcc/commit/22ba338b8f7f2b198f14978125571c2d8a7211b6), with this commit message:

"(convert_and_check): Don't warn converting 0xff to a signed char, etc., unless pedantic."

From-SVN: r5944

It would seem the author's intent was to suppress warnings for the case where large hexadecimal values are assigned to chars on machines where char is signed. (According to the C standard, char can be signed or unsigned depending on the implementation.)

But in doing so, the change also suppressed the warning for the more general case of any signed integer type, and for decimal constants as well as hexadecimal and octal ones.
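
To make this concrete, here is an illustrative sketch (not taken from the gcc sources) collecting the initializers discussed in the comments. Whether each line warns under plain -Wall depends on the gcc version; on the gcc 7.5 shown in the question, the int e case is silent with -Wall and only flagged once -Wpedantic is added.

int main(void) {
        signed char c = 0xff;      /* 255 into a signed 8-bit type: the case
                                      the 1993 commit message names           */
        int mask = 0xffff0000;     /* hex constant that wraps to a negative
                                      int: the intentional-wraparound idiom
                                      mentioned in the comments               */
        int e = 3000000000;        /* decimal constant in [2^31, 2^32 - 1]:
                                      the case from the question              */

        (void)c; (void)mask; (void)e;   /* avoid -Wunused-variable noise */
        return 0;
}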

In my personal opinion, this is probably a misfeature (I would only suppress the warning for hexadecimal and octal representations), but that's a judgement call that the gcc maintainers are far better qualified to make than I am.