How to enable compiler warning when comparing char and unsigned char?

Question

take for example the following C code :

int main(int argc, char *argv[])
{
    signed char i;
    unsigned char count = 0xFF;

    for (i=0; i<count;i++)
    {
        printf("%x\n", i);
    }
    return 0;
}

This code runs in an infinite loop, even if I compile it as follows :

# gcc -Wall -Wpedantic -Wconversion -Wsign-compare -Wtype-limits -Wsign-conversion test.c -o test

Does someone know for a compiler flag that should warn about those kind of issues ?

Just to be clear, I'm not asking 'why is get an infinite loop', but to know if there's a way to prevent it using a compiler or static analysis ?

Aren't the -W options just warnings? Your code runs forever because a signed char can't go above 127 before it wraps around (or commits some undefined behavior) therefore it can't exceed your loop terminating condition. — nicomp, Oct 31 '16 at 11:48
Seems a lot like a homework question to me. Needless type mismatch in a program so small? Use of `signed char` at all? — John Bollinger, Oct 31 '16 at 11:48
the signed char will always be less than 255. therefore it won't exit the loop. — Taha Paksu, Oct 31 '16 at 12:16
One note in addition to existing answer: it's true you cannot _detect_ a signed integer overflow but with gcc you can use `-fwrapv` to, at least, **avoid the UB** (but your code is still broken unless compiler use it for some optimization which can be detected by `-wstrict-overflow`. In short: you can detect it if you're aware of it (many assign-with-overflow-check implementations out there) but AFAIK if you do it by mistake then you're just bitten — Adriano Repetti, Oct 31 '16 at 13:58
@AdrianoRepetti: Unless `signed char` has the same range as `int`, incrementing past `CHAR_MAX` would apply an implementation-defined conversion method. — supercat, Oct 31 '16 at 20:11
@supercat isn't what -fwrapv should avoid (imp def -> wrap)? Unless hmmmmm it's still UB for C standard (and then gcc frontend) but it makes it allowed only for gcc backend? I usually consider gcc as a whole (fe+be), is it wrong? — Adriano Repetti, Nov 02 '16 at 10:07
@AdrianoRepetti: If `CHAR_MAX` < `INT_MAX`, promotion to `int` followed by addition would yield a defined result less than `INT_MAX`; conversion of values between `CHAR_MAX+1` and `INT_MAX` to `signed char` is required to either yield an Implementation-Defined value or raise an Implementation-Defined signal. — supercat, Nov 02 '16 at 14:14
Possible duplicate of [Why does integer overflow on x86 with GCC cause an infinite loop?](http://stackoverflow.com/questions/7682477/why-does-integer-overflow-on-x86-with-gcc-cause-an-infinite-loop) — phuclv, Nov 03 '16 at 05:29
@LưuVĩnhPhúc I'd argue it's not a duplicate because it has a focus on compiler not giving a warning rather than reasons behind the code behaviour ([and totally not because I have an answer here](http://tvtropes.org/pmwiki/pmwiki.php/Main/SuspiciouslySpecificDenial)). — ivan_pozdeev, Nov 03 '16 at 16:49

2501 · Answer 1 · 2016-11-01T11:39:09.917

12

The flag -Wconversion won't catch the error, because both operands in the comparison: i<count, get promoted to int using integer promotions.

There is no flag in gcc that would catch this.

That aside, the behavior of your code is undefined, because variable i overflows, when it has the value 0x7F and is incremented: i++.

If you want to iterate up to some value, make sure the type you're using can represent that value.

edited Nov 01 '16 at 11:39

answered Oct 31 '16 at 11:49

2501

25,460
4
47
87

13

I know that variable i overflows, this is exactly what I wanted to achieve. I'm not asking 'why' it runs in an endless loop, and how to fix it. I would like to know if there's a way that the compiler will catch this issue ?? – sagi Oct 31 '16 at 11:56
1

Thanks @2501, I went over gcc man page for all the warnings flag and could not find one. – sagi Oct 31 '16 at 11:59
1

@sagi You could try a third-party static analyzer. – 2501 Oct 31 '16 at 12:01
1

clang-analyzer with the above gcc command does not find anything – sagi Oct 31 '16 at 12:08
1

@usr: While true in itself, the fact that compiler warnings DO point out undefined behavior here and there and in this case a warning would be most helpful. – Matthieu M. Oct 31 '16 at 15:02
@MatthieuM. I don't deny a warning would be useful. But it's not always possible for a compiler. Such is the realm of undefined behaviours that it requires the programmer to have that knowledge. – P.P Oct 31 '16 at 15:08
1

@usr: Oh I know, I've seen gcc be (too) clever with `-Wtype-limits` though so I am surprised it didn't fire here. – Matthieu M. Oct 31 '16 at 16:08
@usr: Unless CHAR_MAX==INT_MAX, there is no Undefined Behavior. If `CHAR_MAX` and `i` are both 127, then `i++` will start by converting `i` to the `int` value 127, adding 1 (yielding 128), and the converting that value to `signed char`. The Standard defines that conversion as either yielding an implementation-defined value or raising an implementation-defined signal. It would not allow UB unless `CHAR_MAX` was the same as `INT_MAX` [which is required to be at least 32767]. – supercat Oct 31 '16 at 20:15
1

@supercat Postfit increment doesn't do integer promotions. The operand isn't not promoted to int and thus overflows. – 2501 Oct 31 '16 at 22:56
@2501: What in the Standard would justify that the arithmetic for the increment is not performed in the same manner as for operand+=1, which would in turn do the arithmetic as operand=operand+1, which clearly would promote? – supercat Nov 01 '16 at 13:29
@supercat The rules. Postfix ++ and += are different operators. At no point does the standard say that postfix ++ performs promotions, unlike other operands where it is explicitly mentioned that it does. Informative 6.3.1.1 §2 explains that integer promotions only apply to some operands: *The integer promotions are applied only: as part of the usual arithmetic conversions, to certain argument expressions, to the operands of the unary +, -, and ~ operators, and to both operands of the shift operators, as specified by their respective subclauses.* Unary postfix ++ is missing from this list. – 2501 Nov 01 '16 at 17:07
@2501: Would you agree that the Standard makes it unclear whether the authors had even *considered the question* of whether employing `++` on a short signed type holding its maximum value should behave as though it promoted the operand and then converted the result? – supercat Nov 02 '16 at 15:47
@supercat I think any such question is just speculation. – 2501 Nov 02 '16 at 15:51
@2501: Let me rephrase: is there any affirmative evidence to suggest that the authors of the Standard had considered the issue and had any particular feelings on the matter? – supercat Nov 02 '16 at 15:58
@supercat [Rationale for International Standard—Programming Languages—C](http://www.open-std.org/jtc1/sc22/wg14/www/C99RationaleV5.10.pdf), section 6.3.1.1, discusses exactly the case of comparing signed and unsigned and their reasons behind choosing the "value preserving" approach. In the same section and elsewhere, they mention that silent overflow is the standard semantic for implementations. – ivan_pozdeev Nov 03 '16 at 04:10
@ivan_pozdeev: I've read that document. It would suggest that the authors expected that on commonplace implementations, something like `unsigned mul(unsigned short x, unsigned short y) { return x*y; }` should be safe. Since the Standard doesn't require that such a thing be safe even on silent-wraparound hardware, however, gcc will sometimes silently generate code for that function which has unwanted side-effects if the product exceeds 2147483647. – supercat Nov 03 '16 at 14:16

score 7 · Accepted Answer · answered Oct 31 '16 at 13:45

i is a signed char, incrementing it beyond SCHAR_MAX has an implementation defined effect. The computation i + 1 is performed after promotion of i to int and it does not overflow (unless sizeof(int) == 1 and SCHAR_MAX == INT_MAX). Yet this value is beyond the range of i and since i has a signed type, either the result is implementation-defined or an implementation-defined signal is raised. (C11 6.3.1.3p3 Signed and unsigned integers).

By definition, the compiler is the implementation, so the behavior is defined for each specific system and on x86 architectures where storing the value results in masking the low-order bits, gcc should be aware that the loop test is definitely constant, making it an infinite loop.

Note that clang does not detect the constant test either, but clang 3.9.0 will if count is declared as const, and it does issue a warning if i < count is replaced with i < 0xff, unlike gcc.

Neither compiler complains about the signed / unsigned comparison issue because both operands are actually promoted to int before the comparison.

You found a meaningful issue here, especially significant because some coding conventions insist on using the smallest possible type for all variables, resulting in such oddities as int8_t or uint8_t loop index variables. Such choices are indeed error-prone and I have not yet found a way to get the compiler to warn the programmer about silly errors such as the one you posted.

You seem to be talking about addition i+1, yet OP is using a postfix ++ operator which does not do integer promotions. An overflow and ub will happen. — 2501, Oct 31 '16 at 13:51
@2501: I looked for relevant language in the C11 Standard to support this and could not find anything conclusive. Can you elaborate on why you assume `i++` should behave differently from `i += 1` and/or `i = i + 1`? — chqrlie, Oct 31 '16 at 13:54
You're assuming that postfix ++ behaves the same as +=. It doesn't. Prefix ++ does because it is explicity mentioned. — 2501, Oct 31 '16 at 13:58
@2501: I asked a separate question for this. The answer deserves precise references from the Standard. — chqrlie, Oct 31 '16 at 14:07

skrtbhtngr · Answer 3 · 2016-10-31T16:12:54.450

3

Since i is a signed char, its value ranges from -128 to 127 typically. Whereas count being an unsigned char is assigned the value 255 (0xFF).

Inside the loop, when i value gets to 127 and is incremented again, it never reaches 128 and gets to -128 and then gets again to 127 and then again rolls over to -128, and so on. The value of i will be forever less than the value of count inside the loop and so the loop can never terminate!

This is happening because of the overflow of the data type and care must be taken to critically examine the expressions where automatic type coercions may take place as they will not issue any warning.

EDIT: From the GCC documentation,

-Wconversion: Warn for implicit conversions that may alter a value.

Here, we are getting inconsistency due to comparison and not assignment.

edited Oct 31 '16 at 16:12

answered Oct 31 '16 at 11:56

skrtbhtngr

2,223
23
29

7

I know that variable i overflows, this is exactly what I wanted to achieve. I'm not asking 'why' it runs in an endless loop, and how to fix it. I would like to know if there's a way that the compiler will catch this issue ?? – sagi Oct 31 '16 at 11:57
Worth to mention: as all values of the signed char can't be represented in the unsigned char (and vice versa), both operands of the comparison operator are converted to the next int category that can hold them both, being int. The sign preservation of that conversion is guaranteed, so that one end up comparing -128..+127 with 255 which will always be true. – Christophe Oct 31 '16 at 12:01
2

@sagi: Since the type conversion in the condition part of for loop promotes a narrower type (signed char) into a wider type (unsigned char), this conversion will not be reported by the compiler. – skrtbhtngr Oct 31 '16 at 12:06
1

Actually, both of them will be converted to `int` in comparison, but that is trivial in this context. – skrtbhtngr Oct 31 '16 at 13:12
If this answer seems incorrect or inappropriate to anyone, please post a comment along with the downvote! – skrtbhtngr Oct 31 '16 at 14:09
even when i doesn't overflow, the expression `i – phuclv Nov 04 '16 at 17:16

ivan_pozdeev · Answer 4 · 2016-11-03T05:34:13.097

There's no warning to catch this because there's nothing questionable in this code from compiler's standpoint.

As pointed out in other answers, the comparison line is effectively treated as
```
for (i=0; (int)i<(int)count; i++)
```
- As described in Rationale for International Standard—Programming Languages—C, section 6.3.1.1, they chose the "value preserving" approach for implicit type conversions when comparing values because it has less potential for unexpected comparison results.
  - Note that your program runs "incorrectly" specifically because the result of arithmetic comparison is the expected one.
- If these were ints, that "value preserving" semantic would be impossible to achieve - because an int is as a rule a hardware type (or at least, C is designed with this in mind), and there are few (if any) architectures that allow one operand to be signed while the other unsigned. That's the core reason for a compiler issuing a "comparison between signed and unsigned" warning if you replace chars with ints here.
For the increment operation, i++,
- the same above-linked rationale mentions in a few places that "silent overflow" is by far the most common and expected semantic. It even stated to have caused "a lack of sensitivity in the C community to the differences between signed and unsigned arithmetic" in the first place! So, nothing suspicious here, either.

Ultimately, what caused this confusion is your obliviousness to the int promotion semantic. Which caused the behaviour of the code to be unexpected for you (don't feel bad, I didn't know this before reading the other answers, either!). Yet, it turns out, it's quite expected for the standard - moreover, dictated by it. "Ignorance of the law is no excuse", as they say.

How to enable compiler warning when comparing char and unsigned char?

4 Answers4