sscanf(3) says (emphasis mine):

i
Matches an optionally signed integer; the next pointer must be a pointer to int. The integer is read in base 16 if it begins with 0x or 0X, in base 8 if it begins with 0, and in base 10 otherwise. Only characters that correspond to the base are used.

However, GCC does not complain about using an unsigned int with %i unless -pedantic is given. This differs from the behavior I'm used to, where GCC warns about any mismatch between an argument type and the format string.

Why does this combination behave differently?

Given that this warning is not included in the common -Wall set of warnings, is it acceptable to pass unsigned int to %i?


Example program:

#include <stdio.h>

int main(void)
{
    int i;
    unsigned int u;
    float f;

    scanf("%i", &i);
    scanf("%i", &u);
    scanf("%i", &f);

    return 0;
}

Without -pedantic, GCC complains about %i and float *, but not unsigned int *:

$ gcc -Wall -Wextra scanf_i.c 
scanf_i.c: In function ‘main’:
scanf_i.c:11:13: warning: format ‘%i’ expects argument of type ‘int *’, but argument 2 has type ‘float *’ [-Wformat=]
     scanf("%i", &f);
            ~^   ~~
            %e

With -pedantic, GCC complains about both:

$ gcc -Wall -Wextra -pedantic scanf_i.c 
scanf_i.c: In function ‘main’:
scanf_i.c:10:13: warning: format ‘%i’ expects argument of type ‘int *’, but argument 2 has type ‘unsigned int *’ [-Wformat=]
     scanf("%i", &u);
            ~^   ~~
            %i
scanf_i.c:11:13: warning: format ‘%i’ expects argument of type ‘int *’, but argument 2 has type ‘float *’ [-Wformat=]
     scanf("%i", &f);
            ~^   ~~
            %e

GCC version:

$ gcc --version
gcc (Debian 8.3.0-6) 8.3.0
Jonathon Reinhart
  • I almost always use `-Wpedantic` to avoid unexpected compiler extensions anyway. "_...is it acceptable to pass `unsigned int` to `%i`_?" -- rather than look at manpages, go straight to the source: ["_The corresponding argument shall be a pointer to signed integer._"](http://port70.net/~nsz/c/c11/n1570.html#7.21.6.2p12) Passing `unsigned int` to `%i` leads to undefined behavior and should be avoided. – ad absurdum Feb 29 '20 at 03:50
  • I was also trying to figure out if this was actually UB. Would it be more correct to `sscanf` into an `int` and then [assign that to an `unsigned int`](https://stackoverflow.com/q/2711522/119527)? – Jonathon Reinhart Feb 29 '20 at 04:11
  • Sure, that would be fine. In some circumstances it might also be convenient to take input as a string and use [`strtoul` with a base argument of 0](http://port70.net/~nsz/c/c11/n1570.html#7.22.1.4p5). – ad absurdum Feb 29 '20 at 04:23
  • See C11 draft standard n1570, *6.2.5 Types 9 The range of nonnegative values of a signed integer type is a subrange of the corresponding unsigned integer type, and the representation of the same value in each type is the same. 41)*, where footnote: *41) The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions.* while the footnote is non-normative, the consequence of the normative rule is that the warning is only sound for numbers that are not representable in either type. – EOF Feb 29 '20 at 14:16
  • I should have asked the question "can I pass an `unsigned int` for `%i`?" – Jonathon Reinhart Feb 29 '20 at 14:55
  • @EOF -- "_the warning is only sound for numbers that are not representable in either type_": a compiler diagnostic is only required for constraint violations. There is no constraint violation in passing `unsigned int` to `%i`, but it is still UB as: ["_If a ''shall'' or ''shall not'' requirement that appears outside of a constraint or runtime-constraint is violated, the behavior is undefined._"](http://port70.net/~nsz/c/c11/n1570.html#4p2) – ad absurdum Feb 29 '20 at 14:59
  • @exnihilo I'm not talking about constraint violations, neither compile-time nor runtime. I'm talking about the fact that passing an `unsigned *` where an `int *` is expected is explicitly **not undefined in some cases**, namely when the value referenced by the pointer, written as one of the types and read as the other, is representable in both. As the part of the standard I quoted describes. – EOF Feb 29 '20 at 15:08
  • @EOF -- it is explicitly undefined in the case of OP question. – ad absurdum Feb 29 '20 at 15:09
  • @exnihilo Iff the input corresponding to the `"%i"` is not representable by both `int` and `unsigned` (which is user-controlled, and thus *should* always be checked). Otherwise, no UB here, and gcc may not want to warn for this by default because false positive warnings decrease trust in tools. – EOF Feb 29 '20 at 15:17
  • @EOF -- "_Otherwise, no UB here_" -- that is incorrect. The Standard says explicitly, as I have already quoted: ["_The corresponding argument **shall** be a pointer to signed integer_"](http://port70.net/~nsz/c/c11/n1570.html#7.21.6.2p12) for `%i`, and ["_If a **''shall''** or ''shall not'' requirement that appears outside of a constraint or runtime-constraint is violated, the behavior is undefined._"](http://port70.net/~nsz/c/c11/n1570.html#4p2) – ad absurdum Feb 29 '20 at 15:27
  • @EOF -- You can't be much more explicit than this. Representability is not the issue here. – ad absurdum Feb 29 '20 at 15:27
  • @exnihilo Well, I suppose this depends on whether the specific requirements of the I/O library do (or even *can*) override the fundamental rules of the language, as per *6.5 Expressions 7 An object shall have its stored value accessed only by an lvalue expression that has one of the following types: 88) — a type compatible with the effective type of the object, — a qualified version of a type compatible with the effective type of the object, [...]* and the section I previously quoted, which *explicitly* describes this as **interchangeable**. – EOF Feb 29 '20 at 15:35

1 Answer

-Wformat-signedness controls whether warnings are raised when the argument type differs only in signedness from what's expected.

From the gcc(1) man page:

-Wformat-signedness
If -Wformat is specified, also warn if the format string requires an unsigned argument and the argument is signed and vice versa.

The man page does not explicitly state that this flag is enabled by -pedantic (-Wpedantic), but it does say:

However, if -Wpedantic is used with -Wformat, warnings are given about format features not in the selected standard version.
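
One way to avoid the signedness mismatch altogether, as suggested in the comments on the question, is to scan into an int (the type %i requires) and convert afterwards. The following is only a minimal sketch: scan_uint is a hypothetical helper name, not a library function, and it assumes the input is already available as a string. It cannot accept values above INT_MAX, and the Standard leaves the behavior undefined when the scanned value does not fit in the target object, so it suits inputs whose range is already known.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper (not part of any library): read an integer with %i
 * into an int, which is the type the Standard requires for %i, and only
 * then convert it to unsigned int. Returns true on success. */
static bool scan_uint(const char *s, unsigned int *out)
{
    int tmp;

    /* sscanf returns the number of successful conversions (or EOF). */
    if (sscanf(s, "%i", &tmp) != 1)
        return false;

    /* Reject negative input instead of letting it wrap around. */
    if (tmp < 0)
        return false;

    *out = (unsigned int)tmp;
    return true;
}
```

Because the pointer passed for %i is a genuine int *, this should compile without a signedness diagnostic under both -pedantic and -Wformat-signedness.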

Jonathon Reinhart
  • Both the question and answer provide good information about the overly misused `%i` format specifier. – David C. Rankin Feb 29 '20 at 03:50
  • @David Why do you feel it is overly misused? In my case, I'd like to scan either a decimal or hex `unsigned int`. But no standard library function can easily do this. – Jonathon Reinhart Feb 29 '20 at 04:12
  • Because with a number (a large number) of new user questions it is used as a magical integer specifier for all types regardless of the signedness of the variable to be filled. Especially where the user is using it to read hex formatted input. – David C. Rankin Feb 29 '20 at 07:46
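
For the use case raised in these comments (reading an unsigned int that may be written in decimal or hex), the strtoul approach with a base argument of 0 mentioned under the question can be sketched roughly as follows. parse_uint is a hypothetical name chosen for illustration, and rejecting a leading minus sign and any trailing characters are assumptions about what the caller wants, not something the thread specifies.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper (not part of any library): parse decimal, octal
 * (leading 0) or hex (leading 0x/0X) text into an unsigned int.
 * Returns true on success and stores the value in *out. */
static bool parse_uint(const char *s, unsigned int *out)
{
    char *end;
    unsigned long v;

    /* strtoul accepts a minus sign and negates the result, so skip the
     * leading whitespace ourselves and reject '-' explicitly. */
    while (isspace((unsigned char)*s))
        s++;
    if (*s == '-')
        return false;

    errno = 0;
    v = strtoul(s, &end, 0);    /* base 0: the prefix selects the base, as with %i */

    if (end == s)               /* no digits were consumed */
        return false;
    if (*end != '\0')           /* trailing characters after the number */
        return false;
    if (errno == ERANGE || v > UINT_MAX)
        return false;           /* out of range for unsigned int */

    *out = (unsigned int)v;
    return true;
}
```

Unlike scanning with %i, this covers the full unsigned int range and reports out-of-range input instead of leaving it undefined, at the cost of reading the input as a string first.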