37

According to the C standard (6.5.2.2 paragraph 6):

If the expression that denotes the called function has a type that does not include a prototype, the integer promotions are performed on each argument, and arguments that have type float are promoted to double. These are called the default argument promotions. If the number of arguments does not equal the number of parameters, the behavior is undefined. If the function is defined with a type that includes a prototype, and either the prototype ends with an ellipsis (, ...) or the types of the arguments after promotion are not compatible with the types of the parameters, the behavior is undefined. If the function is defined with a type that does not include a prototype, and the types of the arguments after promotion are not compatible with those of the parameters after promotion, the behavior is undefined, except for the following cases:

  • one promoted type is a signed integer type, the other promoted type is the corresponding unsigned integer type, and the value is representable in both types;
  • both types are pointers to qualified or unqualified versions of a character type or void.

Thus, in general, there is nothing wrong with passing an int to a variadic function that expects an unsigned int (or vice versa) as long as the value passed fits in both types. However, the specification for printf reads (7.19.6.1 paragraph 9):

If a conversion specification is invalid, the behavior is undefined. If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.

No exception is made for signed/unsigned mismatch.

Does this mean that printf("%x", 1) invokes undefined behavior?
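For concreteness, here is a minimal sketch of the call in question, rewritten so that it satisfies even the strict reading of 7.19.6.1. The helper name `format_hex` is hypothetical and only serves to make the example self-contained; the point is the cast, which gives the argument exactly the type that `%x` names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* format_hex is a hypothetical helper, not part of any library.
   It writes the hexadecimal form of value into buf. */
int format_hex(char *buf, size_t size, int value)
{
    assert(buf != NULL && size > 0);
    /* Passing value directly would hand an int to %x, which is the case
       the question asks about. The cast makes the argument's type exactly
       unsigned int, so no signed/unsigned mismatch remains. */
    return snprintf(buf, size, "%x", (unsigned int)value);
}
```

Whether the cast is actually required is exactly what is being asked; the sketch only shows what the conservative spelling looks like.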

M.M
R.. GitHub STOP HELPING ICE
  • 2
    People interested in this question might (or might not) be interested in this related question: http://stackoverflow.com/questions/4586962/what-is-the-purpose-of-the-h-and-hh-modifiers-for-printf – Michael Burr Jan 12 '11 at 00:29
  • How can a function with arguments be "defined with a type that does not include a prototype"? Is that related to K&R-style stuff? – aschepler Jan 12 '11 at 00:59
  • 2
    And what about `printf("%d",(char)1);`. The description of `printf` doesn't say that it's the argument *after integer promotions* which must be the correct type, it says the argument itself must be. Should we conclude that it's an exception to that part of 6.5.2.2/6 as well? – Steve Jessop Jan 12 '11 at 01:03
  • 1
    Btw, I think your quote is insufficient to illustrate the problem, since it *is* undefined behavior to call `printf` if it hasn't been prototyped, and your quote concerns calls made where there is no prototype. The same argument promotions are applied to the arguments of varargs, though, according to 6.5.2.2/7, although that doesn't say anything about signed/unsigned compatibility. So maybe you're absolutely right, and signed/unsigned compatibility is only stated to apply to calls made with no prototype, not to varargs calls in general, let alone `printf` in particular. – Steve Jessop Jan 12 '11 at 01:15
  • @aschepler: it doesn't say "defined with no prototype". It says "the expression that denotes the called function" doesn't include a prototype. For example if you declare `void foo();`, then do `foo(1)`, "the expression that denotes the called function" is `foo`, and its type does not include a prototype. The definition of `foo` will introduce a prototype, perhaps in a different translation unit, but `foo` doesn't have one at the call point. – Steve Jessop Jan 12 '11 at 01:31
  • 4
    If I am right, I think it's a defect in the standard and probably should be fixed. This strict interpretation would render huge volumes of code incorrect and require equally huge volumes of ugly and meaningless casts... – R.. GitHub STOP HELPING ICE Jan 12 '11 at 01:33
  • @Steve: Yes, that's how I read the first quoted sentence too. But it's the last sentence before the bullet point that is particularly confusing me. – aschepler Jan 12 '11 at 03:03
  • @Steve Jessop: I think that it's the only reasonable interpretation(!) to assume that the conversions mandated in the specification for a function _call_ expression are applied before the types for the arguments to a function are determined. – CB Bailey Jan 12 '11 at 08:42
  • 1
    When the function call expression is a call of a function with a `,...` prototype, the only part of 6.5.2.2/6 that is relevant is the description of _default argument promotions_. The mismatched arguments exceptions are not applicable. (In any case, the function must be defined with a matching prototype and the `...` parameters don't have a known type.) The corresponding requirements for accessing vargs are in 7.15.1.1 which describes the use of the `va_arg` macro. Here you are allowed to use `va_arg` to access (e.g.) an `int` as an `unsigned int` providing the value is in the correct range. – CB Bailey Jan 12 '11 at 09:09
  • 1
    @Charles: OK, so varargs in general is alright, and R.'s objection amounts to saying that "printf" should perhaps specify that it reads its arguments using the varargs macros, or that the arguments must be such that they could be read using the varargs macros, rather than inaccurately restating the conditions under which it can go wrong. – Steve Jessop Jan 12 '11 at 11:33
  • The main *practical* thing that's unclear to me is if there are other consequences of the condition stated for `printf`, i.e. if it's intended to mean that the *value* passed must be a valid value for the specified type (prior to default promotions and possible signedness mismatch in the resulting type). Of course that goes more with the other question. – R.. GitHub STOP HELPING ICE Jan 12 '11 at 18:28
  • 1
    On the surface, `unsigned short x = 1; printf("%hu\n", x);` would also appear to be UB due to the unsigned / signed mismatch introduced by integer promotions, even though most people reading it would probably not expect it. – dbush Oct 28 '21 at 15:19
  • @dbush: I don't think so, because `%hu` expects an argument whose type after default promotions is the type of `unsigned short` with default promotions applied, which (assuming `int` wider than `short`) is not an unsigned type. – R.. GitHub STOP HELPING ICE Oct 29 '21 at 02:13
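The `va_arg` allowance that CB Bailey's comment cites from 7.15.1.1 can be sketched with an ordinary variadic function written in C. The function name `sum_unsigned` is hypothetical. It reads every variadic argument as `unsigned int`, and callers may pass non-negative `int` values, since 7.15.1.1 allows a signed/unsigned mismatch when the value is representable in both types.

```c
#include <assert.h>
#include <stdarg.h>

/* sum_unsigned is a hypothetical example function.
   It adds up count variadic arguments, reading each as unsigned int. */
unsigned int sum_unsigned(int count, ...)
{
    va_list ap;
    unsigned int total = 0;

    assert(count >= 0);
    va_start(ap, count);
    for (int i = 0; i < count; i++) {
        /* An argument of type int may be fetched as unsigned int here,
           provided its value is representable in both types. */
        total += va_arg(ap, unsigned int);
    }
    va_end(ap);
    return total;
}
```

A call such as `sum_unsigned(3, 1, 2u, 3)` mixes `int` and `unsigned int` arguments and is covered by that allowance. The open question in this thread is whether `printf`, which is not specified in terms of `va_arg`, gets the same treatment.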

6 Answers

17

I believe it is technically undefined, because the "correct type" for %x is specified as unsigned int - and as you point out, there is no exception for signed/unsigned mismatch here.

The rules for printf are for a more specific case and thus override the rules for the general case (for another example of the specific overriding the general, it's allowable in general to pass NULL to a function expecting a const char * argument, but it's undefined behaviour to pass NULL to strlen()).

I say "technically", because I believe an implementation would need to be intentionally perverse to cause a problem for this case, given the other restrictions in the standard.

caf
  • I think this interpretation implies that the standard intends the `printf` family of functions to have their arguments passed in a different way than other variadic functions, which would make no sense. – Chris Lutz Jan 12 '11 at 01:11
  • 2
    @Chris Lutz: This interpretation implies nothing about the *intent* of the standard, it merely puts forward a line of argument about the effect of the actual normative wording of the standard. – caf Jan 12 '11 at 01:30
  • "it's undefined behaviour to pass NULL to strlen())." But that isn't making a special case; it's undefined behaviour to dereference a null pointer, and strlen dereferences the pointer it is given. The act of passing null to strlen() isn't undefined, although it results in an UB-causing action later with certainty. – Karl Knechtel Jan 12 '11 at 04:52
  • 2
    @Karl: actually it is the act of passing `NULL` to `strlen` that's undefined. This is because standard library functions are defined formally by their behavior and not by a C implementation. See 7.4.1/1: "If an argument to a function has an invalid value (such as a value outside the domain of the function, or a pointer outside the address space of the program, or a null pointer, or a pointer to non-modifiable storage when the corresponding parameter is not const-qualified) or a type (after promotion) not expected by a function with variable number of arguments, the behavior is undefined." – R.. GitHub STOP HELPING ICE Jan 12 '11 at 07:39
  • 1
    @caf: In the last few years, implementations have become increasingly intentionally perverse. I don't think programming in C will be safe unless or until someone writes a standard which establishes helpful normative behaviors and requires perverse compilers to document departures from the norm. – supercat Apr 23 '15 at 22:22
8

No, because %x formats an unsigned int, and the type of the constant expression 1 is int, while the value of it is expressible as an unsigned int. The operation is not UB.

Jonathan Grynspan
  • 2
    It formats both. :) The variadic argument spec overrides the printf spec, and the former allows for the use of int where unsigned int is expected. – Jonathan Grynspan Jan 12 '11 at 00:25
  • 1
    Actually, "%x" takes an "unsigned int", not an "int", argument. R. is wondering if the various details he quotes from the standard means that this is, technically speaking, undefined behavior. – Michael Burr Jan 12 '11 at 00:27
  • @Jonathan - I agree with your reading of the standard. If that was in your answer, I would upvote you. – Chris Lutz Jan 12 '11 at 00:32
  • 5
    6.5.2.2 defines the behavior in general for variadic functions, but 7.19.6.1 turns around and says that unless the type matches the format specifier, the behavior is undefined. It seems like this paragraph should be omitted or fixed to mention the exception for signed/unsigned mismatch if that's the intent. – R.. GitHub STOP HELPING ICE Jan 12 '11 at 00:32
  • 1
    @R.. - I'm assuming that by "If any argument is not the correct type" they mean "If any argument is not the correct type based on the previously outlined rules for type punning." – Chris Lutz Jan 12 '11 at 00:34
  • Edited to clarify my justification and to correct the type %x strictly expects. – Jonathan Grynspan Jan 12 '11 at 00:35
  • 11
    Default argument promotion will not normally cause `int` arguments to be converted to `unsigned int`, so the fact that 1 must be expressible as an `unsigned int` is irrelevant. If `printf` were guaranteed to use the `va_arg` macro then you would expect the exception in 7.15.1.1 to hold, but this is not a requirement. The type of the argument after default argument promotion is still `int`, not `unsigned int` and (as others have said) 7.19.6.1 clearly states: "If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined." – CB Bailey Jan 12 '11 at 08:36
  • True for most types, but for a signed integer type and its corresponding unsigned integer type where the value is representable by both, 6.5.2.2 allows it. (From a practical perspective, this is always true anyway, but from a standards perspective it appears to be explicitly defined.) – Jonathan Grynspan Jan 12 '11 at 13:10
  • @JonathanGrynspan 6.5.2.2 does not say anything of the sort. In fact, C11 6.5.2.2/7 explicitly says "The ellipsis notation in a function prototype declarator causes argument type conversion to stop after the last declared parameter". – M.M Mar 09 '16 at 01:03
  • 3
    FWIW my view is that this code *should* be legal but the standard does not define it, and I consider the standard defective here. A blatant inconsistency can be seen by looking at the `l` modifier specification, which clearly defines that `"%lx"` may correspond to arguments `1L` and `-1L` ! – M.M Mar 09 '16 at 01:05
4

It is undefined behavior, for the same reason that re-interpreting a pointer to an integer type as a pointer to the corresponding type of opposite signedness is. Unfortunately, this isn't allowed in either direction, because a valid representation in one type may be a trap representation in the other.

The only reason I see that a signed-to-unsigned re-interpretation could yield a trap representation is the perverse case of a sign representation in which the unsigned type just masks out the sign bit. Unfortunately, such a thing is allowed by 6.2.6.2 of the standard. On such an architecture, all negative values of the signed type may be trap representations of the unsigned type.

In your example case this is even weirder, since 1 is not allowed to be a trap representation for the unsigned type. So to make it a "real" example, you'd have to ask your question with -1.

I don't think there is still any architecture with these features for which people write C compilers, so life would definitely become easier if a newer version of the standard abolished this nasty case.

Jens Gustedt
  • 3
    I'm not convinced this is allowed by the standard. As far as I know, values representable in both signed and unsigned versions of the type are required to have the same representation. Note that the aliasing rules in "Representation of Types" explicitly allow access as a sign-mismatched type. – R.. GitHub STOP HELPING ICE Jan 12 '11 at 18:31
  • @R.. Just look it up in the standard. It explicitly states that the number of value bits of the signed type is *less or equal* to that number of the unsigned type. And in particular that a negative signed value may be a trap representation of the unsigned type is also allowed. And you are probably right for the aliasing rules. So *this* needs a defect report. – Jens Gustedt Jan 12 '11 at 18:59
  • I agree with what you just said. However, that doesn't contradict a requirement that positive values of the signed type must agree in representation with the same values for the unsigned type - a requirement which I believe is intended to be there and implied by other conditions even if not explicitly stated. – R.. GitHub STOP HELPING ICE Jan 12 '11 at 20:17
  • 1
    @R.. In fact it is explicitly stated that positive values as long as they fit in both types must have the same representation. I had already corrected my answer accordingly. – Jens Gustedt Jan 12 '11 at 21:24
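The pointer re-interpretation discussed in this answer and its comments can be sketched as follows. The helper name `read_as_unsigned` is hypothetical. The sketch relies on the point R.. raises: the aliasing rules in 6.5 paragraph 7 list the signed or unsigned type corresponding to the object's effective type among the permitted access types. For values that fit in both types the result is the same value; for negative values the outcome depends on the representation, which is the case the answer is worried about.

```c
#include <assert.h>
#include <stddef.h>

/* read_as_unsigned is a hypothetical helper.
   It reads an int object through an lvalue of type unsigned int. */
unsigned int read_as_unsigned(const int *p)
{
    assert(p != NULL);
    /* Accessing an int through the corresponding unsigned type is among
       the accesses that the aliasing rules permit. */
    return *(const unsigned int *)p;
}
```

With `int one = 1;`, the call `read_as_unsigned(&one)` yields `1u`, in line with the requirement that values representable in both types share a representation.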
0

I believe it's undefined. Functions with a variable-length argument list don't apply an implicit conversion to the parameter type when accepting arguments, so 1 won't be converted to unsigned int when being passed to printf(), causing undefined behavior.

nalzok
0

TL;DR it is not UB.

As n. 'pronouns' m. pointed out in this answer, the C standard says that all non-negative values of a signed integer type have exactly the same representation as in the corresponding unsigned type, and can therefore be used interchangeably as long as the value is in the range of both types.

From the C99 standard 6.2.5 Types - Paragraph 9 and Footnote 31:

9 The range of nonnegative values of a signed integer type is a subrange of the corresponding unsigned integer type, and the representation of the same value in each type is the same. 31)

31) The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions.

The exact same text is in the C11 Standard in 6.2.5 Types - Paragraph 9 and Footnote 41.
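Footnote 31 mentions members of unions as one place where the interchangeability is meant to hold, which gives a small way to illustrate the quoted paragraph. The union type `int_pair` and the function `via_union` are hypothetical names used only for this sketch, and the precondition restricts it to the non-negative values that paragraph 9 covers.

```c
#include <assert.h>

/* int_pair is a hypothetical union overlaying an int and an unsigned int. */
union int_pair {
    int s;
    unsigned int u;
};

/* via_union is a hypothetical helper.
   It stores a non-negative int and reads it back as unsigned int. */
unsigned int via_union(int value)
{
    union int_pair p;

    /* 6.2.5 paragraph 9 only speaks about non-negative values. */
    assert(value >= 0);
    p.s = value;
    /* Same representation in both types, so the same value is read back. */
    return p.u;
}
```

This shows the representation guarantee only; whether it carries over to arguments of `printf` is what the footnote is being cited to argue.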

-1

The authors of the Standard do not generally try to explicitly mandate behavior in every imaginable corner case, especially when there is an obvious correct behavior which is shared by 100% of implementations, and there is no reason to expect any implementation to do anything else. Despite the Standard's explicit requirement that signed and unsigned types have matching memory representations for values that fit in both, it would be theoretically possible for an implementation to pass them to variadic functions differently. The Standard doesn't forbid such behavior, but I see no evidence of the authors intentionally permitting it. Most likely, they simply didn't consider such a possibility since no implementation had ever (and, so far as I know, none ever has) worked that way.

It would probably be reasonable for a sanitizing implementation to squawk if code uses %x on a signed value, though a quality sanitizing implementation should also provide an option to silently accept such code. There's no reason for sane implementations to do anything other than either process the passed value as unsigned or squawk if it's used in a diagnostic/sanitizing mode. While the Standard might not forbid an implementation from regarding as unreachable any code that uses %x on a signed value, anyone who thinks implementations should avail themselves of such freedom should be recognized as a moron.

Programmers who are targeting exclusively sane non-diagnostic implementations shouldn't need to worry about adding casts when outputting things like "uint8_t" values, but those whose code might be fed to moronic implementations might want to add such casts to prevent compilers from the "optimizations" such implementations might impose.
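The cast mentioned for `uint8_t` values can be sketched like this. The helper name `format_byte` is hypothetical. A `uint8_t` argument is promoted to `int` by the default argument promotions, so without the cast `%x` would receive a signed `int`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* format_byte is a hypothetical helper.
   It writes byte as two lowercase hexadecimal digits into buf. */
int format_byte(char *buf, size_t size, uint8_t byte)
{
    /* Two digits plus the terminating null character. */
    assert(buf != NULL && size >= 3);
    /* The explicit cast makes the argument unsigned int rather than the
       int that the default argument promotions would otherwise produce. */
    return snprintf(buf, size, "%02x", (unsigned int)byte);
}
```

On the implementations described above as sane, the cast changes nothing observable; it only removes the type mismatch from the source.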

supercat
  • 1
    This answer reads like it was written without consideration of anything that's already been written/discussed on the topic. I'm not the downvoter but I'm not surprised someone did. For normal variadic functions written in C (as opposed to abstract ones just specified by the standard), the behavior **is** well-defined when passing a signed 1 to a function expecting an unsigned argument. The question is very specific to `printf` and family, which are not specified in terms of `va_arg`. – R.. GitHub STOP HELPING ICE Jun 17 '17 at 00:35
  • @R..: Perhaps I should adjust my answer to make it more printf-centric, but the main point is that the authors of the Standard would have had no reason to expect that implementations would do anything with positive signed values other than treat them the same as the corresponding unsigned values, and thus saw no reason to explicitly mandate such behavior for printf. If the authors of the C Standard wanted to avoid breaking existing code (which is their claim), they would have intended that Undefined Behavior be taken as an invitation for implementers to *exercise reasonable judgement*... – supercat Jun 17 '17 at 18:00
  • ...about how something should be processed, based upon a variety of factors. The only real question is whether all implementations should be relied upon to be the product of reasonable judgments. – supercat Jun 17 '17 at 18:06