42

Any C programmer who's been working for more than a week has encountered crashes that result from calling printf with more format specifiers than actual arguments, e.g.:

printf("Gonna %s and %s, %s!", "crash", "burn");

However, are there any similar bad things that can happen when you pass too many arguments to printf?

printf("Gonna %s and %s!", "crash", "burn", "dude");

My knowledge of x86/x64 assembly leads me to believe that this is harmless, though I'm not convinced that there's not some edge condition I'm missing, and I have no idea about other architectures. Is this condition guaranteed to be harmless, or is there a potentially crash-inducing pitfall here, too?

JSBձոգչ
  • 3
    Not an answer to your question, stacker's is correct, but for the crashes. `gcc` should give good warnings on that, so there is really no excuse for overlooking that one ;-) – Jens Gustedt Aug 26 '10 at 20:09
  • How could GCC give warnings for that? Consider that the format string doesn't necessarily have to be a constant string. It can be any `char *` – Nathan Fellman Aug 26 '10 at 20:24
  • 2
    GCC can give good warnings when it can know the format string at compile time. Since that represents a large swath of the rational use cases for `printf` and friends, those warnings are valuable and should be heeded. – RBerteig Aug 26 '10 at 21:00
  • gcc can also give warnings when the format string is not a string literal, if I remember correctly. This is to catch stupid things like `printf(mystring);`. – R.. GitHub STOP HELPING ICE Aug 26 '10 at 23:43

5 Answers

42

Online C Draft Standard (n1256), section 7.19.6.1, paragraph 2:

The fprintf function writes output to the stream pointed to by stream, under control of the string pointed to by format that specifies how subsequent arguments are converted for output. If there are insufficient arguments for the format, the behavior is undefined. If the format is exhausted while arguments remain, the excess arguments are evaluated (as always) but are otherwise ignored. The fprintf function returns when the end of the format string is encountered.

The other *printf() functions treat excess arguments the same way, except for vprintf() and its relatives (obviously), which take a va_list rather than a variable argument list.
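To make the "evaluated (as always) but otherwise ignored" wording concrete, here is a minimal sketch. The names `noisy`, `format_with_extra` and `demo_extra_args` are made up for illustration, and `snprintf` is used instead of `printf` only so the result can be checked in code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static int calls = 0;   /* counts how many argument expressions were evaluated */

/* Hypothetical helper: hands back its argument and records the call. */
static const char *noisy(const char *s)
{
    calls++;
    return s;
}

/* Supplies three arguments regardless of how many conversions fmt contains. */
static int format_with_extra(char *buf, size_t size, const char *fmt)
{
    return snprintf(buf, size, fmt,
                    noisy("crash"), noisy("burn"), noisy("dude"));
}

static void demo_extra_args(void)
{
    char buf[64];
    int n;

    calls = 0;
    n = format_with_extra(buf, sizeof buf, "Gonna %s and %s!");
    assert(strcmp(buf, "Gonna crash and burn!") == 0);
    assert(n == 21);        /* length of the formatted text */
    assert(calls == 3);     /* "dude" was evaluated, then ignored */
}
```

Calling `demo_extra_args()` from `main` should pass all three assertions: the third argument is computed, but it never shows up in the output.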

John Bode
  • 1
    Does this apply to user-written variadic functions? – user253751 Mar 17 '16 at 02:33
  • @immibis: The main reason it might not be is that some calling conventions have the callee pop args off the stack as it returns. This rule essentially requires the implementation to support the general case, or have a lot of special-purpose support for `printf`. Most systems with a callee-pops calling convention don't use it for variadic functions, for this reason. (And also that x86's `ret imm16` instruction requires the number of bytes to pop to be a compile-time constant). `printf` is pretty much worst-case for the callee's ability to detect the number of args: polymorphic, no sentinel – Peter Cordes May 25 '16 at 17:45
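On the question of user-written variadic functions: as far as I can tell, nothing in the standard's description of `<stdarg.h>` requires a function to read every argument it was given. Only reading more than were passed is undefined. A minimal sketch, with a made-up `sum_first` helper that stops after `count` values:

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: sums the first `count` int arguments and never
   looks at anything passed after them. */
static int sum_first(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

static void demo_sum_first(void)
{
    assert(sum_first(2, 10, 20, 30, 40) == 30);  /* 30 and 40 are never read */
    assert(sum_first(0, 99) == 0);               /* nothing is read at all */
}
```

This assumes the platform's calling convention for variadic functions lets the caller clean up, which is what the comment above describes as the usual case.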
19

You probably know the prototype for the printf function as something like this

int printf(const char *format, ...);

With Microsoft's compilers, a more complete version of that would actually be

int __cdecl printf(const char *format, ...);

The __cdecl defines the "calling convention", which, among other things, describes how arguments are handled. In this case it means that arguments are pushed onto the stack and that the stack is cleaned up by the function making the call.

One alternative to __cdecl is __stdcall; there are others. With __stdcall, the convention is that arguments are pushed onto the stack and cleaned up by the function that is called. However, as far as I know, a __stdcall function cannot accept a variable number of arguments. That makes sense, since it wouldn't know how much stack to clean up.

The long and the short of it is that with __cdecl functions it's safe to pass however many arguments you want, since the cleanup is performed by the code making the call. If you were somehow to pass too many arguments to a __stdcall function, it would corrupt the stack. One way this could happen is calling the function through the wrong prototype.

More information on calling conventions can be found on Wikipedia here.

kingledion
torak
  • 6
    __cdecl is a Win32ism, created by the fact that some old DOS compilers supported both C and pascal calling conventions. – ninjalj Aug 26 '10 at 21:46
  • 2
    @ninjalj, `__cdecl` is only supported by MS compilers, but the general note about calling conventions is valid for all OSs. – JSBձոգչ Aug 26 '10 at 21:48
  • @JSBangs: __cdecl was also supported by Borland compilers IIRC. Also, on most other OSes C uses only the C calling convention (right to left, caller cleans stack), possibly with variants for ISRs, compatibility with other compilers, and/or saving the args on registers (GCC's regparm). AFAIK, Win32 is the only platform where you can select a calling convention that does not support vararg functions. – ninjalj Aug 26 '10 at 21:56
  • 2
    @ninjalj: The 68000-based Macintosh operating system used "Pascal" calling convention (called function pops stack) for almost everything. A little ironic, actually, since on the 68000 that calling convention would require a sequence like: "mov.l (A7+),A0 / addq #4,A7 / jmp (A0)", whereas the C calling convention would allow use of the "RETURN" instruction. – supercat Aug 26 '10 at 22:05
  • @supercat: s/only platform/only still-in-active-use platform/ on my previous comment. – ninjalj Aug 26 '10 at 22:09
  • 11
    -1 for presenting MS-isms as if they were part of the C language. – R.. GitHub STOP HELPING ICE Aug 26 '10 at 23:46
  • 1
    according to [`__stdcall` documentation](https://learn.microsoft.com/vi-vn/cpp/cpp/stdcall?view=msvc-170), *the compiler makes `vararg` functions `__cdecl`*. – Giovanni Cerretani Oct 14 '22 at 08:49
4

All the arguments are pushed onto the stack and removed when the stack frame is removed. This behaviour is independent of any specific processor. (I can only remember one mainframe, designed in the 70s, that had no stack.) So yes, the second example won't fail.

MartyIX
stacker
3

printf is designed to accept any number of arguments. It reads the format string (the first argument) and pulls arguments from the argument list as each conversion needs them. This is why too few arguments crash: the code simply starts using non-existent arguments, accessing memory that doesn't exist, or doing some other bad thing. But with too many arguments, the extras are simply ignored: the format string uses fewer arguments than were passed in.
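A rough sketch of that "pull arguments as needed" loop, not how any real C library implements printf. `mini_format` is a made-up function that understands only `%s`, `%d` and `%%`, and writes into a caller-supplied buffer:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, heavily simplified formatter. It pulls one argument per
   conversion and stops at the end of the format string, so any arguments
   left over are never touched. Returns the number of characters stored. */
static int mini_format(char *out, size_t size, const char *fmt, ...)
{
    va_list ap;
    size_t used = 0;

    if (size == 0)
        return 0;

    va_start(ap, fmt);
    for (const char *p = fmt; *p != '\0' && used + 1 < size; p++) {
        int n = 0;
        size_t room = size - used;      /* includes space for the '\0' */

        if (*p != '%') {                /* ordinary character: copy it */
            out[used++] = *p;
            continue;
        }
        p++;                            /* look at the conversion letter */
        if (*p == 's')
            n = snprintf(out + used, room, "%s", va_arg(ap, const char *));
        else if (*p == 'd')
            n = snprintf(out + used, room, "%d", va_arg(ap, int));
        else if (*p == '%')
            n = snprintf(out + used, room, "%%");
        else
            break;                      /* unsupported or truncated format */
        if (n < 0)
            break;
        used += ((size_t)n < room) ? (size_t)n : room - 1;
    }
    va_end(ap);
    out[used] = '\0';
    return (int)used;
}

static void demo_mini_format(void)
{
    char buf[64];
    int n = mini_format(buf, sizeof buf, "Gonna %s and %s!",
                        "crash", "burn", "dude");

    assert(strcmp(buf, "Gonna crash and burn!") == 0);
    assert(n == 21);                    /* "dude" was never fetched */
}
```

With too few arguments the same loop would call `va_arg` for a value that was never passed, which is where the undefined behaviour comes from.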

Ned Batchelder
  • To add to this, your compiler might also eliminate the extra parameters if it can detect that they are unused. You would have to look at the assembly output to tell if the extra params are really passed to `printf` or if they get optimized away. – bta Aug 26 '10 at 20:38
  • If the compiler knows for sure that it is calling something that uses printf's format language (and GCC has an attribute for that which can be used to decorate your own printf-like functions) then it is in principle safe to do this optimization. It would still have to act as if it had computed all the parameters in case any of the unused ones happened to have side effects. – RBerteig Aug 26 '10 at 21:04
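The GCC attribute mentioned in the last comment is `format`. A small sketch of decorating your own printf-like function with it follows. `log_line` and the `PRINTF_LIKE` macro are names invented for this example, and the macro expands to nothing on compilers that don't define `__GNUC__`.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#if defined(__GNUC__)
/* fmt_idx: position of the format string; first_idx: first variadic arg. */
#define PRINTF_LIKE(fmt_idx, first_idx) \
    __attribute__((format(printf, fmt_idx, first_idx)))
#else
#define PRINTF_LIKE(fmt_idx, first_idx)
#endif

/* Hypothetical wrapper: formats into `out` exactly as vsnprintf would. */
static int log_line(char *out, size_t size, const char *fmt, ...)
    PRINTF_LIKE(3, 4);

static int log_line(char *out, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(out, size, fmt, ap);
    va_end(ap);
    return n;
}

static void demo_log_line(void)
{
    char buf[32];
    int n = log_line(buf, sizeof buf, "%s=%d", "x", 42);

    assert(strcmp(buf, "x=42") == 0);
    assert(n == 4);
}
```

With the attribute in place, gcc and clang should be able to check literal format strings passed to `log_line` the same way they check `printf`, including the too-many-arguments case when format warnings are enabled.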
-1

Comment: both gcc and clang produce warnings:

$ clang main.c 
main.c:4:29: warning: more '%' conversions than data arguments [-Wformat]
  printf("Gonna %s and %s, %s!", "crash", "burn");
                           ~^
main.c:5:47: warning: data argument not used by format string 
                      [-Wformat-extra-args]
  printf("Gonna %s and %s!", "crash", "burn", "dude");
         ~~~~~~~~~~~~~~~~~~                   ^
2 warnings generated.
jfs
  • 1
    The problem is mostly when you generate your own format string, which isn't known at compile time. – meneldal Nov 22 '16 at 05:46