19

Let's say I have one line of printf() with a long string:

printf("line 1\n"
       "line 2\n"
       "line 3\n"
       "line 4\n"
       "line 5\n"
       "line 6\n"
       "line 7\n"
       "line 8\n"
       "line 9\n.. etc");

What are the costs incurred by this style compared to having multiple printf()'s for each line?
Would there be a possible stack overflow if the string is too long?

haccks
eXPerience
  • 3
    Your C syntax is invalid. – Erich Kitzmueller Aug 11 '16 at 13:00
  • 27
    @ammoQ; Seems valid to me. – haccks Aug 11 '16 at 13:01
  • 1
    @ammoQ It is valid. – LPs Aug 11 '16 at 13:02
  • 6
    There would be a difference of function call overhead (calling a function multiple times). Unlikely to get stack overflow .... a string literal has static storage duration, and only the address of the first character will be passed to `printf()` for each call. Do whatever makes your code easier to understand for a human. – Peter Aug 11 '16 at 13:03
  • 3
    Stack overflow is not the matter here, because string literals are `const char` and chars of strings are pushed into "read-only-memory" section: static storage. – LPs Aug 11 '16 at 13:04
  • 1
    [C String literals: Where do they go?](http://stackoverflow.com/questions/2589949/c-string-literals-where-do-they-go) – Ivan Aksamentov - Drop Aug 11 '16 at 13:04
  • 1
    It is more a matter of style than anything else... `printf` takes its string arguments as `const char *` so no copy is involved at that moment. Choose the style that will be more readable, or more coherent with your whole program(s). – Serge Ballesta Aug 11 '16 at 13:04
  • @LPs - string literals are const array of char, not pointer to char. – Peter Aug 11 '16 at 13:05
  • @Peter Yes, typo. Edited. Tx. – LPs Aug 11 '16 at 13:05
  • 2
    Calling *printf* many times has the overhead of calling a function many times. That's it. The length of the string doesn't change the whole complexity. – Déjà vu Aug 11 '16 at 13:07
  • Thank you for the explanations and links ! – eXPerience Aug 11 '16 at 13:13
  • 7
    Mind your implementation limits. The [C11 Standard requires implementations to accept string literals of up to 4095 characters](http://port70.net/~nsz/c/c11/n1570.html#5.2.4.1). Any more than that and you're into implementation-defined territory. – pmg Aug 11 '16 at 13:54
  • @Peter: string literals have type `char[N]` with `N` large enough for all the characters including the `'\0'` terminator. They are not `const` though they are *read-only*. – pmg Aug 11 '16 at 13:56
  • @pmg; True. It should be within the translation limits. Thanks for pointing it out. – haccks Aug 11 '16 at 14:02
  • 1
    Extending pmg's comment, the limits used to be much, much lower. Don't recall exactly what they were and it is merely historical curiosity but it seems that you used to run into problems around 400 characters, so multiple calls were necessary. – William Pursell Aug 11 '16 at 14:04
  • Better to use `fputs()` than `printf()` to cope with strings containing a `'%'`. – chux - Reinstate Monica Aug 11 '16 at 14:49
  • @WilliamPursell: for C89 I believe the limit was 509 characters (look in [http://flash-gordon.me.uk/ansi.c.txt](http://flash-gordon.me.uk/ansi.c.txt)) – pmg Aug 11 '16 at 14:57
  • I stand corrected :/ – Erich Kitzmueller Aug 11 '16 at 15:32
  • In addition to what others said: if you have one `printf` call, the whole string will be printed at once, otherwise, the output of your program may easily become a mess if you're running multiple threads that do output and don't use mutexes. – ForceBru Aug 11 '16 at 15:54
  • 1
    Code like this will be more readable if you align all the string literals under the first one. – zwol Aug 11 '16 at 16:49
  • 1
    @ForceBru that is not likely to be an issue. The data will be internally buffered, and whether multiple printfs are called or one, each will (most likely) use exactly the same number of writes. – William Pursell Aug 11 '16 at 19:05

5 Answers

16

what are the costs incurred by this style compared to having multiple printf()'s for each line ?

Multiple printf calls result in multiple function calls, and that is the only overhead.

Would there be a possible stack overflow if the string is too long?

No stack overflow in this case. String literals are generally stored in read only memory, not in stack memory. When a string is passed to printf then only a pointer to its first element is copied to the stack.

The compiler will treat this multi-line string

"line 1\n"
"line 2\n"
"line 3\n"
"line 4\n"
"line 5\n"
"line 6\n"
"line 7\n"
"line 8\n"
"line 9\n.. etc"

as a single string

"line 1\nline 2\nline 3\nline 4\nline 5\nline 6\nline 7\nline 8\nline 9\n.. etc"  

and this will be stored in a read-only section of memory.

But note that (as pointed out by pmg in a comment) the C11 standard, section 5.2.4.1 Translation limits, says that

The implementation shall be able to translate and execute at least one program that contains at least one instance of every one of the following limits:
[...]
- 4095 characters in a string literal (after concatenation)
[...]

Community
haccks
10

C concatenates adjacent string literals, that is, literals separated by nothing or only by whitespace. So the following

printf("line 1\n"
       "line 2\n"
       "line 3\n"
       "line 4\n"
       "line 5\n"
       "line 6\n"
       "line 7\n"
       "line 8\n"
       "line 9\n.. etc");

is perfectly fine and reads well. A single printf call also has less call overhead than 9 printf calls.

sjsam
9

printf is a slow function if you are only outputting constant strings, because printf has to scan each and every character for a format specifier (%). Functions like puts are significantly faster for long strings because they can basically just memcpy the input string into the output I/O buffer.

Many modern compilers (GCC, Clang, probably others) have an optimization that automatically converts printf into puts if the input string is a constant string with no format specifiers that ends with a newline. So, for example, compiling the following code:

printf("line 1\n");
printf("line 2\n");
printf("line 3"); /* no newline */

results in the following assembly (Clang 703.0.31, cc test.c -O2 -S):

...
leaq    L_str(%rip), %rdi
callq   _puts
leaq    L_str.3(%rip), %rdi
callq   _puts
leaq    L_.str.2(%rip), %rdi
xorl    %eax, %eax
callq   _printf
...

in other words, puts("line 1"); puts("line 2"); printf("line 3");.

If your long printf string does not end with a newline, then your performance could be significantly worse than if you made a bunch of printf calls with newline-terminated strings, simply because of this optimization. To demonstrate, consider the following program:

#include <stdio.h>

#define S "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
#define L S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S
/* L is a constant string of 4000 'a's */

int main() {
    int i;
    for(i=0; i<1000000; i++) {
#ifdef SPLIT
        printf(L "\n");
        printf(S);
#else
        printf(L "\n" S);
#endif
    }
}

If SPLIT is not defined (producing a single printf with no terminating newline), the timing looks like this:

[08/11 11:47:23] /tmp$ cc test.c -O2 -o test 
[08/11 11:47:28] /tmp$ time ./test > /dev/null

real    0m2.203s
user    0m2.151s
sys     0m0.033s

If SPLIT is defined (producing two printfs, one with a terminating newline, the other without), the timing looks like this:

[08/11 11:48:05] /tmp$ time ./test > /dev/null

real    0m0.470s
user    0m0.435s
sys     0m0.026s

So you can see that, in this case, splitting the printf into two parts actually produces a speedup of more than 4x (2.203s versus 0.470s). Of course, this is an extreme case, but it illustrates how printf may be optimized differently depending on the input. (Note that using fwrite is even faster, at 0.197s, so you should consider using that if you really want speed!)

tl;dr: if you are printing only large, constant strings, avoid printf entirely and use a faster function like puts or fwrite.

nneonneo
4

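With some compilers (GCC and Clang, for example), a printf call whose string contains no format specifiers and ends with a newline is silently replaced by (that is, optimized into) a puts call. This is already a speedup. You don't really want to lose that by calling printf/puts multiple times.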

GCC has printf (among others) as a builtin, so it can optimize the calls during compile time.

See:

Koshinae
  • Do you have any source on this? I'd like to read up on more of it. Thanks. – eXPerience Aug 11 '16 at 13:55
  • I found out about this while tracing a binary, and wondered why does it call puts when I specifically printf'd the info. – Koshinae Aug 11 '16 at 14:11
  • 2
    This is not true. It may be true for a specific _compiler_. gcc for arm will for example replace all operations with the corresponding arm instructions - of course, that is completely irrelevant and merely an implementation detail, just like printf optimization. – pipe Aug 11 '16 at 14:48
  • 1
    This is a compiler specific optimization in almost all compilers. The OP asked about this. – Koshinae Aug 11 '16 at 14:50
1

Each additional printf (or puts if your compiler optimizes it that way) will incur the system-specific function call overhead each time, though the optimizer might combine them anyhow.

I have yet to see a printf implementation that was a leaf function, so expect additional function call overheads for something like vfprintf and its callees.

Then you'll likely have some sort of system call overhead for each write. Since printf uses stdout, which is buffered, some of these (really costly) context switches could normally be avoided... except that all of the examples above end with newlines, and a line-buffered stdout (typical when it is attached to a terminal) is flushed at each newline. Most of your cost will probably be here.

If you are really worried about cost in your main thread, move this kind of stuff to a separate thread.

technosaurus
  • Compilers do not optimize this out, don't know why this assumption would be made https://godbolt.org/z/WKNf0X – ericcurtin Oct 21 '19 at 12:31