1

I'm trying to see the disassembled binary of a simple C program in gdb.

C program:

#include <stdio.h>

int main(){
        int i = 2;
        if (i == 0){
                printf("YES, it's 0!\n");
        }else{
                printf("NO");
        }
        return 0;
}

The disassembled instructions:

   0x0000000100401080 <+0>:     push   rbp
   0x0000000100401081 <+1>:     mov    rbp,rsp
   0x0000000100401084 <+4>:     sub    rsp,0x30
   0x0000000100401088 <+8>:     call   0x1004010e0 <__main>
   0x000000010040108d <+13>:    mov    DWORD PTR [rbp-0x4],0x2
   0x0000000100401094 <+20>:    cmp    DWORD PTR [rbp-0x4],0x0
   0x0000000100401098 <+24>:    jne    0x1004010ab <main+43>
   0x000000010040109a <+26>:    lea    rax,[rip+0x1f5f]        # 0x100403000
   0x00000001004010a1 <+33>:    mov    rcx,rax
   0x00000001004010a4 <+36>:    call   0x100401100 <puts>
   0x00000001004010a9 <+41>:    jmp    0x1004010ba <main+58>
   0x00000001004010ab <+43>:    lea    rax,[rip+0x1f5b]        # 0x10040300d
   0x00000001004010b2 <+50>:    mov    rcx,rax
   0x00000001004010b5 <+53>:    call   0x1004010f0 <printf>
   0x00000001004010ba <+58>:    mov    eax,0x0
   0x00000001004010bf <+63>:    add    rsp,0x30
   0x00000001004010c3 <+67>:    pop    rbp
   0x00000001004010c4 <+68>:    ret
   0x00000001004010c5 <+69>:    nop

And I suppose the instruction,

0x00000001004010a4 <+36>:    call   0x100401100 <puts>

corresponds to

printf("YES, it's 0!\n");

Assuming that is correct, my question is: why is <puts> called here, while <printf> is called at 0x00000001004010b5 <+53>: call 0x1004010f0 <printf>?

  • The compiler has decided that it is more optimal to call `puts` for the longer string and `printf` for the shorter one, probably because it knows something about their internal implementations. – Eugene Sh. Feb 18 '22 at 15:12
  • 3
    `printf` is a lot more expensive than `puts` for printing a plain, unformatted string. The compiler can figure out what you intend, and the as-if rule then allows it to substitute one call for the other. Note: `puts` can be substituted for the first call because that string ends with a newline, and appending a newline is part of the behavior of `puts`, so the switch is possible. The second string has no trailing newline and can't be swapped, because the output would change and that would violate the as-if rule. – Mgetz Feb 18 '22 at 15:13
  • 1
    It could be swapped for `fputs("NO", stdout)` but then the compiler would have to know what the macro expansion of `stdout` is, and the part of the compiler that does this kind of optimization is too far away from the preprocessor, even if they're in the same executable. – zwol Feb 18 '22 at 17:05
  • @adenosinetp10: the question has been closed as a duplicate, but you can still accept one of the answers by clicking on the grey checkmark below its score. – chqrlie Feb 18 '22 at 19:03

2 Answers

3

Using the semantics defined in the C Standard, printf("YES, it's 0!\n") produces the same output as puts("YES, it's 0!"), which may be more efficient as the string does not need to be analysed for replacements.

Since the return value is not used, the compiler can replace the printf call with the equivalent call to puts.

This type of optimisation was likely introduced as a way to reduce the executable size for the classic K&R program hello.c. Replacing the printf with puts avoids linking the printf code, which is substantially larger than that of puts. In your case, this optimisation is counterproductive for size, as both puts and printf end up linked. Modern systems use dynamic linking, however, so trying to reduce executable size this way is no longer meaningful.

You can play with compiler settings on this Godbolt compiler explorer page to observe compiler behavior:

  • even with -O0, gcc performs the printf / puts substitution, but clang does not. Both compilers generate code for both calls and do not optimise away the test if (i == 0), which is expected with optimisations disabled. I suspect the gcc team could not resist biasing size benchmarks even with optimisations disabled.

  • with -O1 and beyond, both compilers only generate code for the else branch, calling printf.

  • if you change the second string to just "N", printf is converted to a call to putchar, yet another optimisation.

chqrlie
  • 1
    That's a really nice answer @chqrlie – Darth-CodeX Feb 18 '22 at 15:55
  • It's worth noting that with optimization turned on only `printf` is linked because the other branch is unreachable. – Mgetz Feb 18 '22 at 16:32
  • @Mgetz: good point. I amended the answer with more information in this direction. – chqrlie Feb 18 '22 at 18:58
  • @chqrlie it's also worth noting that the linking and size complexity is far overshadowed by the parsing complexity of `printf`. `puts` is literally orders of magnitude faster in many cases because it doesn't allocate and doesn't have worse than linear complexity, whereas `printf` should be assumed to do all of those things even in vanilla cases like this because of stdio buffering. So honestly, the size argument falls flat; `puts` is just faster. – Mgetz Feb 18 '22 at 19:49
  • Compare [`vprintf`](https://elixir.bootlin.com/glibc/glibc-2.35.9000/source/stdio-common/vfprintf-internal.c#L1179) to [`puts`](https://elixir.bootlin.com/glibc/glibc-2.35.9000/source/libio/ioputs.c); they aren't even on the same planet in terms of complexity. – Mgetz Feb 18 '22 at 19:49
  • `i == 0` isn't "obviously" false if you just consider that statement in isolation, which is [what `-O0` implies: it gives consistent debugging](https://stackoverflow.com/questions/53366394/why-does-clang-produce-inefficient-asm-with-o0-for-this-simple-floating-point) even if you single-step and modify `i` between `int i=2;` and `if(i == 0)`. Or even if you jump to a different source line within the same function. `gcc -O0` does do dead-code elimination for `if ( 0 == 1 )` or whatever. (Last I checked, MSVC doesn't, and actually compares immediates at run-time.) – Peter Cordes Feb 18 '22 at 19:52
  • @PeterCordes: you are correct, I shall amend my answer. I was surprised to see that `-O0` does not optimize the `if (i == 0)` test, which is OK with optimisations disabled, but does convert the `printf` to `puts`. I suspect an attempt at biasing size benchmarks even with optimisations disabled. – chqrlie Feb 18 '22 at 20:52
  • GCC `-O0` still does some optimizations within statements, like still optimizing `x /= 10` to a multiplicative inverse (unlike at `-Os`). It's not like `-O3` within statements, though, for example asm peepholes like xor-zeroing aren't done until `-O2` enables `-fpeephole2`. But the key point here is that compilers absolutely can't optimize across statements at `-O0`, but GCC still does a fair amount within that limitation. – Peter Cordes Feb 18 '22 at 20:57
  • It's more like, GCC always transforms code through its internal representations, and when there's no need to disable cheap but profitable local optimizations, it doesn't. It still scans format strings for warning purposes at `-O0`. See also [Disable all optimization options in GCC](https://stackoverflow.com/a/33284629). If you wanted to disable this, you could use `-fno-builtin-printf` I think, but since you didn't it treats printf as a builtin with all its normal handling (warnings and optimization). – Peter Cordes Feb 18 '22 at 21:00
1

It's an optimization.

Calling printf with a format string that has no format specifiers and a trailing newline is equivalent to calling puts with the same string with the trailing newline removed.

Since printf has a lot of logic for handling format specifiers but puts just writes the string given, the latter will be faster. So in the case of the first call to printf the compiler sees this equivalence and makes the appropriate substitution.

dbush
  • Although, "A lot of logic" probably isn't all that much! `puts` has to examine the passed string a character at a time, looking for the `\0`. `printf` has to additionally check each character to see if it's a `%`, but if none are, it has no further "extra" work to do. – Steve Summit Feb 18 '22 at 15:48
  • 1
    @SteveSummit: stdio functions also have to scan for newline if the stream is line-buffered. glibc's implementation is rather inefficient for both. But `printf` definitely has more overhead, with `printf` calling `vfprintf` under the hood so it has to build a va_arg object and call another function, at least in glibc's implementation. And if puts didn't have to scan for a `\n`, it could just `strncpy` into the stdio buffer instead of inefficiently looping 1 char at a time with size checks at each character. Or `strlen`, either way taking advantage of glibc's SIMD implementations. – Peter Cordes Feb 18 '22 at 19:59