6

I've been learning assembly language by disassembling some C code. When I disassemble this basic C code with GDB:

#include <stdio.h>
void main(void) {
    printf("Hello World\n");
}

Among the assembly output, it gives this line:

0x08048424 <+25>:   call   0x80482e0 <puts@plt>

However, when I disassemble the code below, which passes an integer to the printf function:

#include <stdio.h>
void main(void) {
    int a = 1;
    printf("Hello Word %d\n", a);
}

It gives this line:

0x0804842e <+35>:   call   0x80482e0 <printf@plt>

What is the difference between printf@plt and puts@plt?

Why does the disassembler not recognize the printf function when there is no integer parameter?

chqrlie
  • The names? How they work? What they do? What is your problem? What have you found out by your search on the web and here or in the documentation of your compiler? How is that even related to gdb? And why should the disassembly show the symbol for `printf` if apparently `puts` is called? – too honest for this site Aug 17 '16 at 22:09
  • The whole point of standardized behaviour is that you can predict how your program will... behave. Compilers can do whatever they want in order to make the program behave as you expect. – Kerrek SB Aug 17 '16 at 22:36
  • near dup: [Compiler changes printf to puts](//stackoverflow.com/a/60080136) – Peter Cordes Feb 05 '20 at 17:28

3 Answers

14

In GCC, printf and puts are built-in functions, which means that the compiler has full knowledge of their semantics. In such cases the compiler is free to replace a call to one function with an equivalent call to another, if it thinks that doing so will produce better (faster and/or more compact) code.

puts is generally a more efficient function, since it does not have to parse and interpret a format string.

This is exactly what happened in your case. Your first call to printf does not really need any printf-specific features. The format string you supplied is trivial: it has no conversion specifiers in it, and it ends with a newline, which puts appends on its own. The compiler decided that your first call to printf is better served by an equivalent call to puts.

Meanwhile, your second call to printf makes non-trivial use of printf format string, i.e. it relies on printf-specific features.
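The rule described above can be sketched in C. This is a simplified illustration, not GCC's actual implementation: the helper names is_puts_candidate and puts_argument are hypothetical, and the real compiler recognizes more cases than this one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of GCC): reports whether a printf format
   string is "trivial" in the sense described above. */
bool is_puts_candidate(const char *fmt)
{
    assert(fmt != NULL);
    size_t len = strlen(fmt);

    /* Any '%' means printf has to interpret the string
       (a conversion such as %d, or an escaped %%). */
    if (strchr(fmt, '%') != NULL)
        return false;

    /* puts appends a newline itself, so the format must end with one. */
    return len > 0 && fmt[len - 1] == '\n';
}

/* Hypothetical helper: copies fmt without its final newline into out,
   giving the string that would be handed to puts. Returns out, or NULL
   if fmt is not a candidate or does not fit in out_size bytes. */
const char *puts_argument(const char *fmt, char *out, size_t out_size)
{
    assert(fmt != NULL && out != NULL);
    size_t len = strlen(fmt);

    if (!is_puts_candidate(fmt) || len > out_size)
        return NULL;

    memcpy(out, fmt, len - 1);   /* everything except the trailing '\n' */
    out[len - 1] = '\0';
    return out;
}
```

With these helpers, "Hello World\n" is a candidate and maps to the puts argument "Hello World", while "Hello Word %d\n" is rejected because of its %d conversion.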

(Some rather thorough research of this specific matter from 2005: http://www.ciselant.de/projects/gcc_printf/gcc_printf.html)

AnT stands with Russia
  • Well, they are not really "built-in". gcc just knows about their semantics. And that only if compiling for a hosted environment. – too honest for this site Aug 17 '16 at 22:11
  • @Olaf: "Built-in" is probably not a strictly defined term, but these functions have been officially referred to as "built-in" for quite a while already: https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html – AnT stands with Russia Aug 17 '16 at 22:13
  • @KeithThompson: Actually `printf` **is** in the list (do a search), `puts` too. – too honest for this site Aug 17 '16 at 22:18
  • @AnT: Ok, they state it like that. I find this confusing, though comparing with the gcc-special builtins. – too honest for this site Aug 17 '16 at 22:20
  • @Olaf: Ok, my mistake. I looked at the bulleted list starting with `__builtin_alloca` rather than searching for the name. But it seems odd to call it "built-in" if the compiler just generates a call. I suppose it means that its *declaration*, not its *definition*, is built-in, so the compiler can optimize calls. – Keith Thompson Aug 17 '16 at 23:47
  • @AnT: See my comment above (can't tag more than one person). – Keith Thompson Aug 17 '16 at 23:47
  • @KeithThompson: That was the reason for my initial comment. "Recognised functions" might be a better term. But I somewhat doubt they are really interested in changing the documentation here. *sigh* – too honest for this site Aug 18 '16 at 19:56
  • I'm deleting my previous incorrect comment. `printf` and `puts` *are* in the list of "built-in" functions, but it seems that they're "built-in" only in the sense that the compiler recognizes them and treats them specially, not that the compiler will implement them via something other than a `call` instruction. – Keith Thompson Aug 18 '16 at 22:05
4

I don't know about the @plt part, but printf and puts are simply two different standard library functions. printf takes a format string and zero or more other parameters, possibly of different types. puts takes just a string and prints it, followed by a newline. Consult any C reference for more information, or type

man 3 printf
man 3 puts

assuming you're on a Unix-like system with man pages installed. (man printf without the 3 will show you the printf command; you want the printf function.)

Your compiler is able to optimize the call

printf("Hello, world\n");

to the equivalent of:

puts("Hello, world");

because it knows what both functions do, so it can determine that they do exactly the same thing.
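A rough way to convince yourself of the equivalence is to capture both outputs in a buffer and compare them. The sketch below uses snprintf as a stand-in for printf's formatting and models puts as "the string followed by a newline"; the function names via_printf, via_puts and same_output are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* What printf("Hello, world\n") would write, captured in buf. */
int via_printf(char *buf, size_t size)
{
    assert(buf != NULL);
    return snprintf(buf, size, "Hello, world\n");
}

/* What puts("Hello, world") would write: the string plus a newline.
   This is a model of puts, not a call to it. */
int via_puts(char *buf, size_t size)
{
    assert(buf != NULL);
    return snprintf(buf, size, "%s\n", "Hello, world");
}

/* Returns nonzero when both captured outputs are byte-for-byte equal. */
int same_output(void)
{
    char a[32];
    char b[32];

    via_printf(a, sizeof a);
    via_puts(b, sizeof b);
    return strcmp(a, b) == 0;
}
```

The comparison only covers the text that gets written. The return values are a separate matter: printf returns the number of characters written, while puts only promises a non-negative value on success.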

It can't optimize

printf("Hello Word %d\n", a);

because the format string contains a %d conversion, so the output depends on the value of a rather than being a fixed string. (A compiler could in principle figure it out at higher optimization levels by observing that a is never modified after its initialization.)

The disassembler is merely showing you the code generated by the compiler.

(Incidentally, void main(void) is incorrect; use int main(void).)

Keith Thompson
2

The puts and printf functions appear to have the same address because you are looking at stubs, and not the real functions. These stubs make up the procedure linkage table (which is what the @plt suffix refers to); each one loads the real function's address from a slot in the global offset table and jumps to it.

I'm disassembling a program, and see that it has stubs for both printf and puts:

08048370 <printf@plt>:
 8048370:       ff 25 04 a0 04 08       jmp    *0x804a004
 8048376:       68 08 00 00 00          push   $0x8
 804837b:       e9 d0 ff ff ff          jmp    8048350 <_init+0x3c>

08048380 <puts@plt>:
 8048380:       ff 25 08 a0 04 08       jmp    *0x804a008
 8048386:       68 10 00 00 00          push   $0x10
 804838b:       e9 c0 ff ff ff          jmp    8048350 <_init+0x3c>

As you can see, the real functions are somewhere else, and these stubs are only generated for the functions that your program actually uses. If you have a program which calls only one function and then change it from printf to puts, it's not surprising that the one and only stub is at the same address. The program I just disassembled calls both printf and puts, and so has stubs for both, and consequently they have different addresses.
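The same point can be checked from C by taking the address of both functions within one program. This is only a sketch: fn_ptr, printf_and_puts_differ and show_addresses are names invented here, and whether the printed values are stub addresses or the library's own addresses depends on how the program is built and linked.

```c
#include <assert.h>
#include <stdio.h>

/* Generic function-pointer type, used only to compare addresses. */
typedef void (*fn_ptr)(void);

/* Returns nonzero when printf and puts have different addresses
   as seen from this program. */
int printf_and_puts_differ(void)
{
    fn_ptr p = (fn_ptr)printf;
    fn_ptr q = (fn_ptr)puts;
    return p != q;
}

/* Prints both addresses. Converting a function pointer to void * is not
   guaranteed by ISO C, but it works on POSIX systems. */
void show_addresses(void)
{
    assert(printf_and_puts_differ());
    printf("printf: %p\n", (void *)printf);
    printf("puts:   %p\n", (void *)puts);
}
```

Calling show_addresses() from main in a program that references both functions prints two different values, which matches the two separate stubs in the disassembly above.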

Kaz