
In his FAQ, Bjarne Stroustrup says that when compiled with gcc -O2, the C and C++ versions of a hello world program produce files of identical size.

Reference: http://www.stroustrup.com/bs_faq.html#Hello-world

I decided to try this. Here is the C version:

#include <stdio.h>

int main(int argc, char* argv[])
{
    printf("Hello world!\n");
    return 0;
}

And here is the C++ version:

#include <iostream>

int main(int argc, char* argv[])
{
    std::cout << "Hello world!\n";
    return 0;
}

Here I compile, and the sizes are different:

r00t@wutdo:~/hello$ ls
hello.c  hello.cpp
r00t@wutdo:~/hello$ gcc -O2 hello.c -o c.out
r00t@wutdo:~/hello$ g++ -O2 hello.cpp -o cpp.out
r00t@wutdo:~/hello$ ls -l
total 32
-rwxr-xr-x 1 r00t r00t 8559 Sep  1 18:00 c.out
-rwxr-xr-x 1 r00t r00t 8938 Sep  1 18:01 cpp.out
-rw-r--r-- 1 r00t r00t   95 Sep  1 17:59 hello.c
-rw-r--r-- 1 r00t r00t  117 Sep  1 17:59 hello.cpp
r00t@wutdo:~/hello$ size c.out cpp.out
   text    data     bss     dec     hex filename
   1191     560       8    1759     6df c.out
   1865     608     280    2753     ac1 cpp.out

Replacing std::endl with \n made the binary smaller. I figured something this simple would be inlined, and I am disappointed that it's not.

Also, wow, the optimized builds produce hundreds of lines of assembly output. I can write hello world with about 5 assembly instructions using sys_write, so what's up with all the extra stuff? Why does C put so much extra on the stack during setup? I mean, about 50 bytes of assembly vs. 8 KB of C: why?

Devolus
Jack

2 Answers


You're looking at a mix of information that's easily misinterpreted. The 8559 and 8938 byte file sizes are largely meaningless, since they're mostly headers with symbol names and other miscellaneous information for at least minimal debugging purposes. The somewhat meaningful numbers are the size(1) output you added later:

r00t@wutdo:~/hello$ size c.out cpp.out
   text    data     bss     dec     hex filename
   1191     560       8    1759     6df c.out
   1865     608     280    2753     ac1 cpp.out

You could get a more detailed breakdown by using the -A option to size, but in short, the differences here are fairly trivial.

What's more interesting is that Bjarne Stroustrup never mentioned whether he was talking about static or dynamic linking. In your case, both programs are dynamically linked, so the size differences have nothing to do with the actual size cost of stdio or iostream; you're just measuring the cost of the calling code, or (more likely, based on the other comments/answer) the base overhead of exception-handling support for C++. Now, there is a common claim that a statically linked C++ iostream-based hello world can be even smaller than a printf-based one, since the compiler can see exactly which overloaded versions of operator<< are used and optimize out unneeded code (such as expensive floating-point printing), whereas printf's use of format strings makes this difficult in the common case and impossible in general. However, I've never seen a C++ implementation where a statically linked iostream-based hello program could come anywhere close to being as small as, much less smaller than, a printf-based one in C.
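To make the "overloads vs. format strings" claim concrete, here is a rough sketch (not anyone's benchmark). The function names format_with_printf and format_with_stream are made up for illustration. It only shows where the formatting decision is made; it says nothing about the resulting binary sizes, which, as noted above, tend to favor printf in practice.

```cpp
#include <cstdio>
#include <sstream>
#include <string>

// Hypothetical helper: the format string is an ordinary run-time value,
// so in general the toolchain cannot tell which conversions (%d, %f, ...)
// will be needed and has to keep all of them in a static link.
std::string format_with_printf(const char* fmt, int value)
{
    char buf[64];
    std::snprintf(buf, sizeof buf, fmt, value);
    return buf;
}

// Hypothetical helper: each operator<< below is picked by overload
// resolution at compile time (one for const char*, one for int), so the
// set of formatting routines actually used is visible to the compiler.
std::string format_with_stream(int value)
{
    std::ostringstream out;
    out << "value=" << value;
    return out.str();
}
```

Both helpers produce the same text for the same input, e.g. format_with_printf("value=%d", 42) and format_with_stream(42); the difference is only in when the conversion routine is selected.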

R.. GitHub STOP HELPING ICE

I think he's treating the half kilobyte as a rounding error. Both are "9 kilobytes" and that's what you'll see in a typical file browser. They aren't exactly the same because, under the hood, the C and C++ libraries are quite different. If you're already familiar with your disassembler, you can see the details of the difference for yourself.

The "extra stuff" is for the sake of importing symbols from the standard library's shared library, and for handling C++ exceptions. Strangely enough, much of the GCC-compiled C executable is taken up by C++ exception handling tables. I've not figured out how to strip them using GCC.

endl is inlined, but it contains calls to print the \n character and flush the stream, which are not inlined. The difference in size is due to importing those from the standard library.
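As a small sketch of what that means: std::endl is specified to behave like writing a widened '\n' with put and then calling flush. The helper name newline_and_flush and the two wrapper functions below are made up for illustration; they use a string stream so the result can be compared directly.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: does by hand what std::endl is specified to do.
std::ostream& newline_and_flush(std::ostream& os)
{
    os.put(os.widen('\n')); // library call: write the newline character
    os.flush();             // library call: flush the stream buffer
    return os;
}

// Writes the greeting using std::endl and returns the captured text.
std::string with_endl()
{
    std::ostringstream out;
    out << "Hello world!" << std::endl;
    return out.str();
}

// Writes the same greeting using the hand-written equivalent.
std::string with_manual()
{
    std::ostringstream out;
    out << "Hello world!";
    newline_and_flush(out);
    return out.str();
}
```

with_endl() and with_manual() return the same string. A plain "\n" in the string literal skips the put and flush calls entirely, which is consistent with the smaller binary seen in the question.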

In truth, individual kilobytes seldom matter on any system with dynamically loaded libraries. Self-contained code, such as on an embedded system, would need to include the standard library functionality it uses, and the C++ standard library tends to be heavier than its C counterpart, <iostream> vs. <stdio.h> in particular.

Potatoswatter
  • _"I've not figured out how to strip them using GCC"_ `-fno-exceptions` usually prevents using exceptions. There's also a weak function to override in `libstdc++` that provides the terminate (`terminate_handler` or so) code for exception handling. – πάντα ῥεῖ Sep 02 '14 at 01:17
  • @πάνταῥεῖ Of course that's the first thing I tried :v) . Let me know if you find something that works. The issue isn't generating exceptions, it's providing support for calls to functions that use exceptions. – Potatoswatter Sep 02 '14 at 01:18
  • I would have to check my colleague's work; he did the trick of omitting the standard uncaught-exception handling code by overriding that weak function. I just don't remember the exact name for now. – πάντα ῥεῖ Sep 02 '14 at 01:20
  • @πάνταῥεῖ The cruft isn't executable code at all, it's the tables that tell the unwinding library how to interpret the stack. – Potatoswatter Sep 02 '14 at 01:22
  • If _the tables_ aren't referenced by any code, they won't be pulled in by the linker, will they? – πάντα ῥεῖ Sep 02 '14 at 01:23
  • @πάνταῥεῖ That's not how exception handling works. The tables are used by `libunwind`. I suggest you compile something like this "hello world," or even something simpler with no library calls, and then check the compiled size and run it through a disassembler. – Potatoswatter Sep 02 '14 at 01:25
  • I'm pretty sure my colleague did that trick well. The challenge was to use exceptions but remove any unnecessary overhead, and the main code-size bottleneck turned out to be that "`terminate_handler`" function. I'm not sure if this will help to reduce the code linked in via `libstdc++`. – πάντα ῥεῖ Sep 02 '14 at 01:31