I have heard that when a compiler compiles code, it creates a file containing instructions that a machine can execute. According to this video, a simple program like `int main(){ int i; i = 3; }` should, when compiled, produce a file that's only several bytes long. So why does clang compile this into a file that's several kilobytes long?

  • Read about binary formats like [ELF](https://en.wikipedia.org/wiki/Executable_and_Linkable_Format). You can also peek into the binary with tools like `objdump` to see the real contents. – brokenfoot Sep 23 '19 at 03:26
  • There are certain things the C standard requires an implementation to set up before calling `main()`, such as the standard input and output streams (`stdin`, `stdout`, etc.) and packing command-line arguments into a form that can be passed to `main()`. Those things involve extra actions (instructions to execute, data to access) on top of whatever `main()` does. And most implementations (compilers, linkers, etc.) default to embedding code for all of that in the executable without checking whether it is actually needed. – Peter Sep 23 '19 at 05:10
  • Looking at http://timelessname.com/elfbin/ might be interesting. It shows an attempt to build a smaller x86 ELF HelloWorld program. – Fryz Sep 23 '19 at 08:57
  • Possible duplicate of [size of executable files?](https://stackoverflow.com/questions/5535188/size-of-executable-files) – gstukelj Sep 23 '19 at 09:44

1 Answer

This is likely due to library and startup code that the linker binds into your executable (the `#include` directives themselves only bring in declarations), or to the compiler and linker including debugging information. An executable also contains a lot of OS-specific data that adds to its size; see this question for more detailed answers. If you're after a small executable, there are plenty of suggestions in the answers to this question.

EDIT: Reading more about it, the size comes down to C being a high-level language in the sense that it does not communicate directly with hardware, but rather talks to an operating system. Basically, `main` is not the entry point of your program, and a lot goes on before it is even called. I strongly recommend reading through this blog post and its follow-up and, above all, watching Matt Godbolt's insightful talk on the topic. These are all concerned mostly with gcc and GNU/Linux, but I think it's fair to assume that similar reasons apply to executable sizes on other operating systems as well.
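
One way to see for yourself that code runs before `main` is a startup hook. The sketch below is only an illustration and assumes GCC or Clang, since `__attribute__((constructor))` is a compiler extension and not standard C; the names `hook_ran`, `before_main` and `startup_hook_ran` are made up for this example.

```c
/* Set by the hook below; stays 0 if the hook never runs. */
static int hook_ran = 0;

/* GCC/Clang extension (not standard C): the runtime's startup code
   calls functions marked `constructor` before it calls main(). */
__attribute__((constructor))
static void before_main(void)
{
    hook_ran = 1;
}

/* Hypothetical helper: reports whether the hook has already run. */
int startup_hook_ran(void)
{
    return hook_ran;
}
```

If you add a `main` that checks `startup_hook_ran()`, it should already see 1 under those compilers, even though `main` never called `before_main` itself. The code that makes that call is part of the startup machinery that gets linked into the executable alongside your own function.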

gstukelj