GCC: Empty program == 23202 bytes?

Question

test.c:

int main()
{
    return 0;
}

I haven't used any flags (I am a newb to gcc), just the command:

gcc test.c

I have used the latest TDM build of GCC on win32. The resulting executable is almost 23KB, way too big for an empty program.

How can I reduce the size of the executable?

One suggestion: Do you get the same results using the minGW build of GCC? I'm not sure if that size is unusual or not, as I'm not very used to C++ either. — Macha, Aug 22 '09 at 13:00
Yeah, I know UPX, but the problem here is this: the compiler shouldn't generate ~23KB of junk for an empty program. — George, Aug 22 '09 at 13:05
Oh, I love C++ too :), too bad my company forces me to use C#. — George, Aug 22 '09 at 13:23
How exactly is 23KB "way too big"? Do you need to print out the executable or something? How exactly is this a problem? Since the program is basically empty, these 23KB are effectively a one-time cost. It doesn't mean that a slightly larger program will take up 46KB. Assuming your program grows to, say, 5MB, why would you even *care* about reducing the size by 23KB? — jalf, Aug 22 '09 at 16:15
1. Because I am interested in fine-tuning the compilation process, 2. Because I knew the normal overhead should not be ~23KB, 3. Because I can. — George, Aug 22 '09 at 16:22
Eh - if the answer is "Because I can", why are you asking the question how? Shouldn't it be "Because I want to" ? — MSalters, Aug 24 '09 at 08:17
I don't understand all the negative backlash to this question. Even if it's not relevant to your applications, it might be relevant to others, especially embedded systems developers. And even if you decide to do nothing about this padded code, it can still be instructive to understand why it's there. While it's for a different environment/toolset, I recommend this article: http://msdn.microsoft.com/en-us/magazine/cc301696.aspx — Adrian McCarthy, Aug 25 '09 at 23:04

score 39 · Accepted Answer · answered Aug 22 '09 at 13:48

Don't follow its suggestions, but for amusement's sake, read this 'story' about making the smallest possible ELF binary.

Phil Miller

Shit, this wasn't supposed to be taken seriously. It's now the most up-voted answer I've given! – Phil Miller Aug 23 '09 at 01:44
The article linked in http://stackoverflow.com/questions/553029/what-is-the-smallest-possible-windows-pe-executable is also interesting. – bk1e Aug 23 '09 at 16:13
@Novelocrat Yeah, I upvoted because the link you posted was very interesting, not because I think the OP should do anything like this. I *hope* most of the other upvotes were for the same reason. – Tyler McHenry Aug 24 '09 at 15:54
Finally, another answer of mine has outpaced this. Now I needn't be so ashamed. – Phil Miller Jun 17 '13 at 19:13

score 21 · Answer 2 · answered Aug 22 '09 at 13:08

How can I reduce its size?

Don't do it. You're just wasting your time.
Use the -s flag to strip symbols (gcc -s).

maykeye

Kirill V. Lyadvinsky · Answer 3 · 2009-08-22T13:54:04.697

12

By default, some standard libraries (e.g. the C runtime) are linked with your executable. Check out the options -nostdlib, -nostartfiles and -nodefaultlibs for details. The link options are described in the GCC manual.

For a real program, a second option is to try the optimization options, e.g. -Os (optimize for size).

edited Aug 22 '09 at 13:54

answered Aug 22 '09 at 13:08

Kirill V. Lyadvinsky

That's right. I've only used these options for embedded systems. – Kirill V. Lyadvinsky Aug 22 '09 at 13:14
What do you recommend to start with? (I am new to GCC, but I have used C a lot in VisualCpp before) – George Aug 22 '09 at 13:17
If you're familiar with C, a good place to start is learning the differences between gcc and VisualCpp. – Kirill V. Lyadvinsky Aug 22 '09 at 13:20
Exactly, Kristof. It's rather pointless. Learning how to make empty programs as small as possible doesn't necessarily translate into knowledge of how to make non-trivial programs small. All you're left with is a bunch of empty programs. Focus on getting something *worth* fine-tuning, first. – Rob Kennedy Aug 22 '09 at 16:33

score 12 · Answer 4 · answered Aug 22 '09 at 18:59

Give up. On x86 Linux, gcc 4.3.2 produces a 5K binary. But wait! That's with dynamic linking! The statically linked binary is over half a meg: 516K. Relax and learn to live with the bloat.

And they said Modula-3 would never go anywhere because of a 200K hello world binary!

In case you wonder what's going on, the GNU C library is structured so as to include certain features whether your program depends on them or not. These features include such trivia as malloc and free, dlopen, some string processing, and a whole bucketload of stuff that appears to have to do with locales and internationalization, although I can't find any relevant man pages.

Creating small executables for programs that require minimum services is not a design goal for glibc. To be fair, it has also not been a design goal for any run-time system I've ever worked with (about half a dozen).

Wim ten Brink · Answer 5 · 2009-08-24T22:03:25.960

Actually, if your code does nothing, is it even fair that the compiler still creates an executable? ;-)

Well, on Windows any executable would still have a size, although it can be reasonably small. With the old MS-DOS system, a complete do-nothing application would be just a couple of bytes. (I think four bytes to use the 21h interrupt to close the program.) Then again, those applications were loaded straight into memory. When the EXE format became more popular, things changed a bit. Now executables had additional information about the process itself, like the relocation of code and data segments plus some checksums and version information. The introduction of Windows added another header to the format, to tell MS-DOS that it couldn't execute the executable since it needed to run under Windows. And Windows would recognize it without problems. Of course, the executable format was also extended with resource information, like bitmaps, icons and dialog forms and much, much more.
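To make the "four bytes" concrete, here is a small sketch of what such a do-nothing DOS program could look like as raw machine code, assuming a headerless .COM image that exits through function 4Ch of interrupt 21h. The array and function names are made up for illustration, and the exit code is whatever happens to be in AL.

```c
#include <stddef.h>

/* Hypothetical illustration: the complete image of a do-nothing
   16-bit DOS .COM program that exits through interrupt 21h. */
static const unsigned char dos_exit_program[] = {
    0xB4, 0x4C,   /* mov ah, 4Ch  ; DOS function 4Ch = terminate process */
    0xCD, 0x21    /* int 21h      ; call DOS */
};

/* Size in bytes of the whole program (a .COM file has no header). */
size_t dos_exit_program_size(void)
{
    return sizeof dos_exit_program;
}
```

Writing these bytes to a file with a .COM extension would be the entire executable; there is nothing else in it for a loader to parse.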

A do-nothing executable would nowadays be between 4 and 8 kilobytes in size, depending on your compiler and every method you've used to reduce its size. It would be at a size where UPX would actually result in bigger executables! Additional bytes in your executable might be added because you added certain libraries to your code. Especially libraries with initialized data or resources will add a considerable amount of bytes. Adding debug information also increases the size of the executable.

But while this all makes a nice exercise at reducing size, you could wonder if it's practical to just continue to worry about bloatedness of applications. Modern hard disks will divide files up in segments and for really large disks, the difference would be very small. However, the amount of trouble it would take to keep the size as small as possible will slow down development speed, unless you're an expert developer who is used to these optimizations. These kinds of optimizations don't tend to improve performance and considering the average disk space of most systems, I don't see why it would be practical. (Still, I do optimize my own code in similar ways but then again, I am experienced with these optimizations.)

Interested in the EXE header? It starts with the letters MZ, for "Mark Zbikowski". The first part is the old-style MS-DOS header for executables and is used as a stub to tell MS-DOS that the program is not an MS-DOS executable. (In the binary, you can find the text 'This program cannot be run in DOS mode.', which is basically all it does: displaying that message.) Next is the PE header, which Windows will recognise and use instead of the MS-DOS header. It starts with the letters PE for Portable Executable. After this second header there will be the executable itself, divided into several blocks of code and data. The header contains special relocation tables which tell the OS where to load a specific block. And if you can keep this to a limit, the final executable can be smaller than 4 KB, but 90% would then be header information and no functionality.
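If you want to see the two signatures for yourself, a rough sketch of a checker might look like this. It assumes the usual layout where the DOS header stores the file offset of the PE signature as a 32-bit little-endian value at offset 0x3C; the function name is made up, and it checks only the two signatures, not the rest of either header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offset of the DOS header field that holds the file offset of the
   PE signature (32-bit, little-endian). */
#define E_LFANEW_OFFSET 0x3C

/* Returns 1 if buf starts with "MZ" and the offset stored in the DOS
   header points at "PE\0\0" inside the buffer, 0 otherwise. */
int looks_like_pe(const unsigned char *buf, size_t len)
{
    uint32_t pe_offset;

    if (buf == NULL || len < E_LFANEW_OFFSET + 4)
        return 0;
    if (buf[0] != 'M' || buf[1] != 'Z')
        return 0;

    /* Assemble the offset byte by byte so host endianness does not matter. */
    pe_offset = (uint32_t)buf[E_LFANEW_OFFSET]
              | ((uint32_t)buf[E_LFANEW_OFFSET + 1] << 8)
              | ((uint32_t)buf[E_LFANEW_OFFSET + 2] << 16)
              | ((uint32_t)buf[E_LFANEW_OFFSET + 3] << 24);

    /* The four signature bytes must lie completely inside the buffer. */
    if (pe_offset > len || len - pe_offset < 4)
        return 0;
    return memcmp(buf + pe_offset, "PE\0\0", 4) == 0;
}
```

Reading the first few hundred bytes of an .exe into a buffer and passing them to looks_like_pe() is enough to try it; everything between the two signatures is the DOS stub described above.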

As for a DOS application, a simple ret will do. That is, 1 byte. — Bruno Reis, Aug 22 '09 at 14:45
A ret would do, but the official rule was that you had to call the "Exit" interrupt. — Wim ten Brink, Aug 22 '09 at 15:29
I've built real Windows executables (PE format) that do useful things in <4KB, using VS2005. So a do-nothing executable certainly doesn't have to be 8KB. (Why? Autorun checker for a CD, don't start a large installer EXE if app is already installed) — MSalters, Aug 24 '09 at 08:22
The code does not do nothing - it returns zero to the environment. — Jonathan Leffler, Aug 25 '09 at 00:45

score 3 · Answer 6 · answered Aug 22 '09 at 13:48

I like the way the DJGPP FAQ addressed this many, many years ago:

In general, judging code sizes by looking at the size of "Hello" programs is meaningless, because such programs consist mostly of the startup code. ... Most of the power of all these features goes wasted in "Hello" programs. There is no point in running all that code just to print a 15-byte string and exit.

Sinan Ünür

The whole point of the empty program is to see the overhead. I'm simply interested in how the compilation works, and what ends up in a compiled binary aside from the code I put there. – George Aug 22 '09 at 14:45
Richard, that's not at all what you asked in your question. You asked how to get rid of the overhead. You didn't ask what the overhead consisted of. – Rob Kennedy Aug 22 '09 at 16:40

score 2 · Answer 7 · answered Aug 22 '09 at 13:46

What is the purpose of this exercise?

Even with as low-level a language as C, there's still a lot of setup that has to happen before main can be called. Some of that setup is handled by the loader (which needs certain information), some is handled by the code that calls main. And then there's probably a little bit of library code that any normal program would have to have. At the least, there are probably references to the standard libraries, if they are in DLLs.
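One way to observe that code really does run before main is the constructor attribute, a GCC/Clang extension (not standard C). This is only a sketch with made-up names: the startup code calls the marked function on its way to main, so the flag is already set by the time your own code starts.

```c
/* Set by the hook below; static storage, so it starts out as 0. */
static int startup_hook_ran = 0;

/* GCC/Clang extension: a function marked "constructor" is called by the
   startup code before main() begins. */
__attribute__((constructor))
static void before_main(void)
{
    startup_hook_ran = 1;
}

/* Reports whether the hook above has already run. */
int startup_hook_has_run(void)
{
    return startup_hook_ran;
}
```

Called as the first statement of main, startup_hook_has_run() should already return 1, even though nothing in main ever called before_main().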

Examining the binary size of the empty program is a worthless exercise in and of itself. It tells you nothing. If you want to learn something about code size, try writing non-empty (and preferably non-trivial) programs. Compare programs that use standard libraries with programs that do everything themselves.

If you really want to know what's going on in that binary (and why it's so big), then find out the executable format, get a binary dump tool, and take the thing apart.

Michael Kohne

Given that you don't know the OP's motivations, that's simply not true. He might be interested in getting into embedded development, where code size matters a lot, for instance. – Phil Miller Aug 22 '09 at 14:06
Code size of the empty program is still completely irrelevant. And if he's into embedded programming where size of program matters, then anything he does fooling with a Windows compiler is irrelevant. – Michael Kohne Aug 22 '09 at 14:57
Code size of an empty program is not irrelevant when 1. you code demos, 2. you are interested in how the compilation works and what ends up in the final executable, 3. and finally when you know that an empty program should not be ~23KB. There might be no obvious uses of something like this, but it doesn't make learning about the compiler flags irrelevant. – George Aug 22 '09 at 16:27
Richard, why do you code empty programs as demos? And if you don't know what's in the final executable, then how do you know it shouldn't be 23 K? And if you haven't learned why it was 23 K, then perhaps it's because you never asked. – Rob Kennedy Aug 22 '09 at 16:44

score 2 · Answer 8 · answered Aug 24 '09 at 22:28

What does 'size a.out' tell you about the size of the code, data, and bss segments? The majority of the code is likely to be the startup code (classically crt0.o on Unix machines) which is invoked by the o/s and does setup work (like sorting out command line arguments into argc, argv) before invoking main().
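For reading the output of size, it helps to know which kind of object usually ends up in which segment. The following is a sketch with made-up names; exact placement depends on the compiler and platform. Compiling it with gcc -c and running size on the resulting object file shows the three numbers for just this code, without any startup code.

```c
/* data: the initial value 42 has to be stored in the file */
int initialized_counter = 42;

/* bss: only the size is recorded, the zeros themselves are not stored */
int zeroed_table[1024];

/* read-only data; where `size` counts it varies by platform */
const char banner[] = "hello";

/* text: the machine code of this function */
int next_count(void)
{
    return ++initialized_counter;
}

/* Sums the table. At program start the result is 0, because bss is
   cleared by the loader or the startup code before main() runs. */
long sum_zeroed_table(void)
{
    long sum = 0;
    for (int i = 0; i < 1024; i++)
        sum += zeroed_table[i];
    return sum;
}
```

Giving zeroed_table a non-zero initializer should move its bytes from the bss column to the data column, which makes for an easy experiment.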

Gerhard · Answer 9 · 2009-08-24T08:08:41.983

Run strip on the binary to get rid of the symbols. With gcc version 3.4.4 (cygming special) I drop from 10K to 4K.

You can try linking a custom runtime (the part that calls main) to set up your runtime environment. All programs use the same one that comes with gcc to set up the runtime environment, but for your executable you don't need data or zeroed memory. That means you could get rid of unused library functions like memset/memcpy and reduce the CRT0 size. When looking for info on this, look at GCC in embedded environments. Embedded developers are generally the only people who use custom runtime environments.

The rest is overhead for the OS that loads the executable. You are not going to save much there unless you tune that by hand.
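As a rough picture of what such a runtime does before calling main, here is a sketch of a minimal startup routine that runs on an ordinary host. On a real embedded target the addresses would come from linker-script symbols; here they are plain pointers in a made-up struct so the steps can be tried anywhere. None of these names come from gcc's actual crt0.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical description of the memory a startup routine prepares. */
struct startup_layout {
    const unsigned char *data_load;  /* initial values as stored in the image */
    unsigned char *data_run;         /* where initialized variables live at run time */
    size_t data_size;
    unsigned char *bss;              /* variables that must start out as zero */
    size_t bss_size;
};

/* Stand-in for the empty program from the question. */
int empty_program(void)
{
    return 0;
}

/* Copies the initialized data, clears bss, then calls the entry
   function and returns its result. */
int run_startup(const struct startup_layout *layout, int (*entry)(void))
{
    memcpy(layout->data_run, layout->data_load, layout->data_size);
    memset(layout->bss, 0, layout->bss_size);
    return entry();
}
```

For a program with no initialized data and no zeroed variables, both sizes are 0 and the memcpy/memset calls do nothing, which is why a custom runtime for such a program could leave them out.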

score 0 · Answer 10 · answered Aug 22 '09 at 14:10

Using GCC, compile your program using -Os rather than one of the other optimization flags (-O2 or -O3). This tells it to optimize for size rather than speed. Incidentally, it can sometimes make programs run faster than the speed optimizations would have, if some critical segment happens to fit more nicely. On the other hand, -O3 can actually induce code-size increases.

There might also be some linker flags telling it to leave out unused code from the final binary.
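With a GNU toolchain, the linker flag in question is probably --gc-sections, combined with the compiler flags -ffunction-sections and -fdata-sections so that every function and variable gets a section of its own. A sketch of a source file where that could matter is below; the file and function names are hypothetical, and how much is actually discarded depends on the target and linker.

```c
/*
 * Possible build, assuming a GNU toolchain and a separate main.c:
 *   gcc -Os -ffunction-sections -fdata-sections -c helpers.c main.c
 *   gcc -Wl,--gc-sections main.o helpers.o -o app
 */

/* Called by the program, so the linker has to keep it. */
int used_helper(int x)
{
    return x * 2;
}

/* Never called: with the flags above, the linker may discard the
   section that holds this function. */
int unused_helper(int x)
{
    return x * 3;
}
```

Without -ffunction-sections both functions normally share one .text section in helpers.o, so the linker has to keep or drop them together.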

Unsurprising, in this case. There's not much code that GCC is actually touching here. — Phil Miller, Aug 23 '09 at 01:46
