
I'd like to make a very small compiled exe, written in C, but the smallest I can manage to get is 67KB. I'm using MinGW. I've tried not using any header files, and this compiles with no errors:

//no header
void main() {
 write(1, "Hello world!", 12);
}

GCC reports no errors when I build and run this, but the result is still 67KB.

alk
Ádám Bozzay
  • Is this the size before or after running `strip`? – fghj Nov 08 '15 at 18:36
  • `void main` is wrong. – melpomene Nov 08 '15 at 18:38
  • What do you find in the map file? – harper Nov 08 '15 at 18:39
  • You may want to have a look at this interesting article: http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html This guy cuts down an `a.out` for `int main(void) { return 42; }` to 45 bytes :-) – alk Nov 08 '15 at 18:39
  • http://stackoverflow.com/questions/15314581/g-compiler-flag-to-minimize-binary-size and http://stackoverflow.com/questions/6771905/how-to-decrease-the-size-of-generated-binaries – Karoly Horvath Nov 08 '15 at 18:40
  • Is that the file size or the size of the loaded code? You link the standard library. And enable compiler warnings; you do not declare `write`. – too honest for this site Nov 08 '15 at 18:42
  • I'm a curious beginner, and I don't know what `strip` is. I set up Notepad++ to build with GCC, so I only get an executable. Thanks for the article, I'll read it. Edit: This is the file size of the executable. – Ádám Bozzay Nov 08 '15 at 18:44
  • Find out where the `gcc` binary is on your system; in the same directory you will find `strip.exe`. Use it to remove unused symbols. – fghj Nov 08 '15 at 18:48
  • The compiler will always put some bootstrapping and error handling code into your executable. – Marged Nov 08 '15 at 18:51
  • Try `int main(void) { return 0; }` – David Heffernan Nov 08 '15 at 18:53
  • Just tried -Os and strip, now the file size is 41KB. That's something, but not too much. – Ádám Bozzay Nov 08 '15 at 18:54
  • Try this: `gcc -Os -fdata-sections -ffunction-sections -fipa-pta test.c -Wl,--gc-sections -Wl,-O1 -Wl,--as-needed -Wl,--strip-all`. What size does it give? – fghj Nov 08 '15 at 18:59
  • Try this `int main;` under gcc 5.1.1 compiles flawlessly ;) Size: 8456, GNU/Linux 64-bit – jaroslawj Nov 08 '15 at 19:35
  • I'm using Windows 7, MinGW. `int main;` `int main(void) { return 0; }` These are also 41KB with -Os -s. The long parameter list by user1034749 doesn't make it smaller. – Ádám Bozzay Nov 08 '15 at 19:50
  • The gcc / GNU ld developers don't put much time into curiosities... they focus their development time on correct and fast output. By all means submit a gcc patch with a new switch for small "hello world" if you want – M.M Nov 08 '15 at 20:48
  • Possible duplicate of [Reducing GCC target EXE code size?](http://stackoverflow.com/questions/8547233/reducing-gcc-target-exe-code-size) – ams Nov 10 '15 at 11:24

2 Answers


I just tried this in x86_64 Linux, which probably isn't much different to MinGW at this level, although you never know.

Basically, the problem is that, even though nothing gets pulled in from the C library unless it's referenced, the CRT "startfiles" do reference a small selection of things, which in turn reference some other things, and "Hello world" ends up looking bad. This is not a problem worth fixing because all real programs would reference those core functions anyway.

The source for the start files is available, and quite small, and the compiler allows you to override the standard ones if you choose to, so optimizing them is not a massive deal. They're written in assembler code, but you can probably remove most of the extraneous garbage by simply deleting lines.

But, there's a hack for cutting the start-files out of the equation altogether:

#include <unistd.h>

void _start (void) {
  write(1,"Hello world!", 12);
  _exit(0);
}

Compile: gcc -nostartfiles t.c -s -static

This works (by chance; see below) and gives me a file size of 1792 bytes.

For comparison, your original code gives 738624 bytes with the same compiler, which drops to 4400 bytes when I remove -static, but then that's cheating! (My code actually gets larger without -static, because the dynamic linker metadata outweighs the code of write and _exit.)

The "by chance" part is that the program now has no initialized stack pointer, and likewise none of the other global state the start files usually take care of. As it happens, on x86_64 Linux this isn't a fatal problem (just don't do it in production, right?). However, when I tried it with -m32, I got a segmentation fault inside write.

The problem can be fixed by adding your own initialization for that stuff, but then the code would be even less portable (it isn't fully portable as it is). Alternatively, call the write system call directly.
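As a rough sketch of that last option, assuming x86_64 Linux and GCC-style inline assembly: the helper names raw_write and raw_exit and the TINY_NOSTARTFILES macro are made up for this illustration and are not part of any library. On other targets the sketch falls back to libc's syscall() wrapper, which does pull in libc code.

```c
#include <stddef.h>
#include <sys/syscall.h>   /* SYS_write and SYS_exit: constants only */
#include <unistd.h>        /* syscall() prototype, used only by the fallback */

/* Hypothetical helper: write(2) without going through libc's write(). */
long raw_write(int fd, const void *buf, size_t len) {
#if defined(__x86_64__) && defined(__linux__)
    long ret;
    /* x86_64 Linux ABI: call number in rax, arguments in rdi, rsi, rdx;
       the syscall instruction clobbers rcx and r11. */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "0"((long)SYS_write), "D"((long)fd), "S"(buf), "d"(len)
                      : "rcx", "r11", "memory");
    return ret;   /* byte count, or a negative errno value on failure */
#else
    return syscall(SYS_write, fd, buf, len);   /* other targets: libc wrapper */
#endif
}

/* Hypothetical helper: terminate the process with the given status. */
void raw_exit(int status) {
#if defined(__x86_64__) && defined(__linux__)
    __asm__ volatile ("syscall"
                      :
                      : "a"((long)SYS_exit), "D"((long)status)
                      : "rcx", "r11", "memory");
#else
    syscall(SYS_exit, status);
#endif
    __builtin_unreachable();   /* the exit system call does not return */
}

/* Only compiled when the start files are left out, so that a normal
   build does not end up with two definitions of _start. */
#ifdef TINY_NOSTARTFILES
void _start(void) {
    raw_write(1, "Hello world!", 12);
    raw_exit(0);
}
#endif
```

Built with something like gcc -DTINY_NOSTARTFILES -nostartfiles -nostdlib -static -Os -s tiny.c (tiny.c being a placeholder file name), the x86_64 path should need nothing from libc. The resulting size will depend on the toolchain.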

ams
  • Stack pointer not initialized? ELF files don't specify stack size or location AFAIK, so there must be a default stack set up by the system/loader. It must be something specific to gcc/glibc (something to do with extra initialization, red zone or alignment). My compiler produces executable 32-bit ELF files, which work on Ubuntu just fine without any special handling of the stack (other than accessing argc/argv; and brk works too) and invoking system calls directly (I have my own static library). – Alexey Frunze Dec 01 '15 at 11:51
  • Yes, the CRT start files are responsible for setting the stack pointer (at least, on some OS/arch combinations), zeroing the BSS (again, not necessary everywhere), calling C++ static initializers, setting up atexit handlers, and whatever else the OS doesn't do. If you bypass them as I described then you don't get any of that, and whether it works or not depends on the virtual memory layout/accessibility and maybe a little luck. – ams Dec 01 '15 at 12:01
  • Oh, and it's the CRT startfiles that call `main`. Mustn't forget to do that. ;-) – ams Dec 01 '15 at 12:02
  • If you don't use `-static` then the dynamic linker gets run first, so the rules are a bit different, but the CRT files still do some stuff and call `main`. – ams Dec 01 '15 at 12:05

I know this is an old question, but I had the same issue. The large size is also a result of RELRO being enabled by default and of the default max-page-size of 64K.

Hello world compiled with gcc -Wl,-z,max-page-size=0x1000 -s -Wl,-z,norelro main.c && sstrip -z a.out results in a 2K binary.

A file containing only an empty _start function, compiled with gcc -nostartfiles start.c -Wl,-z,max-page-size=0x1000,-z,norelro && sstrip -z a.out, results in a 164-byte binary.

After some experimenting, I wrote the same program in a smaller form:

#include <unistd.h>
#include <sys/syscall.h>
static const char str[] = "Hello world!";
void _start(){
syscall(SYS_write, 1, str, 12);
syscall(SYS_exit, 0);
}

With gcc -nostartfiles start.c -Wl,-z,max-page-size=0x1000,-z,norelro -static -Os && sstrip -z a.out, the resulting binary is 353 bytes on ARM. Adding -mthumb brings it down to 349 bytes.

If you go full assembly, you get a 144-byte executable. Use the same command line, but with a .S file instead.

#include <sys/syscall.h>

.global _start
_start:
mov r7, $SYS_write
mov r0, $1
add r1, pc, $(hw - . - 8)
mov r2, $(end - hw)
svc #0
mov r7, $SYS_exit
svc #0

hw:
.ascii "Hello world!\n"
end:
.align 4
uis