I have two C source files

foo1.c:

#include <stdlib.h>
#include <stdio.h>

int main(void)
{
    puts("hello world");
    return 0;
}

and foo2.c:

#include <stdlib.h>
#include <stdio.h>

void _start(void)
{
    puts("hello world");
    exit(0);
}

and I compile them like so on my i386 GNU/Linux platform:

$ #compile foo1
$ cc -o foo1 foo1.c
$ #compile foo2
$ cc -S foo2.c
$ as -o foo2.o foo2.s
$ ld -o foo2 -dynamic-linker /lib/i386-linux-gnu/ld-linux.so.2 -lc foo2.o
$ #notice that crt1.o and others are missing

The resulting executables behave identically from a user's perspective:

$ ./foo1
hello world
$ ./foo2
hello world

But they are different:

$ wc -c foo1
5000
$ wc -c foo2
2208
$ objdump -d foo1 | wc -l
238
$ objdump -d foo2 | wc -l
35

Even when I enable gcc's -Os option to optimize size,

$ #compile foo1
$ gcc -o foo1 foo1.c -Os

it is not much smaller:

$ wc -c foo1
4908
$ objdump -d foo1 | wc -l
229

Is there any way to get GCC to optimize out the parts of crt1.o and friends which I suspect contribute to this bloated filesize without resorting to nonstandard code and weird (and likely harmful in some cases) compilation? My GCC's version string is "gcc (Debian 4.9.2-10) 4.9.2".

nebuch
  • Do you have a motivation for wanting to reduce the file size, or is this just purely out of curiosity? – templatetypedef Aug 09 '16 at 21:04
  • Take a look [here](http://stackoverflow.com/questions/6687630/how-to-remove-unused-c-c-symbols-with-gcc-and-ld) – Eugene Sh. Aug 09 '16 at 21:05
  • @EugeneSh. The symbols in the crt _are_ used, but they don't do anything useful in my example case. – nebuch Aug 09 '16 at 21:05
  • If they don't do anything useful - they are not used. – Eugene Sh. Aug 09 '16 at 21:06
  • And you can always compile without the standard startup code `-nostartfiles`. But then you will have the hassle to replace it. – Eugene Sh. Aug 09 '16 at 21:08
  • @EugeneSh. objdump reveals that code to enable gprof profiling does run at startup, but I will never be using gprof with this binary; in this case useless code is used. – nebuch Aug 09 '16 at 21:13
  • If it can't be removed with the linker as unused, you have no choice but to use different startup code. – Eugene Sh. Aug 09 '16 at 21:18
  • Doesn't work with `gcc-4.8.real (Ubuntu 4.8.5-2ubuntu1~14.04.1) 4.8.5`; it gives `bash: ./foo2: Accessing a corrupted shared library`. I needed to replace `/lib/i386-linux-gnu/ld-linux.so.2` in the linker command with `/lib64/ld-linux-x86-64.so.2`, although the first one links to the second one. Just information in case somebody wants to repeat it with an older version. – deamentiaemundi Aug 09 '16 at 21:21

1 Answer

With gcc/clang you can use -nostartfiles, but the C library you are using may depend on its own _start() implementation for dynamic linking. Since you are on Linux, I would recommend a static build against musl libc.

Alternatively, you could avoid the C library altogether: implement the write() and exit() system calls yourself, add the '\n' to your string (puts() appends it for you, write() does not), and use _start() instead of main(). If you need access to argc, argv and envp, you will need some inline assembly to read them from the stack (Linux passes these on the stack for all ELF binaries regardless of the architecture).
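
As a rough illustration of the libc-free approach, here is a minimal sketch. It assumes GCC or Clang on Linux and covers only x86-64 and i386. The names demo_syscall3, demo_write, demo_exit, the DEMO_NR_* constants, and the NOLIBC_DEMO_START macro are made up for this example and are not part of any library. It does not handle argc, argv or envp.

```c
/* Sketch only: raw Linux system calls with no C library.
 * All demo_* / DEMO_* / NOLIBC_DEMO_START names are hypothetical. */

#if defined(__x86_64__)
#define DEMO_NR_WRITE 1L   /* write(2) on x86-64 */
#define DEMO_NR_EXIT  60L  /* exit(2) on x86-64 */
long demo_syscall3(long nr, long a, long b, long c)
{
    long ret;
    /* The syscall instruction clobbers rcx and r11. */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(nr), "D"(a), "S"(b), "d"(c)
                      : "rcx", "r11", "memory");
    return ret;
}
#elif defined(__i386__)
#define DEMO_NR_WRITE 4L   /* write(2) on i386 */
#define DEMO_NR_EXIT  1L   /* exit(2) on i386 */
long demo_syscall3(long nr, long a, long b, long c)
{
    long ret;
    __asm__ volatile ("int $0x80"
                      : "=a"(ret)
                      : "a"(nr), "b"(a), "c"(b), "d"(c)
                      : "memory");
    return ret;
}
#else
#error "this sketch only covers x86-64 and i386 Linux"
#endif

/* Returns the number of bytes written, or a negative errno value. */
long demo_write(int fd, const void *buf, unsigned long len)
{
    return demo_syscall3(DEMO_NR_WRITE, fd, (long)buf, (long)len);
}

void demo_exit(int status)
{
    demo_syscall3(DEMO_NR_EXIT, status, 0, 0);
    for (;;) { }  /* not reached: the kernel ends the process */
}

#ifdef NOLIBC_DEMO_START
/* Entry point used only when no startup files are linked in.
 * force_align_arg_pointer realigns the stack, since _start is
 * entered by the kernel rather than through a normal call. */
__attribute__((force_align_arg_pointer))
void _start(void)
{
    static const char msg[] = "hello world\n";
    demo_write(1, msg, sizeof msg - 1);
    demo_exit(0);  /* _start must never return */
}
#endif
```

Saved as, say, foo3.c (a hypothetical file name), this should build with something like `cc -static -nostdlib -DNOLIBC_DEMO_START -o foo3 foo3.c`. The macro guard is only there so the same file can also be compiled into an ordinary program that already has its own _start from crt1.o.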

technosaurus
  • re: args: [How Get arguments value using inline assembly in C without Glibc?](https://stackoverflow.com/q/50260855) – Peter Cordes Apr 15 '21 at 21:22