
I was trying to make a tiny hello world executable in Linux but the smallest I could get was a bit over 8k.

I have been following this article, in which the author iteratively makes smaller and smaller executables. At one point, it provides the following NASM code:

; tiny.asm
BITS 32
GLOBAL _start
SECTION .text
_start:
    mov     eax, 1
    mov     ebx, 42  
    int     0x80

and these should be the commands to build it:

nasm -f elf tiny.asm
gcc -Wall -s -nostdlib tiny.o

According to the author, the executable size at that point should be 372 bytes. However, for me it's more than 12 KB. The only thing I did differently was adding -m32 to gcc so that it compiles as 32-bit.

I managed to get it down to a bit over 8 KB by linking with ld instead of gcc and then running strip.

ld -m elf_i386 -s tiny.o
strip -s a.out

But it's still too large compared to the result in the article. Am I missing anything? I guess the article is quite old, since it compiles for 32 bits by default. Could it be that newer Linux versions require bigger executables?

EDIT: code for my hello world program:

SECTION .data
msg: db "hello",10
msgLen: equ $ - msg ; the $ sign means the current byte address. That means the address where the next byte would go

SECTION .text

global _start ; "global" means that the symbol can be accessed in other modules. In order to refer to a global symbol from another module, you must use the "extern" keyword
_start:
    mov eax, 4 ; syscall: write
    mov ebx, 1 ; stdout
    mov ecx, msg
    mov edx, msgLen
    int 0x80 ; call!

    mov eax, 1 ; syscall: exit
    mov ebx, 0 ; return code
    int 0x80
tuket
  • Check this other Q/A: [what is segment 00 in my Linux executable program (64 bits)](https://stackoverflow.com/questions/65167620/what-is-segment-00-in-my-linux-executable-program-64-bits). It's probably what you are experiencing: alignment. – Margaret Bloom Dec 27 '20 at 22:12
  • @MargaretBloom Added `-m` to ld and now the executable size is down to 245 bytes :D However, my hello world program is still quite big (4 KB) using the same commands. Could it be because there is a `.data` section? (I've just added the code to the question) – tuket Dec 27 '20 at 22:26
  • [Why do my results different following along the tiny asm example?](https://stackoverflow.com/q/65461235) is an exact duplicate of the first part: replicating the tinyelf tutorial on modern GNU/Linux with Binutils' newer default linker script, etc. The 2nd part should be covered adequately by the other duplicates. – Peter Cordes Dec 27 '20 at 23:42
  • @PeterCordes Thanks! What a coincidence that such a similar question was asked just yesterday. I tried using `-nmagic` but it's still more than 4KB – tuket Dec 28 '20 at 00:05
  • Your program with a `section .data` is more than 4k? Hmm, yeah I can reproduce that. It's of course mostly zeros, so maybe it's still padding to make `.data` separately mappable from `.text`, so they don't show up in each other's pages (see my "10x bigger" answer), even though the section alignment is only 16 and 4 (`readelf -S hello`). Putting that read-only string data at the end of your `.text`, so there is no `.data`, does make it smaller, like 450 bytes from `ld -melf_i386 -n -o hello hello.o`. – Peter Cordes Dec 28 '20 at 00:31
  • @PeterCordes I didn't know you could have strings in the .text section. Indeed, the executable is only 276 bytes now! Thank you! – tuket Dec 28 '20 at 01:05
  • Everything is just bytes in memory. The assembler will just assemble source lines into bytes at the current position in the current section. e.g. `db 0x90` and `nop` are completely equivalent anywhere. The .text section gets linked into a place where it will be mapped read + exec (without write), so the only difference from `.rodata` is that it also has exec permission. (Older Binutils `ld` used to link `.rodata` into the same program segment as `.text`, but now it uses a separate program segment so `.rodata` can be mapped without exec permission; no ROP / Spectre gadgets possible there) – Peter Cordes Dec 28 '20 at 01:22
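
For reference, the change discussed in the comments above (moving the string out of `.data` and placing it at the end of `.text`) might look roughly like the sketch below. This is a reconstruction based on the question's code, not the exact file either commenter built. The file name `hello.asm` and the use of `xor` to zero `ebx` are assumptions, and the resulting size will depend on the Binutils version and on whether the symbol table is stripped.

```nasm
; hello.asm (hypothetical file name): hello world with no .data section
; Assumed build commands, adapted from the ones quoted in the thread:
;   nasm -f elf hello.asm
;   ld -melf_i386 -n -s -o hello hello.o
BITS 32
GLOBAL _start
SECTION .text
_start:
    mov     eax, 4          ; syscall: write
    mov     ebx, 1          ; file descriptor: stdout
    mov     ecx, msg        ; pointer to the string
    mov     edx, msgLen     ; number of bytes to write
    int     0x80

    mov     eax, 1          ; syscall: exit
    xor     ebx, ebx        ; exit status 0 (assumption: same effect as mov ebx, 0)
    int     0x80

; The string sits after the exit syscall, so execution never reaches it.
; It is mapped read + exec like the rest of .text, so it must not be written to.
msg:    db  "hello", 10
msgLen: equ $ - msg
```

Running `readelf -S hello` and `readelf -l hello` on the output should show whether any padded, separately mapped data segment remains.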
