
I'm following along with https://www.muppetlabs.com/~breadbox/software/tiny/teensy.html, a guide on decreasing the size of ELF files. I find that my executable today, 20 years later, is ten times larger. I want to learn why.

The program under discussion is below. I've modified it slightly to be 64-bit (I also tested in 32-bit mode; the 64-bit build is no more than 40 bytes larger):

section .text
    bits 64
    global _start
_start:
    mov rax, 60
    mov rdi, 42
    syscall

This invokes the Linux 'exit' system call (number 60 in the x86-64 ABI) with exit code 42. Why? Because that's what the original author considered a very simple program.

Our build process:

nasm -f elf64 teensy.asm
ld -s teensy.o -o teensy
./teensy; echo $? # Echoes 42 as expected

However, it looks like file sizes have changed since the article was written:

$ wc -c a.out # 1999
368 a.out
$ wc -c teensy # 2022
4320 teensy

Why the difference? Looking at the output of objdump in 1999:

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000007  08048074  08048074  00000074  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .data         00000000  0804907c  0804907c  0000007c  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  2 .bss          00000000  0804907c  0804907c  0000007c  2**2
                  ALLOC

vs in 2022:

$ objdump -x teensy

teensy:     file format elf64-x86-64
teensy
architecture: i386:x86-64, flags 0x00000102:
EXEC_P, D_PAGED
start address 0x0000000000401000

Program Header:
    LOAD off    0x0000000000000000 vaddr 0x0000000000400000 paddr 0x0000000000400000 align 2**12
         filesz 0x00000000000000b0 memsz 0x00000000000000b0 flags r--
    LOAD off    0x0000000000001000 vaddr 0x0000000000401000 paddr 0x0000000000401000 align 2**12
         filesz 0x000000000000000c memsz 0x000000000000000c flags r-x

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .text         0000000c  0000000000401000  0000000000401000  00001000  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
SYMBOL TABLE:
no symbols

So the file offset of .text ("File off") has moved from 0x74 to 0x1000, which may account for most of the difference. But why?
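For what it's worth, the 4320 bytes can be roughly accounted for from the objdump output above. The following is a minimal Python sketch, not a statement of how ld actually lays out the file: it assumes that `ld -s` leaves three section headers (null, .text, .shstrtab), that the section-name table follows .text directly, and that the section header table is 8-byte aligned. The names `align_up` and `estimated_size` are mine.

```python
# Hypothetical breakdown of the 4320-byte file. Only TEXT_OFFSET and TEXT_SIZE
# come from the objdump output; everything else is an assumption.
TEXT_OFFSET = 0x1000                 # "File off" of .text (2022 objdump)
TEXT_SIZE = 0x0C                     # size of .text (2022 objdump)
SHSTRTAB = b"\0.shstrtab\0.text\0"   # assumed section-name string table
SHDR_SIZE = 64                       # size of one ELF64 section header
NUM_SHDRS = 3                        # assumed: null, .text, .shstrtab


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def estimated_size() -> int:
    """Estimate the file size from the layout assumed above."""
    end_of_text = TEXT_OFFSET + TEXT_SIZE
    end_of_names = end_of_text + len(SHSTRTAB)
    shdr_table = align_up(end_of_names, 8)   # assumed 8-byte alignment
    return shdr_table + NUM_SHDRS * SHDR_SIZE


if __name__ == "__main__":
    print(estimated_size())
```

Under these assumptions the estimate comes to 4320, the same number wc reports. The first LOAD segment only covers 0xb0 bytes, so almost everything between it and offset 0x1000 would be padding.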

Zachary Vance
  • Not sure when the default changed, but I can confirm that -n ("new magic") and -N ("old magic") both turn off page alignment and give something in line with the original sizes. – Zachary Vance Sep 15 '22 at 04:08
  • As you see, in the 2022 version the `.text` section is on its own page, which allows the ELF header to be mapped separately with non-executable permissions. I might speculate that this is a security feature: avoid mapping anything but code as executable, to minimize the number of "gadgets" available to exploit code. – Nate Eldredge Sep 15 '22 at 04:11
  • @NateEldredge: Minimizing possible ROP / Spectre gadgets is what I've always assumed was the primary reason for changes like this which keep bytes out of executable pages when they don't need to be there. – Peter Cordes Sep 15 '22 at 05:51
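To check the point made in the comments (headers and code living in separate LOAD segments with different permissions) without objdump, the program headers can be read directly. This is a minimal Python sketch assuming a little-endian ELF64 file; the function name `load_segments` and the command-line handling are mine, and only the standard `struct` module is used.

```python
import struct
import sys

PT_LOAD = 1                    # program header type for loadable segments
PF_X, PF_W, PF_R = 1, 2, 4     # segment permission bits


def load_segments(elf: bytes):
    """Return (file_offset, vaddr, filesz, flags) for each PT_LOAD segment
    of a little-endian ELF64 image."""
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    (e_phoff,) = struct.unpack_from("<Q", elf, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 0x36)
    segments = []
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        # Elf64_Phdr starts: p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz
        p_type, p_flags, p_offset, p_vaddr, _p_paddr, p_filesz = (
            struct.unpack_from("<IIQQQQ", elf, base)
        )
        if p_type != PT_LOAD:
            continue
        flags = (
            ("r" if p_flags & PF_R else "-")
            + ("w" if p_flags & PF_W else "-")
            + ("x" if p_flags & PF_X else "-")
        )
        segments.append((p_offset, p_vaddr, p_filesz, flags))
    return segments


if __name__ == "__main__":
    if len(sys.argv) == 2:
        with open(sys.argv[1], "rb") as f:
            for off, vaddr, filesz, flags in load_segments(f.read()):
                print(f"LOAD off {off:#x} vaddr {vaddr:#x} filesz {filesz:#x} flags {flags}")
    else:
        print("usage: python3 phdrs.py ELF_FILE")
```

Saved under a hypothetical name such as phdrs.py and run against both the default build and one linked with -n, it should show whether the executable segment starts at file offset 0x1000 or sits directly after the headers.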
