I tried to write the smallest possible x86_64 ELF hello world program by hand, but I receive a Segmentation fault when trying to run it.

gdb says: During startup program terminated with signal SIGSEGV, Segmentation fault.

Here is the hexdump:

00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000  .ELF............
00000010: 0200 3e00 0100 0000 7800 0000 0000 0000  ..>.....x.......
00000020: 4000 0000 0000 0000 0000 0000 0000 0000  @...............
00000030: 0000 0000 4000 3800 0100 0000 0000 0000  ....@.8.........
00000040: 0100 0000 0500 0000 0000 0000 0000 0000  ................
00000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000060: 3100 0000 0000 0000 3100 0000 0000 0000  1.......1.......
00000070: 0200 0000 0000 0000 b801 0000 00bf 0100  ................
00000080: 0000 be9a 0000 00ba 0a00 0000 0f05 b83c  ...............<
00000090: 0000 00bf 0000 0000 0f05 4865 6c6c 6f2c  ..........Hello,
000000a0: 2057 6f72 6c64 210a 00                    World!..

Here is the output of readelf -a:

ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x78
  Start of program headers:          64 (bytes into file)
  Start of section headers:          0 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         1
  Size of section headers:           0 (bytes)
  Number of section headers:         0
  Section header string table index: 0

There are no sections in this file.

There are no section groups in this file.

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000031 0x0000000000000031  R E    0x2

There is no dynamic section in this file.

There are no relocations in this file.
No processor specific unwind information to decode

Dynamic symbol information is not available for displaying symbols.

No version information found in this file.

And here is the code:

0xb8 0x01 0x00 0x00 0x00 /* mov %rax, 1 ; sys_write */
0xbf 0x01 0x00 0x00 0x00 /* mov %rdi, 1 ; STDOUT */
0xbe 0x9a 0x00 0x00 0x00 /* mov %rsi, 0x9a ; address of string */
0xba 0x0a 0x00 0x00 0x00 /* mov %rdi, 15 ; size of string */
0x0f 0x05                /* syscall */

0xb8 0x3c 0x00 0x00 0x00 /* mov %rax, 60 ; sys_exit */
0xbf 0x00 0x00 0x00 0x00 /* mov %rdi, 0 ; exit status */
0x0f 0x05                /* syscall */

The "Hello, World!\n" string follows immediately afterwards. I have been using this MOV instruction. Playing around with the program header offset, alignment and virtual address fields did not yield anything. The manpage is a little confusing in this section. I also tried comparing this binary to one written in assembly, but I've found nothing useful.

Now to my question: Can you tell me what the mistake is and/or how I can debug this binary?

  • Don't you have to have a section, in order for anything to be loaded? The elf header isn't loaded into your process address space. – prl Jul 10 '22 at 18:59
  • If the address of the entry point is 78, then the address of the string is 9a, not 22. – prl Jul 10 '22 at 19:01
  • >Don't you have to have a section, in order for anything to be loaded? According to the [specification](https://refspecs.linuxfoundation.org/elf/elf.pdf) they are optional. >If the address of the entry point is 78, then the address of the string is 9a, not 22. Ah thanks, I forgot to modify this value after playing round with the program header. Doesn't solve the Segfault though. – DieDummheitInPerson Jul 10 '22 at 19:04
  • Perhaps have a look at this for inspiration? https://www.muppetlabs.com/~breadbox/software/tiny/teensy.html – Mona the Monad Jul 10 '22 at 19:35
  • The full site has a lot of good info [Muppet Labs - The Teensy Files](https://www.muppetlabs.com/~breadbox/software/tiny/) – David C. Rankin Jul 10 '22 at 20:15
  • `mov %rax, 1` is a store to absolute address `1`, since `%` on a register name implies AT&T syntax. But actually, `b8 01 00 00 00` is `mov $1, %eax`, aka Intel-syntax `mov eax, 1`. Notice the lack of a REX prefix: it's definitely not 64-bit operand-size. It writes the full RAX via implicit zero-extension to 64-bit whenever a 32-bit register is written. It's an efficient way to set RAX=1 for speed, but not code-size: see [Tips for golfing in x86/x64 machine code](https://codegolf.stackexchange.com/a/132985) for 3-byte RAX=1 via `push 1` / `pop rax`. – Peter Cordes Jul 10 '22 at 22:24
  • Have you single-stepped it with a debugger? GDB on it, then `starti` to stop before the first user-space instruction / `layout asm` (/ `layout next` or prev to get a registers pane) / `stepi`. Or with `strace`? – Peter Cordes Jul 10 '22 at 22:30
  • oops, my bad, you were trying GDB, but it crashed during execve, after the point of no return (so it couldn't have execve return `-E...` in the old process). – Peter Cordes Jul 11 '22 at 04:07
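
The encoding point from the comments can be checked mechanically. Here is a minimal Python sketch, assuming only the prefix-free `B8+r imm32` form of `mov` (no REX prefix, so always a 32-bit destination); the helper name `decode_mov_imm32` is made up for illustration:

```python
# 32-bit register names in x86 encoding order (register number 0..7).
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode_mov_imm32(insn: bytes) -> str:
    """Decode a 5-byte `B8+r imm32` mov (no prefixes) into AT&T syntax."""
    if len(insn) != 5 or not 0xB8 <= insn[0] <= 0xBF:
        raise ValueError("not a prefix-free B8+r imm32 mov")
    reg = REG32[insn[0] - 0xB8]                # low 3 opcode bits select the register
    imm = int.from_bytes(insn[1:], "little")   # imm32 is stored little-endian
    return f"mov ${imm:#x}, %{reg}"

for raw in ("b801000000", "bf01000000", "be9a000000", "ba0a000000"):
    print(raw, "->", decode_mov_imm32(bytes.fromhex(raw)))
```

Applied to the four `mov` instructions in the question, this yields `%eax`, `%edi`, `%esi` and `%edx` as destinations, so the fourth instruction sets the length register rather than `%rdi`.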

2 Answers

> I tried to write the smallest possible x86_64 ELF hello world program by hand

You should provide a source for your program, so we can fix it.

> gdb says: During startup program terminated with signal SIGSEGV

This is GDB telling you that it called fork/execve to create the target program, and expected the kernel to notify GDB that the program is now ready to be debugged. Instead, the kernel notified GDB that the program has died with SIGSEGV, without ever reaching its first instruction.

GDB didn't expect that. Why would this happen?

This happens when the kernel looks at your executable, and says "I can't create a running program out of that".

Why is that the case here? Because this LOAD segment:

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000031 0x0000000000000031  R E    0x2

is asking the kernel to map 0x31 bytes from offset 0 in the file to virtual address 0. But the kernel (rightfully) refuses such a nonsensical request, and terminates the program with SIGSEGV before returning from execve.
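
One way to spot this before running the file is to compare each LOAD segment's virtual address against the kernel's lowest mappable address. Below is a minimal Python sketch, assuming a little-endian ELF64 file of type ET_EXEC and assuming a limit of 65536 (a common `vm.mmap_min_addr` setting; the real value on a given machine is in `/proc/sys/vm/mmap_min_addr`). The helper name `load_segments_below` is hypothetical, and the header bytes are copied from the hexdump in the question:

```python
import struct

PT_LOAD = 1

# First 0x78 bytes (ELF header + one program header) from the question's hexdump.
QUESTION_HEADERS = bytes.fromhex(
    "7f454c46020101000000000000000000"
    "02003e00010000007800000000000000"
    "40000000000000000000000000000000"
    "00000000400038000100000000000000"
    "01000000050000000000000000000000"
    "00000000000000000000000000000000"
    "31000000000000003100000000000000"
    "0200000000000000"
)

def load_segments_below(elf: bytes, min_addr: int) -> list:
    """Return p_vaddr of every PT_LOAD segment that starts below min_addr."""
    (e_phoff,) = struct.unpack_from("<Q", elf, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 0x36)
    too_low = []
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from("<I", elf, off)          # p_type at +0
        (p_vaddr,) = struct.unpack_from("<Q", elf, off + 0x10)  # p_vaddr at +16
        if p_type == PT_LOAD and p_vaddr < min_addr:
            too_low.append(p_vaddr)
    return too_low

# Assumed limit; substitute the value from /proc/sys/vm/mmap_min_addr.
print(load_segments_below(QUESTION_HEADERS, 65536))
```

For the headers in the question, the single LOAD segment at address 0 is reported as too low under that assumed limit.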

You could probably avoid this by making the file ET_DYN instead of ET_EXEC -- that would change the meaning of your program header from "map this segment at 0" to "map this segment anywhere".

You could definitely avoid this by keeping the ET_EXEC, but changing the .p_vaddr and .p_paddr of the segment to something like 0x10000.
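
Moving the segment also moves every absolute address derived from it. With `.p_offset` left at 0, the byte at file offset N is mapped at `.p_vaddr + N`, so the entry point and the string address embedded in the code have to be recomputed. The following is a small Python sketch of that arithmetic, using the file offsets 0x78 and 0x9a from the question; the function names are illustrative only:

```python
CODE_OFFSET = 0x78    # file offset of the first instruction (from the question)
STRING_OFFSET = 0x9a  # file offset of "Hello, World!\n" (from the question)

def rebase(p_vaddr: int) -> dict:
    """Absolute addresses after mapping file offset 0 at p_vaddr."""
    return {
        "e_entry": p_vaddr + CODE_OFFSET,
        "string": p_vaddr + STRING_OFFSET,
    }

def mov_esi_imm32(addr: int) -> bytes:
    """Encode `mov $addr, %esi` (opcode 0xBE followed by a little-endian imm32)."""
    return b"\xbe" + addr.to_bytes(4, "little")

addrs = rebase(0x10000)
print(hex(addrs["e_entry"]), hex(addrs["string"]))
print(mov_esi_imm32(addrs["string"]).hex(" "))
```

For a base of 0x10000 the entry point becomes 0x10078 and the string address 0x1009a, which changes the third instruction's bytes to `be 9a 00 01 00`.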

TL;DR: Your program and file headers must make sense to the kernel, or you'll never get off the ground.

Employed Russian
  • `ET_DYN` (a PIE executable) would require a RIP-relative `lea rsi, [rip + msg]`, not the current `mov esi, 0x9a`. Fortunately there's tons of room to save space on the other instructions (like `push 1` / `pop rdi` (3B) / `mov eax, edi` 2B / `lea edx, [rdi-1 + msg.len]` 3B), so the total payload can still fit into whatever space they found to tuck it in to. Unless all those zeroes in the machine code needed to be zeros in ELF header fields, like how https://www.muppetlabs.com/~breadbox/software/tiny/teensy.html eventually has a version with the text segment overlapping with the ELF headers. – Peter Cordes Jul 11 '22 at 04:17
  • @Employed Russian Thanks for the advice, but changing `p_vaddr` and `p_paddr` to `0x1000`, `p_offset` to `0x78` (the start of the code) and the string address in the code to `0x22 0x00 0x01` still results in a segfault. As for the _source_... I put it in the question. I wrote this binary by hand. – DieDummheitInPerson Jul 11 '22 at 18:27
  • @DieDummheitInPerson You can't change `.p_vaddr = 0x1000` and `.p_offset = 0x78` -- they must be congruent modulo page size. Leave `.p_offset = 0` and try again. – Employed Russian Jul 11 '22 at 18:37
  • @Employed Russian Still gives a segfault. If you know a bit more about the "congruent modulo pagesize" rule or have better resources than the manpage, could you tell me? My best guess is that it has something to do with this rule. EDIT: Now gdb outputs something different. I will try to debug this with the comments from Peter Cordes and update this comment if I find something out. – DieDummheitInPerson Jul 11 '22 at 18:41
  • @Employed Russian Thank you very much, it worked. Just had to sort out some issues with the code. – DieDummheitInPerson Jul 11 '22 at 19:00
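
The "congruent modulo page size" rule from the comments above can be written as a one-line check. Here is a minimal Python sketch, assuming a 4096-byte page (typical for x86-64 Linux, but an assumption here) and a made-up helper name:

```python
PAGE_SIZE = 4096  # assumption: typical x86-64 page size

def offset_vaddr_congruent(p_offset: int, p_vaddr: int, page_size: int = PAGE_SIZE) -> bool:
    """True if p_offset and p_vaddr leave the same remainder modulo the page size."""
    return p_offset % page_size == p_vaddr % page_size

print(offset_vaddr_congruent(0x78, 0x1000))   # combination tried in the comments
print(offset_vaddr_congruent(0x00, 0x10000))  # combination in the working binary
```

The rule exists because the loader maps whole pages of the file, so a byte's position within its page has to be the same in the file and in memory. A nonzero offset is therefore allowed as long as the address carries the same low bits, for example `.p_offset = 0x78` with `.p_vaddr = 0x10078`.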

The answer that I accepted did the trick. I just want to share the new hexdump of the binary here:

00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000  .ELF............
00000010: 0200 3e00 0100 0000 7800 0100 0000 0000  ..>.....x.......
00000020: 4000 0000 0000 0000 0000 0000 0000 0000  @...............
00000030: 0000 0000 4000 3800 0100 0000 0000 0000  ....@.8.........
00000040: 0100 0000 0500 0000 0000 0000 0000 0000  ................
00000050: 0000 0100 0000 0000 0000 0100 0000 0000  ................
00000060: 3100 0000 0000 0000 3100 0000 0000 0000  1.......1.......
00000070: 0200 0000 0000 0000 b801 0000 00bf 0100  ................
00000080: 0000 be9a 0001 00ba 0f00 0000 0f05 b83c  ...............<
00000090: 0000 00bf 0000 0000 0f05 4865 6c6c 6f2c  ..........Hello,
000000a0: 2057 6f72 6c64 210a 00                    World!..

readelf -a:

ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x10078
  Start of program headers:          64 (bytes into file)
  Start of section headers:          0 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         1
  Size of section headers:           0 (bytes)
  Number of section headers:         0
  Section header string table index: 0

There are no sections in this file.

There are no section groups in this file.

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000010000 0x0000000000010000
                 0x0000000000000031 0x0000000000000031  R E    0x2

There is no dynamic section in this file.

There are no relocations in this file.
No processor specific unwind information to decode

Dynamic symbol information is not available for displaying symbols.

No version information found in this file.

The code:

0xb8, 0x01, 0x00, 0x00, 0x00, /* mov $0x1,%eax     ; sys_write */
0xbf, 0x01, 0x00, 0x00, 0x00, /* mov $0x1,%edi     ; STDOUT */
0xbe, 0x9a, 0x00, 0x01, 0x00, /* mov $0x1009a,%esi ; address of string */
0xba, 0x0f, 0x00, 0x00, 0x00, /* mov $0xf,%edx     ; size of string */
0x0f, 0x05,                   /* syscall */
0xb8, 0x3c, 0x00, 0x00, 0x00, /* mov $0x3c,%eax    ; sys_exit */
0xbf, 0x00, 0x00, 0x00, 0x00, /* mov $0x0,%edi     ; exit status */
0x0f, 0x05                    /* syscall */

As mentioned in the comments, this is not the smallest possible x86_64 ELF binary. The code could be improved and, if you want to be crazy, you can put stuff in unused parts of the ELF header. But in any case, I'm quite satisfied with a file size of 169 bytes.
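
Since the binary was put together by hand, there is no real source file. As a stand-in, here is a Python sketch that rebuilds the same 169 bytes with `struct`, with each header field named. The script and its function names are my own illustration, not the tool used to create the file; the field values are taken from the hexdump and `readelf` output above.

```python
import struct

BASE = 0x10000               # p_vaddr / p_paddr of the single LOAD segment
EHDR_SIZE, PHDR_SIZE = 64, 56
CODE_OFF = EHDR_SIZE + PHDR_SIZE   # 0x78: code follows the headers directly
MSG = b"Hello, World!\n\0"         # 15 bytes, including the trailing NUL

def build_code(msg_addr: int, msg_len: int) -> bytes:
    """write(1, msg_addr, msg_len); exit(0) as raw x86-64 machine code."""
    return b"".join([
        b"\xb8" + struct.pack("<I", 1),         # mov $1, %eax   ; sys_write
        b"\xbf" + struct.pack("<I", 1),         # mov $1, %edi   ; stdout
        b"\xbe" + struct.pack("<I", msg_addr),  # mov $msg, %esi ; buffer
        b"\xba" + struct.pack("<I", msg_len),   # mov $len, %edx ; length
        b"\x0f\x05",                            # syscall
        b"\xb8" + struct.pack("<I", 60),        # mov $60, %eax  ; sys_exit
        b"\xbf" + struct.pack("<I", 0),         # mov $0, %edi   ; status
        b"\x0f\x05",                            # syscall
    ])

def build_elf() -> bytes:
    code_len = len(build_code(0, 0))            # length is independent of operands
    code = build_code(BASE + CODE_OFF + code_len, len(MSG))
    payload = code + MSG
    e_ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)  # ELF64, LSB, v1, SysV
    ehdr = e_ident + struct.pack(
        "<HHIQQQIHHHHHH",
        2, 0x3E, 1,            # e_type=ET_EXEC, e_machine=x86-64, e_version
        BASE + CODE_OFF,       # e_entry
        EHDR_SIZE, 0, 0,       # e_phoff, e_shoff, e_flags
        EHDR_SIZE, PHDR_SIZE, 1,  # e_ehsize, e_phentsize, e_phnum
        0, 0, 0,               # e_shentsize, e_shnum, e_shstrndx
    )
    phdr = struct.pack(
        "<IIQQQQQQ",
        1, 5,                  # p_type=PT_LOAD, p_flags=R+X
        0, BASE, BASE,         # p_offset, p_vaddr, p_paddr
        len(payload), len(payload),  # p_filesz, p_memsz (0x31, as in the dump)
        2,                     # p_align
    )
    return ehdr + phdr + payload

print(len(build_elf()))
```

The `p_filesz` and `p_memsz` values are kept at 0x31 exactly as in the dump; that number is the length of code plus string, not of the whole file. Writing the result of `build_elf()` to a file and marking it executable should reproduce the binary shown above.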

  • I'd suggest at least the standard optimization of `xor %edi, %edi` to zero EDI. Or omit that entirely and let your exit status be `1` - you still printed Hello World, nobody said the exit status had to be `0`. [What is the best way to set a register to zero in x86 assembly: xor, mov or and?](https://stackoverflow.com/q/33666617). Also basic golf of `mov %eax, %edi` to take advantage of `__NR_write == STDOUT_FILENO` (probably not a coincidence, since Linux x86-64 also chose `__NR_read == STDIN_FILENO` (0)). – Peter Cordes Jul 12 '22 at 01:25
  • Also, if you have source code for your binary, that would be a good thing to post for future readers that want to play around. (e.g. a `.s` file with `.byte` and `.quad` directives? Or did you just create it with a hex editor so you had nowhere to put comments about which field was what?) But anyway, seems like the key change here was putting the virtual address of your segment above [`mmap_min_addr`](https://wiki.debian.org/mmap_min_addr). – Peter Cordes Jul 12 '22 at 01:25