
I'm reading this page https://www.muppetlabs.com/~breadbox/software/tiny/teensy.html

This is one of the examples:

; tiny.asm
BITS 32
GLOBAL _start
SECTION .text
_start:
                mov     eax, 1
                mov     ebx, 42  
                int     0x80

Here we go:

$ nasm -f elf tiny.asm
$ gcc -Wall -s -nostdlib tiny.o
$ ./a.out ; echo $?
42

Ta-da! And the size?

$ wc -c a.out
    372 a.out

However, I don't get the same results. I tried nasm -f elf64, then tried -m32 with gcc (and again with clang). No matter what I try, I cannot get the tiny size. I'm on Arch Linux.

$ cat tiny.asm 
; tiny.asm
BITS 32
GLOBAL _start
SECTION .text
_start:
                mov     eax, 1
                mov     ebx, 42  
                int     0x80

[eric@eric test]$ gcc -Wall -s -nostdlib -m32 tiny.o
[eric@eric test]$ stat ./a.out 
File: ./a.out
Size: 12780         Blocks: 32         IO Block: 4096   regular file
Device: 2eh/46d Inode: 1279        Links: 1
Access: (0755/-rwxr-xr-x)  Uid: ( 1000/     eric)   Gid: ( 1000/     eric)
Access: 2020-12-26 17:19:19.216294869 -0500
Modify: 2020-12-26 17:19:19.216294869 -0500
Change: 2020-12-26 17:19:19.216294869 -0500
Birth: -
Peter Cordes
Eric Stotch
  • Post the output of `objdump -d a.out`. But I suspect it has to do with different pages being added to a.out, which different tools or distros may not always have produced. Likely candidates are unwind tables in an .eh_frame section, build notes, and comments, as well as a change in alignment. – Michael Petch Dec 26 '20 at 22:27

1 Answer


`-static` is not the default, even with `-nostdlib`, when GCC is configured to make PIEs by default. Use `gcc -m32 -static -nostdlib` to get the historical behaviour. (`-static` implies `-no-pie`.) See "What's the difference between "statically linked" and "not a dynamic executable" from Linux ldd?" for more.
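One way to see which kind of executable the linker produced is to read the ELF header. The helper below is a minimal sketch, assuming GNU readelf is installed; the name `elf_type` is made up for illustration.

```shell
# Sketch: print the ELF file type of an executable.
# Assumes GNU readelf (binutils); elf_type is a hypothetical helper name.
elf_type() {
    # LC_ALL=C keeps the header labels in English so the match is stable.
    LC_ALL=C readelf -h "$1" | awk '$1 == "Type:" { print $2 }'
}
```

`elf_type a.out` should print `DYN` for a PIE (or a shared object) and `EXEC` for a traditional fixed-address executable, so it gives a quick check of whether `-static` or `-no-pie` took effect.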

Also, you may need to disable alignment of other sections with `gcc -Wl,--nmagic` or a custom linker script, and maybe disable the extra metadata sections that GCC adds. See "Minimal executable size now 10x larger after linking than 2 years ago, for tiny programs?"

You probably don't have a `.eh_frame` section if you're not linking any compiler-generated (from C) .o files. But if you were, you could disable that with `gcc -fno-asynchronous-unwind-tables`. (See also "How to remove "noise" from GCC/clang assembly output?" for general tips aimed at reading the compiler's asm text output, more so than at executable size.)

See also "GCC + LD + NDISASM = huge amount of assembler instructions". (ndisasm doesn't handle metadata at all, only flat binary, so it "disassembles" the metadata too. The answer there includes info on how to avoid other sections.)

`gcc -Wl,--build-id=none` will avoid including a `.note.gnu.build-id` section in the executable.

$ nasm -felf32 foo.asm
$ gcc -m32 -static -nostdlib -Wl,--build-id=none -Wl,--nmagic foo.o
$ ll a.out 
-rwxr-xr-x 1 peter peter 488 Dec 26 18:47 a.out
$ strip a.out 
$ ll a.out 
-rwxr-xr-x 1 peter peter 248 Dec 26 18:47 a.out

(Tested on x86-64 Arch GNU/Linux, NASM 2.15.05, gcc 10.2, ld from GNU Binutils 2.35.1.)
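To compare the effect of those flags on your own machine, something like the following might help. It is only a sketch, assuming nasm, a gcc that can link with `-m32`, and GNU binutils are on the PATH; the function name `compare_sizes` and the output file names are made up for illustration.

```shell
# Sketch: link the tutorial program with and without the size-related flags.
# Assumes nasm, gcc (able to link with -m32) and GNU binutils are installed.
# compare_sizes and the *.out names are hypothetical.
compare_sizes() (
    set -eu
    workdir=$(mktemp -d)
    cd "$workdir"

    cat > tiny.asm <<'EOF'
GLOBAL _start
SECTION .text
_start:
        mov     eax, 1      ; exit system call number for int 0x80
        mov     ebx, 42     ; exit status
        int     0x80
EOF

    nasm -felf32 tiny.asm -o tiny.o

    # Whatever the distro's defaults give (possibly a PIE plus a build-id note).
    gcc -m32 -nostdlib tiny.o -o default.out
    # The flags discussed above.
    gcc -m32 -static -nostdlib -Wl,--build-id=none -Wl,--nmagic tiny.o -o small.out
    # Stripped copy, leaving small.out untouched for inspection.
    strip -o stripped.out small.out

    wc -c default.out small.out stripped.out

    cd /
    rm -rf "$workdir"
)
```

Running `compare_sizes` prints one byte count per file. The exact numbers depend on the toolchain version, so expect them to differ from the ones shown above.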


You can check the sections in your executable with `readelf -a a.out` (or use a more specific option to get only part of readelf's large output), e.g. before stripping:

$ readelf -S unstripped_a.out
...
 Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        08048060 000060 00000c 00  AX  0   0 16
  [ 2] .symtab           SYMTAB          00000000 00006c 000070 10      3   3  4
  [ 3] .strtab           STRTAB          00000000 0000dc 000021 00      0   0  1
  [ 4] .shstrtab         STRTAB          00000000 0000fd 000021 00      0   0  1
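To find out which sections account for most of a file, the `readelf -S` table can be reduced to sizes and names. This is a rough sketch, assuming GNU readelf and its wide (`-W`) one-line-per-section output; `section_sizes` is a made-up helper name.

```shell
# Sketch: list an ELF file's sections, largest first (sizes in bytes).
# Assumes GNU readelf; section_sizes is a hypothetical helper name.
section_sizes() {
    LC_ALL=C readelf -S -W "$1" |
    awk '/^ *\[ *[0-9]+\]/ {
             sub(/^ *\[ *[0-9]+\] */, "")        # drop the "[Nr]" column
             if ($1 != "NULL") print $5, $1      # hex size, section name
         }' |
    while read -r hexsize name; do
        printf '%d %s\n' "0x$hexsize" "$name"    # hex size to decimal
    done |
    sort -rn
}
```

For example, `section_sizes a.out` lists each section with its size from the section header table. Note that a NOBITS section such as `.bss` has a size there but takes no space in the file, and that alignment padding between sections does not show up in this list at all.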

And BTW, you definitely do not want to use `nasm -felf64` on a file that uses `BITS 32`, unless you're writing a kernel or something that switches from 64-bit long mode to 32-bit compat mode. Putting 32-bit machine code in a 64-bit object file is not helpful. Only ever use `BITS` when you want raw binary mode to work (later in that tiny-ELF tutorial). When you're making a .o to link, it only makes it possible to shoot yourself in the foot; don't do it. (Although it's not harmful if you do properly use `nasm -felf32`, which matches your `BITS` directive.)

Peter Cordes
  • You forget other things like async unwind tables (in the .eh_frame section), build notes, and the comment section. Asking the person what sections and code they have generated would likely tell us why it is 12k. What you suggest here may only deal with the tip of the iceberg. We had a similar question here a number of months ago: https://stackoverflow.com/questions/62775154/gcc-ld-ndisasm-huge-amount-of-assembler-instructions. Build notes would be a product of using GCC and LD. I suspect `-Wl,--build-id=none` may help with GCC being used as a front end to LD. – Michael Petch Dec 26 '20 at 22:39
  • @MichaelPetch: I was already updating my answer about `.eh_frame` - you only get that if you link C compiler output. I just tried on my desktop: I do get `.note.gnu.build-id` though. But that should match the tutorial the OP is following, which moves on to `strip` and later to hand-generated ELF program headers, then to tucking the text segment inside those headers :P – Peter Cordes Dec 26 '20 at 22:43
  • See my edit; I show how to get rid of the build-id. The comment section may only be removable with something like `objcopy`. But I am guessing as to all the potential sections that may have been generated. Strip doesn't get rid of all sections, like `.comment`, as far as I remember. I think I also commented on a more recent question on this subject within the last 30 days. – Michael Petch Dec 26 '20 at 22:45
  • @MichaelPetch: Thanks, updated with example output. – Peter Cordes Dec 26 '20 at 22:49
  • Worked for me. Using 64 bits, it landed at 704 bytes. – Eric Stotch Dec 26 '20 at 22:53
  • Also FYI the link you sent me yesterday helped. I got all the syscall stuff working and I'm going to revisit it to make sure I understand every line in the __asm__ __volatile__ block – Eric Stotch Dec 26 '20 at 22:54