I wrote a simple "Hello world" in assembly under Debian Linux:

; Define variables in the data section
SECTION .data
    hello:     db 'Hello world!',10
    helloLen:  equ $-hello

; Code goes in the text section
SECTION .text
GLOBAL _start 

_start:
    mov eax,4            ; 'write' system call = 4
    mov ebx,1            ; file descriptor 1 = STDOUT
    mov ecx,hello        ; string to write
    mov edx,helloLen     ; length of string to write
    int 80h              ; call the kernel

    ; Terminate program
    mov eax,1            ; 'exit' system call
    mov ebx,0            ; exit with error code 0
    int 80h              ; call the kernel

After assembling

nasm -f elf64 hello.asm -o hello.o
ld -o hello hello.o

I got a 9048 byte binary.

Then I changed two lines in the code: from .data to .DATA and .text to .TEXT:

SECTION .DATA
SECTION .TEXT

and got a 4856 byte binary.
Changing them to

SECTION .dAtA
SECTION .TeXt

produced a 4856 byte binary too.

NASM is declared to be a case-insensitive compiler. What is the difference then?

Sep Roland

1 Answer

You're free to use whatever names you like for ELF sections, but if you don't use standard names, it becomes your responsibility to specify the section flags. (If you use standard names, you get to take advantage of default flag settings for those names.) Section names are case-sensitive, and .data and .text are known to NASM. .DATA, .dAtA, etc. are not, so NASM gives them all the same default flags. Nothing then distinguishes these sections from each other, which allows ld to combine them into a single segment.

That automatically makes your executable smaller. With the standard flags for .text and .data, one of those is read-only and the other is read-write, which means that they cannot be placed into the same memory page. In your example program, both sections are quite small, so they could fit in a single memory page. Thus, using non-standard names makes your executable one page smaller, but one of the sections will have incorrect writability (here, .DATA ends up not writable).
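
A minimal sketch of what specifying the flags yourself could look like, assuming NASM's ELF section qualifiers (`progbits`, `alloc`, `exec`/`noexec`, `write`/`nowrite`, `align=`). The alignment values are an assumption, chosen to mirror what I'd expect the defaults for .data and .text to be; the rest is the program from the question:

```nasm
; Non-standard section names, with the attributes spelled out explicitly
; (assumption: these mirror the defaults NASM applies to .data and .text)
SECTION .DATA progbits alloc noexec write align=4
    hello:     db 'Hello world!',10
    helloLen:  equ $-hello

SECTION .TEXT progbits alloc exec nowrite align=16
GLOBAL _start

_start:
    mov eax,4            ; 'write' system call = 4
    mov ebx,1            ; file descriptor 1 = STDOUT
    mov ecx,hello        ; string to write
    mov edx,helloLen     ; length of string to write
    int 80h              ; call the kernel

    ; Terminate program
    mov eax,1            ; 'exit' system call
    mov ebx,0            ; exit with error code 0
    int 80h              ; call the kernel
```

Assembled and linked with the same commands as in the question, `readelf -S hello.o` should show .DATA flagged as writable (WA) and .TEXT as executable (AX). With the flags differing again, I'd expect ld to place the two sections in separate segments, so the size saving goes away.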

rici
  • The `ld` default is to page-align segments, hence being about 4k larger when you have sections that go into 2 separate segments. [Minimal executable size now 10x larger after linking than 2 years ago, for tiny programs?](https://stackoverflow.com/q/65037919). Also related: [Segmentation fault with a variable in SECTION .DATA](https://stackoverflow.com/q/54134394) - showing that `.DATA` doesn't get read/write permission, and showing `readelf -a` output confirming that `.TEXT` and `.DATA` go into the same segment. – Peter Cordes Aug 19 '22 at 11:48
  • The observed behavior can be explained with rici’s explanation and [the documentation](https://www.nasm.us/xdoc/2.15.05/html/nasmdoc8.html#section-8.9.2), in particular: “Any section name other than those in the above table is treated by default like `other` in the above table. Please note that section names are case sensitive” – Kai Burghardt Oct 09 '22 at 14:31