I have the following NASM assembly program, 'uppercaser.asm', which converts every lowercase letter read from standard input to uppercase:

section .bss
    Buff resb 1

section .data

section .text
        global _start

_start:
        nop            ; This no-op keeps the debugger happy

Read:   mov eax,3      ; Specify sys_read call
        mov ebx,0      ; Specify File Descriptor 0: Standard Input
        mov ecx,Buff   ; Pass offset of the buffer to read to
        mov edx,1      ; Tell sys_read to read one char from stdin
        int 80h        ; Call sys_read

        cmp eax,0       ; Look at sys_read's return value in EAX
        je Exit         ; Jump If Equal to 0 (0 means EOF) to Exit
                        ; or fall through to test for lowercase
        cmp byte [Buff],61h  ; Test input char against lowercase 'a'
        jb Write        ; If below 'a' in ASCII chart, not lowercase
        cmp byte [Buff],7Ah  ; Test input char against lowercase 'z'
        ja Write        ; If above 'z' in ASCII chart, not lowercase
                        ; At this point, we have a lowercase character
        sub byte [Buff],20h  ; Subtract 20h from lowercase to give uppercase...
                        ; ...and then write out the char to stdout
Write:  mov eax,4       ; Specify sys_write call
        mov ebx,1       ; Specify File Descriptor 1: Standard output
        mov ecx,Buff    ; Pass address of the character to write
        mov edx,1       ; Pass number of chars to write
        int 80h         ; Call sys_write...
        jmp Read        ; ...then go to the beginning to get another character

Exit:   mov eax,1       ; Code for Exit Syscall
        mov ebx,0       ; Return a code of zero to Linux
        int 80H         ; Make kernel call to exit program

The program is then assembled with the -g -F stabs options (for the debugger) and linked as a 32-bit executable on Ubuntu 18.04.

Running readelf --segments uppercaser to list the segments and readelf -S uppercaser to list the sections, I see that the text segment and the .text section differ in size.

readelf --segments uppercaser

Elf file type is EXEC (Executable file)
Entry point 0x8048080
There are 2 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x08048000 0x08048000 0x000db 0x000db R E 0x1000
  LOAD           0x0000dc 0x080490dc 0x080490dc 0x00000 0x00004 RW  0x1000

 Section to Segment mapping:
  Segment Sections...
   00     .text
   01     .bss

readelf -S uppercaser

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        08048080 000080 00005b 00  AX  0   0 16
  [ 2] .bss              NOBITS          080490dc 0000dc 000004 00  WA  0   0  4
  [ 3] .stab             PROGBITS        00000000 0000dc 000120 0c      4   0  4
  [ 4] .stabstr          STRTAB          00000000 0001fc 000011 00      0   0  1
  [ 5] .comment          PROGBITS        00000000 00020d 00001f 00      0   0  1
  [ 6] .shstrtab         STRTAB          00000000 00022c 00003e 00      0   0  1
  [ 7] .symtab           SYMTAB          00000000 0003d4 0000f0 10      8  11  4
  [ 8] .strtab           STRTAB          00000000 0004c4 000045 00      0   0  1

The section headers show that the .text section is 5Bh = 91 bytes (the same number the size command reports), whereas the program headers show that the first segment is 0x000DB = 219 bytes, a difference of 128 bytes (80h). Why is that?

From the elf man pages for the Elf32_Phdr (program header) structure:

p_filesz This member holds the number of bytes in the file image of the segment. It may be zero.

p_memsz This member holds the number of bytes in the memory image of the segment. It may be zero.

Is the difference somehow related to the .bss section?

Sep Roland
Nick_h

1 Answer

Notice that the first program segment, which begins at file offset 0, is mapped at virtual address 0x08048000, not at VA 0x08048080, which is where the .text section starts.
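As a rough check of that observation, the sketch below maps the file offset of .text to a virtual address using only the numbers from the readelf output in the question. The helper name `file_offset_to_vaddr` is made up for illustration; it assumes the loader maps a LOAD segment linearly, so that a byte at file offset `p_offset + k` appears at address `p_vaddr + k`.

```python
# Values copied from the readelf output in the question.
SEG_OFFSET = 0x000000    # Offset of the first LOAD segment
SEG_VADDR = 0x08048000   # VirtAddr of the first LOAD segment
SEG_FILESZ = 0x000DB     # FileSiz of the first LOAD segment

TEXT_OFFSET = 0x000080   # Off of the .text section
TEXT_ADDR = 0x08048080   # Addr of the .text section
TEXT_SIZE = 0x00005B     # Size of the .text section


def file_offset_to_vaddr(offset, seg_offset=SEG_OFFSET, seg_vaddr=SEG_VADDR):
    """Hypothetical helper: linear mapping of a file offset inside a LOAD segment."""
    return seg_vaddr + (offset - seg_offset)


# .text starts 0x80 bytes into the segment, so it lands 0x80 bytes past the base VA.
text_vaddr = file_offset_to_vaddr(TEXT_OFFSET)

# The last byte of .text is also the last byte of the segment's file image.
text_end = TEXT_OFFSET + TEXT_SIZE

print(hex(text_vaddr))  # 0x8048080
print(hex(text_end))    # 0xdb
```

So the segment starts 0x80 bytes before .text and ends exactly where .text ends.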

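In fact, the segment that readelf displays as 00 .text covers the ELF file header (52 bytes), the two program headers (2*32 bytes), alignment padding, and the actual contents of the .text section, all mapped together from file offset 0 to VA 0x08048000. The .bss section is not involved: it belongs to the second LOAD segment, which occupies 0 bytes in the file and 4 bytes in memory.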
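The size breakdown can be verified with a few lines of arithmetic. This is only a sketch: the header sizes come from the answer, and it assumes the padding exists because .text is aligned to 16 bytes (the Al column in the section table), which is consistent with .text starting at file offset 0x80.

```python
EHDR_SIZE = 52     # ELF32 file header, per the answer
PHDR_SIZE = 32     # one ELF32 program header, per the answer
PHDR_COUNT = 2     # "There are 2 program headers"
TEXT_ALIGN = 16    # Al column for .text (assumed to be the cause of the padding)
TEXT_SIZE = 0x5B   # Size column for .text


def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


headers_end = EHDR_SIZE + PHDR_COUNT * PHDR_SIZE  # 116 = 0x74
text_offset = align_up(headers_end, TEXT_ALIGN)   # 128 = 0x80
padding = text_offset - headers_end               # 12 bytes of padding
segment_filesz = text_offset + TEXT_SIZE          # 219 = 0xDB

print(hex(headers_end), hex(text_offset), padding, hex(segment_filesz))
# 0x74 0x80 12 0xdb
```

The 128-byte difference in the question is therefore 52 + 64 + 12: the two kinds of headers plus the padding in front of .text.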

vitsoft
  • Looks like an old version of `ld` (in the OP's Ubuntu 18.04) made this executable; newer `ld` will page-align sections so data such as ELF headers doesn't get mapped into executable pages when it doesn't need to be. (So those bytes can't be ROP / Spectre gadgets.) [Minimal executable size now 10x larger after linking than 2 years ago, for tiny programs?](https://stackoverflow.com/q/65037919) / [Why an ELF executable could have 4 LOAD segments?](https://stackoverflow.com/a/57841768) – Peter Cordes Sep 11 '22 at 16:10