
I'm experimenting with assembly language and wrote a program which prints 2 hardcoded bytes to stdout. Here it is:

section .text
     global _start

_start:
     mov eax, 0x0A31
     mov [val], eax
     mov eax, 4
     mov ebx, 1
     mov ecx, val
     mov edx, 2

     int 0x80

     mov eax, 1
     int 0x80

 segment .bss
     val resb 1;   <------ Here

Note that I reserved only 1 byte inside the bss segment, but actually put 2 bytes (charcode for 1 and newline symbol) into the memory location. And the program worked fine. It printed 1 character and then newline.

But I expected a segmentation fault. Why didn't it occur? We reserved only 1 byte, but stored 2.

Peter Cordes
St.Antario
  • sounds like undefined behaviour saved your butt – Chris Turner Nov 08 '17 at 13:19
  • Because there is still memory mapped after your reserved byte. Page granularity is 4096 bytes on common machines. – Ctx Nov 08 '17 at 13:20
  • @Ctx It means by reserving a byte I actually reserve the whole page? – St.Antario Nov 08 '17 at 13:20
  • @St.Antario Kind of, yes. On common systems you always reserve at least a whole page. Which doesn't mean, that reserving one byte and then another means that you reserve two pages – Ctx Nov 08 '17 at 13:21
  • @Ctx Is that OS-dependent? Or depends on how CPU manages memory i.e. architecture dependent? – St.Antario Nov 08 '17 at 13:24
  • @St.Antario Both, an OS can only do what the architecture supports. If it supports paging, then you are stuck with page granularity. If it supports segmentation, you can map the memory on the byte exactly. If it supports both (as with current intel architecture), the OS can choose what to support (usually paging is supported then, as for example with linux) – Ctx Nov 08 '17 at 13:26
  • @Ctx Thx for the clarification. In my case Im on Ubuntu with core i7. So it means I got paging granularity? Kind of undefined behavior? – St.Antario Nov 08 '17 at 13:27
  • @St.Antario How large a page is depends on your architecture (MMU) and what the OS is using (e.g. "huge pages"). In any system with virtual memory, your process gets assigned memory *by page*. –  Nov 08 '17 at 13:28
  • @St.Antario No, it is well defined when taking the whole environment into account (hardware and os). – Ctx Nov 08 '17 at 13:29
  • If you add `resb 4094` ahead of `val:`, does it crash? (4094+1 = 4095, so it should still look like only a single page to the assembler+linker, but then the 4-byte write should access the next one.) Then again, for some weird reason (linker script/etc) your executable may already reserve 2+ pages in such a case, so the crash is not guaranteed (you can probably use something like `objdump` or a map file from the linker to check the binary's metadata and see how much space is set for a particular section). – Ped7g Nov 08 '17 at 17:12
  • @Ped7g Just tried... No it does not crash. – St.Antario Nov 08 '17 at 17:14
  • @Ped7g: turns out the BSS doesn't necessarily start at the beginning of a page. See the update to my answer for how to do what you suggested. `nm` is the most useful tool for checking symbol addresses. – Peter Cordes Nov 08 '17 at 18:02

1 Answer


x86, like most other modern architectures, uses paging / virtual memory for memory protection. On x86 (again like many other architectures), the granularity is 4 KiB.

A 4-byte store to val won't fault unless the linker happens to place it in the last 3 bytes of a page, and the next page is unmapped.
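
To make the "last 3 bytes of a page" condition concrete, here is a minimal sketch in C. The function name `store_crosses_page` and the `PAGE_SIZE` constant are illustrative assumptions, not part of any real API, and the sketch assumes the 4 KiB page size mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* Hypothetical helper: returns 1 if a store of `len` bytes starting at
 * `addr` touches more than one page, 0 if it stays inside a single page. */
int store_crosses_page(uint32_t addr, size_t len)
{
    assert(len > 0 && len <= PAGE_SIZE);
    uint32_t offset = addr & (PAGE_SIZE - 1);  /* position inside the page */
    return offset + len > PAGE_SIZE;
}
```

Crossing a page boundary is only half of the condition: the store faults only if the page it spills into is also unmapped, which this arithmetic cannot tell you.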

What actually happens is that you just overwrite whatever is after val. In this case, it's just unused space to the end of the page. If you had other static storage locations in the BSS, you'd step on their values. (Call them "variables" if you want, but the high-level concept of a "variable" doesn't just mean a memory location; a variable can be live in a register and never need to have an address.)


Besides the wikipedia article linked above, see also:


> but actually put 2 bytes (charcode for 1 and newline symbol) into the memory location.

mov [val], eax is a 4-byte store. The operand-size is determined by the register. If you wanted to do a 2-byte store, use mov [val], ax.
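
As a rough illustration of what the two store widths write, the following C sketch emulates a little-endian store byte by byte. `store_le` is a made-up helper for this example, not something the assembler or libc provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: emulate an x86 (little-endian) store of the low
 * `size` bytes of `value` to `dst`, lowest byte first. */
void store_le(uint8_t *dst, uint32_t value, size_t size)
{
    assert(size == 1 || size == 2 || size == 4);
    for (size_t i = 0; i < size; i++)
        dst[i] = (uint8_t)(value >> (8 * i));
}
```

With `value = 0x0A31`, a 4-byte store (like `mov [val], eax`) writes `31 0A 00 00`, so the two bytes after the intended pair are zeroed. A 2-byte store (like `mov [val], ax`) writes only `31 0A` and leaves the following bytes alone.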

Fun fact: MASM would warn or error about an operand-size mismatch, because it magically associates sizes with symbol names based on the declaration that reserves space after them. NASM stays out of your way, so if you wrote mov [val], 0x0A31, it would be an error. Neither operand implies a size, so you need mov dword [val], 0x0A31 (or word or byte).


Placing val at the end of a page to get a segfault

The BSS for some reason doesn't start at the beginning of a page in a 32-bit binary, but it is near the start of a page. You're not linking with anything else that would use up most of a page in the BSS. nm bss-no-segfault shows that it's at 0x080490a8, and a 4k page is 0x1000 bytes, so the last byte in the BSS mapping will be 0x08049fff.

It seems that the BSS start address changes when I add an instruction to the .text section, so presumably the linker's choices here are related to packing things into an ELF executable. It doesn't make much sense, because the BSS isn't stored in the file, it's just a base address + length. I'm not going down that rabbit hole; I'm sure there's a reason that making .text slightly larger results in a BSS that starts at the beginning of a page, but IDK what it is.

Anyway, if we construct the BSS so that val is right before the end of a page, we can get a fault:

... same .text

section .bss
dummy:  resb 4096 - 0xa8 - 2
val:    resb 1

;; could have done this instead of making up constants
;; ALIGN 4096
;; dummy2: resb 4094
;; val2:   resb 1
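
The made-up constants in the `dummy` reservation above come from simple page arithmetic, sketched here in C. The helper names `page_last_byte` and `padding_before` are illustrative assumptions, and the sketch assumes 4 KiB pages and that the BSS fits in one page.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* Hypothetical helper: address of the last byte in the page containing `addr`. */
uint32_t page_last_byte(uint32_t addr)
{
    return addr | (PAGE_SIZE - 1);
}

/* Hypothetical helper: number of padding bytes to reserve at `start` so that
 * the next object begins `tail` bytes before the end of the same page. */
uint32_t padding_before(uint32_t start, uint32_t tail)
{
    uint32_t offset = start & (PAGE_SIZE - 1);  /* position inside the page */
    assert(offset + tail <= PAGE_SIZE);
    return PAGE_SIZE - offset - tail;
}
```

For the addresses in this answer, `page_last_byte(0x080490a8)` gives `0x08049fff`, and `padding_before(0x080490a8, 2)` gives 3926, the same value as `4096 - 0xa8 - 2`.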

Then build and run:

$ asm-link -m32 bss-no-segfault.asm
+ yasm -felf32 -Worphan-labels -gdwarf2 bss-no-segfault.asm
+ ld -melf_i386 -o bss-no-segfault bss-no-segfault.o

peter@volta:~/src/SO$ nm bss-no-segfault
080490a7 B __bss_start
080490a8 b dummy
080490a7 B _edata
0804a000 B _end         <---------  End of the BSS
08048080 T _start
08049ffe b val          <---------  Address of val

 gdb ./bss-no-segfault

 (gdb) b _start
 (gdb) r
 (gdb) set disassembly-flavor intel
 (gdb) layout reg

 (gdb) p &val
 $2 = (<data variable, no debug info> *) 0x8049ffe
 (gdb) si    # and press return to repeat a couple times

mov [val], eax segfaults because it crosses into the unmapped page. mov [val], ax would work (because I put val 2 bytes before the end of the page).

At this point, /proc/<PID>/smaps shows:

... the r-x private mapping for .text
08049000-0804a000 rwxp 00000000 00:15 2885598                            /home/peter/src/SO/bss-no-segfault
Size:                  4 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         4 kB
Referenced:            4 kB
Anonymous:             4 kB
...
[vvar] and [vdso] pages exported by the kernel for fast gettimeofday / getpid

Key things: rwxp means read/write/execute, and private. Even stopped before the first instruction, somehow it's already "dirty" (i.e. written to). So is the text segment, but that's expected from gdb changing the instruction to int3.

The 08049000-0804a000 (and 4 kB size of the mapping) shows us that the BSS only has 1 page mapped. There's no data segment, just text and BSS.

Peter Cordes
  • It might sound strange, but I want to thank you for the explanation of how to get the segmentation fault :). I use `nasm` and compile the source with the `-elf64` option. The bss segment starts at the beginning of the page, `0000000000601000`. After placing some dummy content I got `0000000000601ffa b val`. And then `mov [val], rax` segfaults. – St.Antario Nov 09 '17 at 09:11
  • The thing is, depending on the dummy size, the bss segment may or may not start at the beginning of a page. If `dummy db 0xFFE` then `00000000006000db B __bss_start`. If `dummy db 0xFFA` then `0000000000601000 B __bss_start`. Can you explain that? Is that assembler/linker specific? I compile and link as follows: `nasm -f elf64 segfault.asm`, then `ld segfault.o` – St.Antario Nov 09 '17 at 09:14
  • @St.Antario: In 64-bit code, don't use the `int 0x80` interface unless you've read and understood [What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?](https://stackoverflow.com/questions/46087730/what-happens-if-you-use-the-32-bit-int-0x80-linux-abi-in-64-bit-code). I assumed 32-bit code because you're using the 32-bit ABI (which truncates everything including pointers to 32-bit). – Peter Cordes Nov 09 '17 at 13:35
  • @St.Antario: I assume you mean `dummy resb 0xFFE`? `db` assembles one byte into the output, and you can't put non-zero values in the BSS anyway. I noticed weirdness with the BSS start being the start of a page or not even just from adding a `mov [val], ax` in the `.text` section. You should ask a new question about this if you're curious, I don't know the answer. – Peter Cordes Nov 09 '17 at 13:38
  • And BTW, no it doesn't sound weird at all to say thanks for the explanation of how to put something at the end of a page. Seemed like a great way to demonstrate page-level memory protection. Some compilers actually do this on purpose in debug mode to detect out-of-bounds accesses to arrays. IIRC, some JavaScript JIT engines use it for bounds checking on arrays (because sandboxing requires bounds checking, and this gets the HW to do it for you so it's fast in the common case where there's no out-of-bounds access. They catch SIGSEGV in a signal handler.) – Peter Cordes Nov 09 '17 at 13:44