
I'm trying to build a small bootloader that loads the second sector of a floppy disk.

The bootloader I started is pretty basic:

  • I print a hello world message.
  • Load the second sector.

I assemble it and write it to a disk image along with a sample second sector. Then I run qemu with gdb to check whether I can jump to the loaded code.

Here is how I compile:

nasm -f bin boot.asm -o boot.bin
nasm -f bin sample.asm -o sample.bin

dd if=/dev/zero of=disk.img bs=512 count=2880
dd conv=notrunc if=boot.bin of=disk.img bs=512 count=1 seek=0
dd conv=notrunc if=sample.bin of=disk.img bs=512 count=1 seek=1

Here is how I run qemu: qemu-system-i386 -fda build/disk.img -nographic

Here is my code:

boot.asm:

org 0x7c00
bits 16

;; msg db "Hello world", 0ah, 0h --> EDIT: REMOVED


start:
  jmp setup

setup:
    xor ax, ax
    mov ds, ax
    mov bx, 0x7c00

    cli
    mov ss, ax
    mov sp, bx
    sti

    cld

boot:

  ;; call print_startup --> EDIT: REMOVED

  mov ax, 0x7e0

  mov es, ax
  xor bx, bx
load:
  mov al, 2                     
  mov ch, 0                     
  mov cl, 2                     
  mov dh, 0 
  mov dl, 0 

  mov ah, 0x2 
  int 0x13 
  jc load                
  jmp 0x7E0:0x0   ; jmp 0x7E00 works


times 510 - ($ - $$) db 0

dw 0xaa55

sample.asm:

print_startup:          ; EDIT: edited
    mov  si, msg
    mov  bx, 0x0007     ; DisplayPage 0, GraphicsColor 7 (White)
    lodsb               ; We don't have empty messages, do we?
.more:
    mov  ah, 0x0E
    int  0x10
    lodsb
    test al, al
    jnz  .more
    ret

msg db "Hello world", 13, 10, 0

This line, jmp 0x7E0:0x0, is supposed to jump to the address (0x7E0 << 4) + 0x0 = 0x7E00, right? But when I run it, it jumps somewhere else.

Looking in GDB, the address it jumps to is 0x7E00000, i.e. 0x7E0 with four zeros appended.

However, if I replace this line with jmp 0x7E00 or jmp 0x0:0x7E00, it works perfectly (which actually makes more sense to me, since it is segment (0x0):offset (0x7E00)), but this doesn't correspond to what is explained on the internet, and more specifically on Stack Overflow.

Moreover, I am trying to print the message hello world using int 0x10, but it doesn't seem to locate the message properly, and or al, al results in 0 on the first iteration.

Can someone please assist me with this and tell me what I am doing wrong? I'm pretty sure there are points that I'm missing and that I'm therefore doing something silly.

EDIT: I edited the code to add the correction from the answer, but left the bootloader as it was when the question was asked, to avoid confusion.

The issue I am having now is that it is still not jumping to the right address for some reason, even with the code provided by Sep Roland.

Flyfe
  • I once had an interview question (in the late 1990s). It asked do I know how to do flat addressing. I had no idea, so I asked, only to discover it was just addressing. I asked what the alternative is. They explained the hoop jumping of the i386. Some of this came back with PAE (but in a way that did not affect the application code). I have never needed to worry about it. That said, I am pretty sure that the page size is not 16 bytes. – ctrl-alt-delor Mar 12 '23 at 19:54
  • The `msg` data certainly shouldn't be the first thing in your loader, as it will be executed as if it was code. And you shouldn't set `dl` to zero unconditionally for the int 13h service 02h call, just leave what the ROM-BIOS initialised it as. And `global` is at best useless in `-f bin` format. That said `gdb` with `qemu` doesn't properly support segmentation from what I've heard. The far jump does seem correct as written. Use a better debugger to trace your program, like mine or the one in bochs. – ecm Mar 12 '23 at 20:24
  • As to the message you are missing `org 7C00h` to fit your init of `ds` to zero. – ecm Mar 12 '23 at 20:26
  • @ctrl-alt-delor: real mode doesn't have paging at all, only segmentation. A segment:offset logical address corresponds to linear address `segment*16 + offset`, giving 8086 the ability to access 1 MiB of RAM, with 20 bit linear/physical addresses. A segment can start at any 16-byte alignment boundary (aka "paragraph" alignment). If you don't want to use all 64KiB, it's totally normal for another segment register to be not too far away. It's not at all like paging with 64K pages; don't think of using it that way. (Or better, don't think of it at all unless you like retrocomputing, as you say) – Peter Cordes Mar 12 '23 at 20:59
  • @ecm: [Assembly (x86): db 'string',0 does not get executed unless there's a jump instruction](https://stackoverflow.com/q/30561366) is the canonical duplicate for bootloaders that put `db` data in the path of execution. (vs. [Segmentation fault when using DB (define byte) inside a function](https://stackoverflow.com/q/55642600) under a protected-mode OS where running garbage instructions will typically be detected.) – Peter Cordes Mar 12 '23 at 21:01
  • I believe Qemu/GDB has known issues with consistently displaying linear addresses computed from segment:offset. GDB's design just isn't very compatible with the features of x86 real mode. Bochs is often recommended as an emulator with a better debugger for low level code. So the "wrong address" may just be an artifact of the debugger, and the real problem could be something else. – Nate Eldredge Mar 12 '23 at 23:46
  • Just a note. In the old days while a paragraph was considered 16 byte aligned, a page was considered 256 byte aligned. – Michael Petch Mar 13 '23 at 04:35
  • @MichaelPetch In terms of the 86-DOS MZ exe header a page is 512 bytes. – ecm Mar 13 '23 at 12:00
  • @ecm in terms of how Microsoft (and the MASM assemblers) defined a page for alignment purposes it is 256 or 16 paragraphs. – Michael Petch Mar 13 '23 at 12:04
  • Thank you guys for your answers. It looks like the `org 7C00h` sneaked out from my copy/paste (updating the question). Anyway, it looks like gdb is confused after the jump then. The msg is half printed when the jump is con – Flyfe Mar 13 '23 at 20:07

1 Answer

  • Execution in a bootloader starts from the top. You should never place data items there. See Assembly (x86): <label> db 'string',0 does not get executed unless there's a jump instruction. Just move your message near the bottom right above the times directive.

  • In the absence of an ORG directive, the assembler assumes ORG 0. This will not match your initialization of DS=0. As a consequence, mov si, msg will not be set up correctly and you will not see the message displayed. What you need is ORG 0x7C00 at the top of the program.

  • The print routine forgets to advance the SI pointer. You will get stuck in an infinite loop! In the code below I have solved this using lodsb.

  • You state that you load the second sector, but code like load: mov al, 2 is asking BIOS to load 2 sectors (sector 2 and sector 3). Make sure this is what you need.

  • Going for perfection, the jump in code like jmp setup setup: is redundant since execution can just fall through.

Looking in GDB, the address where it is jumping is 0x7E00000

Don't worry. Many a time offset (low word) and segment (high word) are displayed separately but adjacent. What you then see is the true operand from your far jump instruction jmp 0x7E0:0x0.
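As a worked check of both readings, using only the values from the question (the names linear, segment and offset are just labels for this illustration):

```latex
\begin{align*}
\text{linear} &= \text{segment} \times 16 + \text{offset} \\
              &= \texttt{0x07E0} \times \texttt{0x10} + \texttt{0x0000} = \texttt{0x07E00} \\[4pt]
\text{far pointer packed as one 32-bit value}
              &= \texttt{0x07E0} \times \texttt{0x10000} + \texttt{0x0000} = \texttt{0x07E00000}
\end{align*}
```

The first result is the address the CPU actually fetches from. The second is the raw segment:offset operand read as a single number, which would explain the 0x7E00000 seen in GDB, assuming it printed the operand rather than the linear address.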


bits 16
org 0x7C00

  xor ax, ax
  mov ds, ax
  mov ss, ax
  mov sp, 0x7C00
  cld

  call print_startup

  mov  ax, 0x7E0
  mov  es, ax
  xor  bx, bx
load:
  mov  dh, 0            ; Use DL like BIOS passed it to your bootloader
  mov  cx, 0x0002
  mov  ax, 0x0201 
  int  0x13             ; -> AX CF
  jc   load             ; Better put a limit on this, say max 3 tries
  jmp  0x07E0:0x0000

print_startup:
    mov  si, msg
    mov  bx, 0x0007     ; DisplayPage 0, GraphicsColor 7 (White)
    lodsb               ; We don't have empty messages, do we?
.more:
    mov  ah, 0x0E
    int  0x10
    lodsb
    test al, al
    jnz  .more
    ret

msg db "Hello world", 13, 10, 0

times 510 - ($ - $$) db 0

dw 0xAA55

For BIOS, the newline code is actually carriage return (13) plus linefeed (10).
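The "max 3 tries" remark in the code above could look roughly like the following. This is a minimal sketch, not a tested loader: the retry count in SI, the labels fail and loaded, and the disk reset between attempts are illustrative choices, and it assumes the BIOS preserves SI and DL across int 0x13.

```nasm
bits 16
org 0x7C00

  xor  ax, ax
  mov  ds, ax
  mov  ss, ax
  mov  sp, 0x7C00
  cld

  mov  ax, 0x07E0
  mov  es, ax
  xor  bx, bx           ; ES:BX = 0x07E0:0x0000, i.e. linear 0x7E00
  mov  si, 3            ; Hypothetical retry budget: 3 attempts
load:
  mov  dh, 0            ; Head 0, DL is left as BIOS passed it
  mov  cx, 0x0002       ; CH = cylinder 0, CL = sector 2
  mov  ax, 0x0201       ; AH = 0x02 (read sectors), AL = 1 sector
  int  0x13             ; -> AX CF
  jnc  loaded           ; Carry clear: the read succeeded
  xor  ah, ah           ; AH = 0x00: reset disk system before retrying
  int  0x13
  dec  si
  jnz  load             ; Try again while attempts remain
fail:
  hlt                   ; Out of attempts: stop instead of spinning forever
  jmp  fail
loaded:
  jmp  0x07E0:0x0000

times 510 - ($ - $$) db 0

dw 0xAA55
```

Halting on failure is only one option; printing an error message before the hlt loop would make the failure visible on screen.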

Sep Roland
  • Thank you very much for your answer. I apparently missed the org while copy-pasting (updated the question). It looks like GDB is confused then. However, once I corrected the print and moved the code to 0x7e00, it is not printing; I'm not sure if the jump is still wrong or if it is something else again. I updated the question. – Flyfe Mar 13 '23 at 22:31
  • @Flyfe In your code/data you load at 0x7e00 I recommend you use an ORG of 0x7e00 at the top of the assembly file (the one you aren't showing) and use jmp 0x0000:0x7e00 in your boot sector rather than jmp 0x07e0:0x0000. If the code for the other sectors has been added onto the end (after `dw 0xAA55`) of the assembly file you are showing us then you'll just need to change to jmp 0x0000:0x7e00. – Michael Petch Mar 13 '23 at 22:38
  • Then the reason for the weird address is apparently that GDB is confused. Also, what does the org do exactly? Is it setting up registers for me? – Flyfe Mar 13 '23 at 22:50
  • @flyfe: ORG tells nasm where in physical address space the code will begin at. Without it NASM has no idea what offsets to start things at (and it defaults ORG to 0x0000). Addresses in 16-bit real mode are made up of a segment:offset pair. Physical address from a segment:offset pair = (segment<<4)+offset. The offset should be equal to ORG and the segments you set in the assembly code should be equal to the segment. 0x0000:0x7e00=physical address 0x07e00 where you set an ORG to 0x7e00 and the segments to 0x0000. If you use 0x07e0:0x0000 the segment registers need to be 0x07e0 and org 0x0000. – Michael Petch Mar 14 '23 at 00:22
  • @flyfe: How you far jump to the destination determines what code segment and offset to use. So `jmp 0x07e0:0x0000` sets CS to 0x7e0 and starts executing at offset 0x0000 in that segment. You'd have to set your segment to 0x07e0 (like DS) to match once you jump and have an ORG of 0x0000. I recommend far jumping with `jmp 0x0000:0x7e00` which goes to the same location but CS is set to 0x0000 and the offset it starts at is 0x7e00. You'd set your DS (and other segments) to 0x0000 and use an ORG of 0x7e00. – Michael Petch Mar 14 '23 at 00:22
  • @flyfe: Getting a good understanding of segment:offset addressing is crucial to understand what is going on. This is a good guide: https://thestarman.pcministry.com/asm/debug/Segments.html – Michael Petch Mar 14 '23 at 00:22
  • @MichaelPetch I see! It makes sense to me now, thank you for the reference :) – Flyfe Mar 14 '23 at 22:02
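Michael Petch's recommendation in the comments above (ORG 0x7E00, segments set to 0x0000, entered with jmp 0x0000:0x7e00) could be sketched as a second-stage file like this. It is an illustration under stated assumptions rather than the asker's final code: it assumes the boot sector has already loaded this sector to linear address 0x7E00, and the labels start, hang and .done are made up for the example.

```nasm
; sample.asm (sketch): entered from the boot sector with  jmp 0x0000:0x7E00
bits 16
org 0x7E00              ; Offsets now match CS = 0 and DS = 0

start:
  xor  ax, ax
  mov  ds, ax           ; Set DS here rather than trusting the boot sector
  cld
  call print_startup
hang:
  hlt                   ; Nothing left to do: idle instead of running off the end
  jmp  hang

print_startup:
  mov  si, msg
  mov  bx, 0x0007       ; DisplayPage 0, GraphicsColor 7 (White)
.more:
  lodsb                 ; AL = [DS:SI], SI advances by one
  test al, al
  jz   .done            ; Zero terminator reached
  mov  ah, 0x0E         ; BIOS teletype output
  int  0x10
  jmp  .more
.done:
  ret

msg db "Hello world", 13, 10, 0

times 512 - ($ - $$) db 0   ; Pad to one full sector
```

The first instruction of the sector has to be real code, because the far jump lands at offset 0x7E00 exactly. That is why the sketch begins with start instead of the print_startup routine, whose ret would otherwise pop a return address that was never pushed. The equivalent pairing for jmp 0x07E0:0x0000 would be org 0 with DS loaded with 0x07E0.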