
I am writing my own bootloader. Under qemu-system-i386 the CPU resets intermittently (some runs work, others reset), but under bochs the problem never occurs. Why?

My code is in mbr.s and loader.s:

# cat mbr.s
%include "boot.inc"

SECTION MBR vstart=0x7c00         
   ...
   call rd_disk_m_16 ; rd_disk_m_16 is ok
  
   jmp LOADER_BASE_ADDR
   ...
   db 0x55,0xaa

# cat loader.s
   %include "boot.inc"
   section loader vstart=LOADER_BASE_ADDR
   LOADER_STACK_TOP equ LOADER_BASE_ADDR
   jmp loader_start
   ...
loader_start:

   cli
   lgdt [gdt_ptr]

   mov eax, cr0
   or eax, 0x00000001
   mov cr0, eax

   jmp  0x08:p_mode_start

[bits 32]
p_mode_start:
   jmp $

My steps:

# bximage -func=create -hd=16M -imgmode="flat" -sectsize=512 -q hd.img
# nasm -I include/ -o mbr.bin mbr.s && dd if=mbr.bin of=./hd.img bs=512 count=1  conv=notrunc
# nasm -I include/ -o loader.bin loader.s && dd if=loader.bin of=./hd.img bs=512 count=4 seek=2 conv=notrunc
# qemu-system-i386 -hda hd.img -d cpu_reset,int -no-reboot
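
One thing worth ruling out with these steps is that loader.bin has grown past the 4 sectors (2048 bytes) that the second `dd` command copies, since anything beyond that never reaches the image. The sketch below is only a sanity check, not a diagnosis. The helper names `sectors_needed` and `check_fits` are hypothetical, and the sector count that the MBR passes to `rd_disk_m_16` is not shown in the question, so it would need the same check by hand.

```shell
#!/bin/sh
# Sketch: check that a flat binary fits in the sectors that dd will copy.
# SECTOR_SIZE mirrors bs=512 from the dd commands above.
SECTOR_SIZE=512

# sectors_needed BYTES: print the number of sectors required, rounded up.
sectors_needed() {
    echo $(( ($1 + SECTOR_SIZE - 1) / SECTOR_SIZE ))
}

# check_fits FILE MAX_SECTORS: return 0 if FILE fits in MAX_SECTORS, else 1.
check_fits() {
    # The arithmetic expansion strips any padding that wc may print.
    size=$(( $(wc -c < "$1") ))
    need=$(sectors_needed "$size")
    if [ "$need" -le "$2" ]; then
        echo "$1: $size bytes, $need sector(s), fits in $2"
    else
        echo "$1: $size bytes, $need sector(s), exceeds $2" >&2
        return 1
    fi
}
```

After sourcing the file, `check_fits loader.bin 4` compares the assembled loader against the `count=4` used above.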
chenhao
  • When you attach GDB to the QEMU guest as a gdb remote, what do you see as you single-step? Also, doesn't QEMU log double triple faults or other reboot reasons? – Peter Cordes May 28 '23 at 16:59
  • Also your code is not public. – Jester May 28 '23 at 17:16
  • https://github.com/meilihao/learn_asm/tree/master/example/protect_mode – chenhao May 29 '23 at 01:54
  • Maybe the problem is `times 60 dq 0`; I couldn't find any information on how to check the GDT. – chenhao May 29 '23 at 02:02
  • Maybe bochs's firmware executes the bootloader with interrupts already disabled? Or maybe there is a difference in some other initial condition, e.g. `ss:sp`. – dimich May 29 '23 at 02:57
  • Your code doesn't set `DS` before using `lgdt [gdt_ptr]`. At least you haven't shown it in the code in your question. (I'm not interested in reading through your full code off-site; your question should be a [mcve] of your problem.) – Peter Cordes May 29 '23 at 03:20
  • I have initialized DS in the MBR. Stack Overflow limits how much code I can post. – chenhao May 29 '23 at 04:54
  • @PeterCordes I tried gdb, but the disassembled instructions are wrong and can't be read. – chenhao May 29 '23 at 11:12
  • GDB doesn't know about segmentation and modes, so it's harder to use for this than Bochs' built-in debugger. You might have to manually override what mode it's using to disassemble, if you mean "wrong" as in opposite operand-size and wrong lengths from decoding 16-bit code as if it was 32-bit. Or if the bytes aren't there in memory at all, then maybe your code is buggy and didn't actually load the extra sectors. – Peter Cordes May 29 '23 at 11:37
  • I managed to load it in GDB/QEMU (see https://stackoverflow.com/questions/32955887/how-to-disassemble-16-bit-x86-boot-sector-code-in-gdb-with-x-i-pc-it-gets-tr) and it looks like your disk read has some kind of issue: the JMP at the beginning of the base load address (0x900) seems to get read into memory, but the rest of the data from disk seems to get placed into memory in the wrong place. Where the code that enters protected mode ends up is random, and the FAR JMP fails when it jumps to an address that doesn't have the expected code. – Michael Petch May 30 '23 at 16:00
  • Since you aren't using the BIOS to read the disk and you wrote your own code to access the HD directly you might want to look at timing related issues. – Michael Petch May 30 '23 at 16:02

1 Answer


My first attempt at a fix is loader2.s:

  1. Delete `times 60 dq 0`.
  2. Add `cli` before `lgdt`.

With these changes the resets became much less frequent.

I then moved all variable definitions to the end of the file (loader_ok.s), and the problem disappeared completely. I found this by accident and it surprised me. I could not get gdb to work for debugging, so the specific cause is still unknown.
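
For anyone who wants to narrow the cause down further, a GDB session against QEMU can be set up roughly as follows. This is a minimal sketch, assuming the hd.img from the question and QEMU's default GDB stub port 1234; the function names `start_qemu_paused` and `attach_gdb` are hypothetical. On some GDB/QEMU combinations real-mode code is still decoded as 32-bit even after `set architecture i8086`; the question linked in the comments discusses workarounds.

```shell
#!/bin/sh
# Sketch: start QEMU halted with a GDB stub, then attach GDB in 16-bit mode.

# -S freezes the CPU at startup until the debugger continues it.
# -s is shorthand for -gdb tcp::1234.
start_qemu_paused() {
    qemu-system-i386 -hda "${1:-hd.img}" -s -S -d cpu_reset,int -no-reboot &
}

# Connect to the stub, ask for 16-bit disassembly, and stop at the
# MBR entry point (0x7c00) before listing the next instructions.
attach_gdb() {
    gdb -ex 'target remote localhost:1234' \
        -ex 'set architecture i8086' \
        -ex 'break *0x7c00' \
        -ex 'continue' \
        -ex 'x/10i $pc'
}
```

Run `start_qemu_paused` first and `attach_gdb` second. From the breakpoint, stepping with `stepi` up to the far jump and dumping the bytes at the loader's load address shows whether the sectors landed where the code expects them.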

chenhao