4

I'm developing a boot loader that boots into a simple kernel after switching to protected mode. I followed this paper as a tutorial and am somewhere around chapter four or five. In theory, the boot loader starts in 16-bit real mode, loads the kernel into memory, switches to 32-bit protected mode, and starts executing the kernel code.

However, when I switch into protected mode and far jump or jump to another segment, it triple faults. Here is the main boot sector code:

[org 0x7c00]

KERNEL_OFFSET equ 0x1000

mov [BOOT_DRIVE], dl    ;Get the current boot drive from the BIOS

mov bp, 0x9000          ;Set up stack, with enough room to grow downwards
mov sp, bp

mov bx, REAL_MODE_MSG
call print_string

call load_kernel

call switch_to_pm

jmp $                       ;Jump to current position and loop forever

%include "boot/util/print_string.asm"
%include "boot/util/disk.asm"
%include "boot/gdt/gdt.asm"
%include "boot/util/print_string_pm.asm"
%include "boot/switch_to_pm.asm"

[bits 16]
load_kernel:
    mov bx, LOAD_KERNEL_MSG ;Print a message saying we are loading the kernel
    call print_string
    mov bx, KERNEL_OFFSET       ;Set up disk_load routine parameters
    mov dh, 15
    mov dl, [BOOT_DRIVE]
    call disk_load              ;Call disk_load
    ret

[bits 32]
BEGIN_PM:
    mov ebx, PROT_MODE_MSG
    call print_string_pm
    call KERNEL_OFFSET

    jmp $

; Data
BOOT_DRIVE: db 0
REAL_MODE_MSG: db "Started in real mode.", 0
PROT_MODE_MSG: db "Successfully entered 32-bit protected mode.", 0
LOAD_KERNEL_MSG: db "Loading Kernel into memory", 0

; Bootsector padding
times 510-($-$$) db 0
dw 0xaa55

Here is the GDT:

;Global Descriptor Table
gdt_start:

gdt_null:   ; We need a null descriptor at the start (8 bytes)
    dd 0x0
    dd 0x0

gdt_code:   ; Code segment descriptor
    ; Base=0x0, Limit=0xfffff
    ; 1st flags : (present)1 (privilege)00 (descriptor type)1 -> 1001b
    ; type flags : (code)1 (conforming)0 (readable)1 (accessed)0 -> 1010b
    ; 2nd flags : (granularity)1 (32 - bit default)1 (64 - bit seg)0 (AVL)0 -> 1100b
    dw 0xffff       ; Limit (bits 0-15)
    dw 0x0          ; Base (0-15)
    dw 0x0          ; Base (16-23)
    db 10011010b    ; 1st flags and type flags
    db 11001111b    ; 2nd flags and Limit (16-19)
    db 0x0          ; Base (24-31)

gdt_data:   ; Data segment descriptor
    ;Same as CSD except for type flags
    ; (code)0 (expand down)0 (writable)1 (accessed)0 -> 0010b
    dw 0xffff       ; Limit (bits 0-15)
    dw 0x0          ; Base (0-15)
    dw 0x0          ; Base (16-23)
    db 10010010b    ; 1st flags and type flags
    db 11001111b    ; 2nd flags and Limit (16-19)
    db 0x0          ; Base (24-31)

gdt_end:


;GDT Descriptor
gdt_descriptor:
    dw gdt_end - gdt_start - 1
    dd gdt_start

;Some Constants
CODE_SEG equ gdt_code - gdt_start
DATA_SEG equ gdt_data - gdt_start

Here is the code for switching into protected mode, where it triple faults:

[bits 16]
switch_to_pm:
    cli
    lgdt [gdt_descriptor]   ; load the gdt
    mov eax, cr0            ; turn pm on
    or eax, 0x1
    mov cr0, eax
    jmp CODE_SEG:init_pm    ; THIS IS WHERE THE PROBLEM IS!

[bits 32]
init_pm:
    mov ax, DATA_SEG ; Point segment registers to the data
    mov ds, ax       ; selector defined in the gdt
    mov ss, ax
    mov es, ax
    mov fs, ax
    mov gs, ax
    mov ebp, 0x90000 ; Update our stack
    mov esp, ebp
    call BEGIN_PM ;Move on

When I place a `jmp $` instruction right before the `jmp CODE_SEG:init_pm` instruction, it idles there and does not triple fault. When I place it after that instruction, under the `init_pm` label, it does triple fault. So I am fairly sure that the far jump is the cause. I'm not sure why; maybe it's an issue with the GDT. I am new to operating system development and boot loaders. Any suggestions on how to solve this problem?

Razor
adotout1
  • 3
    Are you sure your GDT is correct? The thing that stands out on a cursory look is that each of your entries is 9 bytes (72 bits). A GDT entry is 8 bytes (64 bits). It appears that you meant `db 0x0 ; Base (16-23)` instead of `dw 0x0 ; Base (16-23)`. Note that the difference is `dw` changed to `db`. Wrong GDT entries would generate a triple fault. – Michael Petch Apr 12 '16 at 02:10
  • 1
    I'd also recommend looking at my [general bootloader tips](http://stackoverflow.com/questions/32701854/boot-loader-doesnt-jump-to-kernel-code/32705076#32705076). You make the assumption that the DS (data segment) register is zero upon entry (since you use `org 0x7c00`). You should set it to zero explicitly. You also set the stack in an odd way. You set SP to 0x9000 but you don't set _SS_, which means you don't really know where you are putting the stack in memory. You should set the _SS_ register followed by setting the _SP_ register. My bootloader tips offer an example. – Michael Petch Apr 12 '16 at 02:25
  • @Michael Petch: Turns out SS is guaranteed to be zero. – Joshua Apr 12 '16 at 03:32
  • 1
    I'm with RossRidge. I know hardware going back decades didn't set _SS_ to zero. Some of them put the stack segment just below the highest available memory region, some put it in the first 64k. A lot of people still believe that a _CS_ of zero is required by the BIOS when it jumps to _CS:IP_. In the '80s we ended up having boot disks fail to work unless you coded things to make no assumptions. The general rule of thumb now for bootloader development (and it has been an unwritten rule for decades) is to **assume nothing** except maybe the value in _DL_ (although a few rare BIOSes have _DL_ bugs). – Michael Petch Apr 12 '16 at 03:56
  • 1
    I can assure you @joshua, the SS is not always zero. If you want to take a look at hardware that does weird stuff regardless of standards, try writing a boot loader on Asus hardware. – David Hoelzer Apr 12 '16 at 10:36
  • @MichaelPetch Alright, I set DS to 0, tried setting SS to both 0x8000 and 0x9000 at the start of the bootloader, and fixed the length of the GDT entries. It's still not working. I honestly feel lost at this point. – adotout1 Apr 12 '16 at 13:20
  • I can only think of two possibilities here. The first is that your _GDT_ is still wrong (make sure you actually assembled the code again). The issue with `dw 0x0 ; Base (16-23)` is in **BOTH** GDT entries (the code and data entry). I hope you changed both places! The other possibility is that you are getting into protected mode and the triple fault is related to the line `call KERNEL_OFFSET`, which assumes the kernel was read properly off the disk and loaded into memory. What happens if you comment that line out (making sure the `JMP $` is right after)? Does your code still triple fault? – Michael Petch Apr 12 '16 at 14:15
  • Last night I looked over the code to figure out what might be wrong. Today I took your code and found and used copies of the other files from your tutorial (the ones not present in your question). I built it with my suggested changes (and proper GDT entries) and it switched into protected mode just fine. As a test, I also had to comment out `call KERNEL_OFFSET`, since I didn't have a kernel placed on my disk image. It worked on Bochs and QEMU: it printed `Successfully entered 32-bit protected mode.` – Michael Petch Apr 12 '16 at 14:26
  • @David Hoelzer: What is this a floppy bootloader? The MBR sets it to zero. – Joshua Apr 12 '16 at 15:36
  • 1
    The MBR does no such thing unless you write code to actually set it. He is writing the MBR and, as you can see, there is no effort to adjust the stack segment register in his code. Perhaps you're thinking of what is normally done in a boot loader... For instance, here: http://starman.vertcomp.com/asm/mbr/STDMBR.htm – David Hoelzer Apr 12 '16 at 15:38
  • @Joshua as David points out the OP is writing his own Master Boot record into Sector 1 / Head 0 / Cylinder 0 (LBA=0). He must rely on the state (or lack thereof) setup by the BIOS. If he wants to chainload something after, and rely on the state his own MBR set then that is well within his prerogative because he is the developer of both pieces of code. – Michael Petch Apr 12 '16 at 16:27
  • @Michael Petch: I have made sure that I set both of the GDT entries accordingly. I commented out the call to the kernel and even commented out the routine for loading the kernel into memory just to be sure, and it still triple faults. Can you send me your working code so I might be able to change mine to match yours? – adotout1 Apr 12 '16 at 20:26
  • 1
    This question didn't have an answer that could be arrived at with the information given. I worked with the OP offline and provided him a set of files based on the document he linked that did work. The issue was apparently caused by one of the support files that was included. The OP couldn't give me his original file that didn't work, but he did inform me that the files I supplied him worked properly. You can download the [archive of files](http://www.capp-sysware.com/misc/stackoverflow/36562268/adotout1.tgz) that did work from my site. – Michael Petch May 25 '16 at 14:57

2 Answers

2

The problem is with your `jmp CODE_SEG:init_pm`. In 16-bit mode it is a far jump with a 4-byte segment:offset pointer, so the offset is only 16 bits. You need a far jump with a 6-byte pointer, which carries a 32-bit offset. In fasm syntax that is:

jmp fword CODE_SEG:init_pm

This will add an operand size prefix 0x66 to the instruction and treat init_pm as 32-bit offset. Not sure how to achieve the same in nasm, but you get the idea.
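
For NASM, here is a minimal sketch of the same idea, assuming the `CODE_SEG` constant and the `init_pm` label from the question. The `dword` keyword on a far jump requests a 32-bit offset, so in 16-bit code NASM emits the 0x66 operand-size prefix followed by a 6-byte pointer. It is a fragment meant to replace the last lines of `switch_to_pm`, not a complete program.

```nasm
[bits 16]
switch_to_pm:
    cli
    lgdt [gdt_descriptor]       ; gdt_descriptor is defined in the question's gdt.asm
    mov eax, cr0
    or eax, 0x1                 ; Set the PE (protection enable) bit
    mov cr0, eax
    ; 'dword' forces a 32-bit offset: the encoding is 0x66 0xEA,
    ; then a 4-byte offset, then a 2-byte selector.
    jmp dword CODE_SEG:init_pm  ; CODE_SEG and init_pm come from the question
```

The 32-bit form only matters when the target offset does not fit in 16 bits. A boot sector assembled with `org 0x7c00` keeps all of its labels well below 64 KiB.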

Alexander Zhak
  • 2
    The OP's _JMP_ instruction will work since the offset is within the lower 64 KiB (it can be expressed as a 16-bit offset). It uses the selector CODE_SEG and zero-extends init_pm to a 32-bit address. In this case a 32-bit jmp is not necessary. – Michael Petch May 24 '16 at 07:31
  • Oh, but if it is loading a cs with zero base, where does it add (`realmode_cs << 4`) to the offset? – doug65536 May 29 '16 at 23:06
2

Michael Petch gave the correct answer to this question in the comments. Unfortunately, several people seem to have missed it, as three incorrect answers have now been posted, two of them making the same mistake. Here is his comment posted as an answer, in the hope that this makes it more visible:

Are you sure your GDT is correct? The thing that stands out on a cursory look is that each of your entries is 9 bytes (72 bits). A GDT entry is 8 bytes (64 bits). It appears that you meant `db 0x0 ; Base (16-23)` instead of `dw 0x0 ; Base (16-23)`. Note that the difference is `dw` changed to `db`. Wrong GDT entries would generate a triple fault.
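
As a sketch of what that change looks like, here is the GDT from the question with the middle base byte declared with `db` in both entries, so each descriptor is exactly 8 bytes. With the original `dw`, the access byte lands at offset 6 of the descriptor instead of offset 5, so the CPU reads 0x00 as the access byte and sees a descriptor that is not present. Labels and flag values are copied from the question; only the two `db 0x0 ; Base (16-23)` lines differ.

```nasm
gdt_start:

gdt_null:               ; Mandatory null descriptor (8 bytes)
    dd 0x0
    dd 0x0

gdt_code:               ; Code segment: base=0x0, limit=0xfffff
    dw 0xffff           ; Limit (bits 0-15)
    dw 0x0              ; Base (bits 0-15)
    db 0x0              ; Base (bits 16-23): db, not dw
    db 10011010b        ; 1st flags and type flags
    db 11001111b        ; 2nd flags and Limit (bits 16-19)
    db 0x0              ; Base (bits 24-31)

gdt_data:               ; Data segment: same as code except type flags
    dw 0xffff           ; Limit (bits 0-15)
    dw 0x0              ; Base (bits 0-15)
    db 0x0              ; Base (bits 16-23): db, not dw
    db 10010010b        ; 1st flags and type flags
    db 11001111b        ; 2nd flags and Limit (bits 16-19)
    db 0x0              ; Base (bits 24-31)

gdt_end:

gdt_descriptor:
    dw gdt_end - gdt_start - 1  ; GDT size minus one (24 - 1 = 23)
    dd gdt_start                ; Address of the GDT

CODE_SEG equ gdt_code - gdt_start   ; 0x08
DATA_SEG equ gdt_data - gdt_start   ; 0x10 (0x09 with the 9-byte entries)
```

With 8-byte entries, `DATA_SEG` also becomes a valid selector: 0x10 selects index 2 with RPL 0, whereas 0x09 would select index 1 with RPL 1.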

Michael Petch also made a good followup comment that pointed out other problems with the bootloader:

I'd also recommend looking at my general bootloader tips. You make the assumption that the DS (data segment) register is zero upon entry (since you use `org 0x7c00`). You should set it to zero explicitly. You also set the stack in an odd way. You set SP to 0x9000 but you don't set SS, which means you don't really know where you are putting the stack in memory. You should set the SS register followed by setting the SP register. My bootloader tips offer an example.
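
A minimal sketch of that advice, written as a stand-alone boot sector so it assembles on its own. This is not the code from the linked tips. The choice of 0x0000:0x9000 for the stack simply reuses the value from the question, and zeroing ES assumes the disk routine reads to ES:BX, which the question does not show.

```nasm
[org 0x7c00]
[bits 16]

    xor ax, ax              ; AX = 0
    mov ds, ax              ; DS = 0, matching the org 0x7c00 assumption
    mov es, ax              ; ES = 0 (assumption: disk reads target ES:BX)
    mov ss, ax              ; SS = 0; set SS immediately before SP
    mov sp, 0x9000          ; Stack top at 0x0000:0x9000, growing downwards

    mov [BOOT_DRIVE], dl    ; Safe now that DS is known

    jmp $                   ; Placeholder for the rest of the boot code

BOOT_DRIVE: db 0

; Boot sector padding and signature
times 510-($-$$) db 0
dw 0xaa55
```

Loading SS and then SP in consecutive instructions matters because the CPU holds off interrupts for one instruction after a write to SS, so no interrupt can run on a half-updated stack pointer.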

Community
Ross Ridge
  • Sadly, none of those worked on their own. Michael gave me some code which I used and it worked. I still don't know exactly what the problem was besides those which he pointed out. – adotout1 Jun 01 '16 at 15:44