Some background:

I'm working on a basic bootloader that reads a secondary bootloader into memory with the BIOS INT 13h AH=02h interrupt. I've got it working in the emulators (VirtualBox, QEMU, and Bochs).

Subsequently, I added a BPB (BIOS parameter block) to my bootloader, made a bootable USB drive, and tested it on my real machine with USB floppy emulation (which I set up in the BIOS configuration screen of that machine). It worked like a charm.

After testing the bootloader on my own machine, I tested it on another, newer machine. This new computer did not have a floppy emulation option in its BIOS configuration and therefore could not boot from the USB drive. So, following this OSDev wiki page, I added a partition table at the end of the MBR so that the newer machine could boot from the USB drive.

The Problem:

With the partition table added, the BIOS INT 13h read fails and the secondary bootloader is never loaded into memory. I have no clue why this might happen, as I've not changed any of the actual bootloader code. I just added the 64-byte MBR partition table, and reading data into memory fails instantly.

BPB (BIOS parameter block) & Disk accessing routine

bits    16 
org 0x7C00

jmp start
nop
;------------------------------------------;
;  Standard BIOS Parameter Block, "BPB".   ;
;------------------------------------------;
     bpbOEM         db  'MSDOS5.0'
     bpbSectSize    dw  512
     bpbClustSize   db  1
     bpbReservedSe  dw  1
     bpbFats        db  2
     bpbRootSize    dw  224
     bpbTotalSect   dw  2880
     bpbMedia       db  240
     bpbFatSize     dw  9
     bpbTrackSect   dw  18
     bpbHeads       dw  2
     bpbHiddenSect  dd  0
     bpbLargeSect   dd  0
     ;---------------------------------;
     ;  extended BPB for FAT12/FAT16   ;
     ;---------------------------------;
     bpbDriveNo     db  0
     bpbReserved    db  0
     bpbSignature   db  41            
     bpbID          dd  1
     bpbVolumeLabel db  'BOOT FLOPPY'
     bpbFileSystem  db  'FAT12   '

drive_n: db 0
start: 
    mov [drive_n], dl

    ; setup segments
    xor ax, ax
    mov ds, ax
    mov es, ax

    ; setup stack
    cli
    mov ss, ax
    mov sp, 0x7C00   ; stack will grow downward to lower addresses
    sti

    ; write start string
    mov si, start_str    ; start_str = pointer to "Bootloader Found..."
    call write_str       ; routine that prints string in si register to screen 

    ; read bootstrapper into memory
    mov dl, [drive_n]; drive number
    mov dh, 0x00    ; head (base = 0)
    mov ch, 0x00    ; track /cylinder = 0
    mov cl, 0x02    ; sector (1 = bootloader, 2 = start of bootstrapper)
    mov bx, 0x7E00  ; location to load bootstrapper 
    mov si, 0x04    ; number of attempts

    ; attempt read 4 times 
  read_floppy:
    ; reset floppy disk
    xor ax, ax
    int 0x13

    ; check if attempts to read remain, if not, hlt system (jmp to fail_read)
    test    si, si
    je  fail_read   ; *** This jump happens only on real machines with 
    dec si          ; USB hard drive emulation ***

    ; attempt read 
    mov ah, 0x02    ; select read
    mov al, 0x0F    ; num sectors
    int     0x13
    jc  read_floppy

    ...             ; continue onward happily! (without any errors)

MBR Partition Table

; 0x1b4
db "12345678", 0x0, 0x0     ; 10 byte unique id

; 0x1be         ; Partition 1 -- create one big partition that spans the whole disk (2880 sectors, 1.44 MB)
db 0x80         ; boot indicator flag = on

; start sector
db 0            ; starting head = 0
db 0b00000001   ; cylinder = 0, sector = 1 (bits 7-6 = cylinder high bits, bits 5-0 = sector: 00 000001)
db 0            ; bits 7-0 of cylinder (10 bits in total)

; filesystem type
db 1            ; filesystem type = fat12

; end sector = 2880th sector (because a floppy disk is 1.44 MB)
db 1            ; ending head = 1
db 18           ; cylinder = 79, sector = 18 (bits 7-6 = cylinder high bits, bits 5-0 = sector: 00 010010)
db 79           ; bits 7-0 of cylinder (10 bits in total)

dd 0            ; 32 bit value of number of sectors between MBR and partition
dd 2880         ; 32 bit value of total number of sectors

; 0x1ce         ; Partition 2
times 16 db 0

; 0x1de         ; Partition 3
times 16 db 0

; 0x1ee         ; Partition 4
times 16 db 0

; 0x1fe         ; Signature
dw  0xAA55

The Question

What causes a failure in reading the disk if and only if USB hard disk drive emulation is enabled in the BIOS? I've tried changing up the partition table and the BPB but nothing seems to work. I bet it has something to do with the difference in how the computer handles floppy vs. hard drive information but it's hard to find any info on that.

Any help would be greatly appreciated. I didn't intend for this question to be so long; it just accumulated.

Michael Petch
travisjayday
  • I removed my original comment, believing it has nothing to do with the scenario here, after I read your code. I do notice that your partition table covers the entire disk, including the MBR (sector 1). Was that by design? When you boot the system with USB HDD, are you sure it actually started executing the code in the MBR? – Michael Petch Nov 25 '17 at 03:52
  • Yes I'm sure. I made it such that it writes the string "Bootloader found..." right after it starts executing in the MBR--and indeed, it prints the string. Yeah, I know what you mean about the partition starting in the MBR. I decided to do that because the osdev page that I linked suggested that "The MBR must have a partition table with an active partition with the boot loader starting in the active partition (in case the firmware doesn't support "floppy emulation")". So that's why I chose the partition to start in the MBR. I'm not sure if that's good though – travisjayday Nov 25 '17 at 03:56
  • As an experiment what happens if you attempt to read 1 sector. Instead of `mov al, 0x0F ; num sectors` try `mov al, 0x01` . I'm curious if that works – Michael Petch Nov 25 '17 at 04:28
  • Changing to `mov al, 0x01` has the same effect. The read error happens regardless :-/ – travisjayday Nov 25 '17 at 06:18
  • You don't show all your code but where do you place the stack and how do you initialize the _ES_ register? The read error occurring on a single sector read suggests one of the parameters is wrong. – Michael Petch Nov 25 '17 at 06:21
  • In fact I'd be curious about all your startup code. I hope you don't copy _CS_ to _DS_ and _ES_ for example. I'd like to see the code that was omitted before `; omitted setting up the stack, and segments ` – Michael Petch Nov 25 '17 at 07:04
  • Do you have a partition boot record in place? – David Hoelzer Nov 25 '17 at 11:24
  • I've added the code that sets up the segments and stack. @DavidHoelzer Yes, I believe so. That's the section labeled "MBR Partition Table" – travisjayday Nov 25 '17 at 16:57
  • I'm very curious. Where do you save the initial value of _DL_ into the variable `drive_n`? I'm almost wondering if you have accidentally hard coded the boot device number. With the way the code is structured, I would have expected that after setting up the segment registers you would have an instruction like `mov [drive_n], dl` to take the boot drive number and place it into `drive_n`. Using the wrong boot drive number could cause these issues. A USB HDD will generally have a boot drive number of 0x80 or higher, and a floppy one between 0x0 and 0x7f. Maybe this has been your problem all along. – Michael Petch Nov 25 '17 at 17:24
  • I'm sorry about that. I accidentally copy/pasted an older version of the code. In the latest version, I indeed have `mov [drive_n], dl` at the beginning (and it still doesn't work). Let me update the code in this question right now... – travisjayday Nov 25 '17 at 17:33
  • I'd like you to move `mov [drive_n], dl` AFTER you set up the segment registers, not before. It isn't guaranteed the BIOS transferred control to your bootloader with a _DS_ of 0. It makes a difference especially if _DS_ has something other than 0 in it already. `mov [drive_n], dl` is the equivalent to `mov [ds:drive_n], dl` so the value of the _DS_ register matters. – Michael Petch Nov 25 '17 at 17:39
  • You were right! it was the `drive_n` after all. Such a silly thing. I see how moving dl before setting up the segments is not safe because ds could be anything. The reason it worked in Floppy mode is that drive_n is set to 0 with `db 0` and the floppy drive number is also 0. So that was just luck. Well anyway, moving `mov [drive_n], dl` AFTER setting up segments works as expected. Thanks for your help. – travisjayday Nov 25 '17 at 17:50
  • @travisjayday The MBR and the PBR are different things. You may find that your hardware, for a USB to boot, must have a PBR. – David Hoelzer Nov 25 '17 at 17:58
  • @DavidHoelzer : That's not his issue. In his case his MBR and PBR are actually the same thing since he makes his partition span the entire disk including the MBR in the first sector (This is a trick to be most compatible with a multitude of BIOS boot methods). As he mentioned previously his code actually does boot but his code prints a failure when the disk read is done. – Michael Petch Nov 25 '17 at 18:00
  • @MichaelPetch Sounds good, please do. – travisjayday Nov 25 '17 at 18:07

1 Answer

TL;DR : In some situations the boot drive is not properly stored at label drive_n. This causes the disk read routine to fail on some hardware.


I have a Stack Overflow answer with a general set of bootloader tips. An important tip is this:

When the BIOS jumps to your code you can't rely on CS,DS,ES,SS,SP registers having valid or expected values. They should be set up appropriately when your bootloader starts. You can only be guaranteed that your bootloader will be loaded and run from physical address 0x00007c00 and that the boot drive number is loaded into the DL register.

Once your question was updated with the code that runs before the disk read, the issue became evident:

drive_n: db 0
start: 
    mov [drive_n], dl

    ; setup segments
    xor ax, ax
    mov ds, ax
    mov es, ax

    ; setup stack
    cli
    mov ss, ax
    mov sp, 0x7C00   ; stack will grow downward to lower addresses
    sti

The problem is that `mov [drive_n], dl` is executed before the segment registers are set up. `mov [drive_n], dl` is equivalent to `mov [ds:drive_n], dl`, so the segment in DS matters. If the BIOS transfers control to your bootloader with a DS segment that isn't 0x0000, then `mov [drive_n], dl` writes the drive number to a memory location you don't expect.

If DS was not zero on entry and the boot drive was something other than 0x00, there was a good chance of failure. When the real boot drive number was stored to the wrong memory location, the read routine later loaded the initial value stored at the `drive_n` label instead. In your case that was 0x00, so the read was attempted on drive 0x00 rather than on the USB hard disk (generally 0x80 or higher).
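As a worked illustration of the paragraph above, the constants below compute where the byte would land for one hypothetical entry state. Both input values are assumptions: `ENTRY_DS` is an invented example of a nonzero DS (the question does not say what DS actually was), and `DRIVE_N_OFF` assumes the layout shown in the question with `jmp start` assembled as a 2-byte short jump.

```nasm
; Worked example with ASSUMED values; this file defines constants only.
ENTRY_DS     equ 0x07C0    ; ASSUMPTION: DS as left by a hypothetical BIOS
DRIVE_N_OFF  equ 0x7C3E    ; ASSUMPTION: org 0x7C00 + 3 bytes (jmp/nop) + 59 bytes (BPB)

; Real-mode physical address = segment * 16 + offset
WRONG_PHYS   equ (ENTRY_DS << 4) + DRIVE_N_OFF   ; 0x7C00 + 0x7C3E = 0xF83E
RIGHT_PHYS   equ (0x0000   << 4) + DRIVE_N_OFF   ; 0x0000 + 0x7C3E = 0x7C3E
```

Under these assumptions the store would hit physical address 0xF83E, while the later `mov dl, [drive_n]` (executed with DS = 0) reads from 0x7C3E, which still holds the initial `db 0`.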

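In floppy emulation you got lucky: the boot drive number was 0x00, which happens to match the initial value at `drive_n`, so the read used the right drive even though the store had gone astray. The resolution to this problem is simple. Ensure you write the value of DL to memory after you set up the segment registers (most notably DS). The code should look like: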

drive_n: db 0
start: 
    ; setup segments
    xor ax, ax
    mov ds, ax
    mov es, ax

    ; setup stack
    cli
    mov ss, ax
    mov sp, 0x7C00   ; stack will grow downward to lower addresses
    sti

    mov [drive_n], dl   ; DS is 0 here, so this stores to 0x0000:drive_n
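
For context, here is a minimal sketch of how the corrected store fits together with the disk read. It is not your full bootloader: the BPB, the partition table, and the string printing are omitted, it reads a single sector instead of 15, and the local labels, the halt loop, and the final jump to 0x0000:0x7E00 are assumptions made only to keep the example self-contained. It should assemble with `nasm -f bin`.

```nasm
bits 16
org 0x7C00

start:
    ; set up segments first; DL (boot drive) is not touched here
    xor ax, ax
    mov ds, ax
    mov es, ax

    ; set up stack
    cli
    mov ss, ax
    mov sp, 0x7C00
    sti

    mov [drive_n], dl      ; safe now: DS = 0

    mov si, 4              ; number of attempts
.retry:
    xor ax, ax             ; AH = 00h: reset disk system
    mov dl, [drive_n]
    int 0x13

    mov ax, 0x0201         ; AH = 02h read sectors, AL = 1 sector
    mov cx, 0x0002         ; CH = cylinder 0, CL = sector 2
    mov dh, 0              ; head 0
    mov dl, [drive_n]      ; drive number saved at entry
    mov bx, 0x7E00         ; ES:BX = 0x0000:0x7E00
    int 0x13
    jnc .ok                ; carry clear = success

    dec si
    jnz .retry
.fail:
    cli                    ; ASSUMPTION: simply halt on failure
    hlt
    jmp .fail
.ok:
    jmp 0x0000:0x7E00      ; ASSUMPTION: second stage starts at 0x7E00

drive_n: db 0

times 510-($-$$) db 0
dw 0xAA55
```

Reloading DL from `drive_n` before each INT 13h call keeps the routine correct even if an earlier call or helper routine has changed DL.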
Michael Petch