
I have the following x86 program:

mov ah, 0x0e          ; Set up call to BIOS routine to print character

mov al, [character]   ; Stick the byte at label "character"
int 0x10              ; Display character in al

jmp $                 ; Loop forever

character:
db 0x41               ; Put the byte "A" at this position

times 510-($-$$) db 0 ; Pad with zeros and end with the magic number for a bootloader
db 0x55
db 0xaa

I'm running it in two different ways:

  • In qemu
  • Writing it to a USB stick with dd and booting it on an old 64 bit laptop

I use the following commands to run this code:

$ nasm -f bin -o boot.bin main.s
$ qemu-system-x86_64 boot.bin  # to test
$ dd if=boot.bin of=/dev/sda # to put it on a USB stick

The code as written above doesn't work in either case. On the hardware it displays a blinking cursor, and in qemu it prints a Cyrillic letter rather than "A". So I changed the second (non-empty) line to

mov al, [0x7c00 + character]

adding the offset 0x7c00 to the label, since according to some sources x86 puts your bootloader at 0x7c00 in memory. This works as expected in qemu, but continues to give me a blinking cursor on the hardware. Note that this has the same effect as putting [org 0x7c00] at the top, by which I mean the binaries produced by using the above line or by adding an org directive are identical (I compared their md5s).

To make sure my hardware doesn't have some weird character set where 0x41 isn't an "A", I tried

mov al, 0x41

and that works on both qemu and the hardware.

How can I properly reference the data stored at "character" so that my laptop will find the value that's supposed to be there? Note that because this is a bootloader the CPU is (if I understand correctly) in 16-bit real mode.

Jack M
  • Presumably because you left out `org` and setting `ds`, so your assembler doesn't know the right offsets for absolute memory operands, and there's no guaranteed correct offset. – Peter Cordes Jul 25 '20 at 11:41
  • Please post a [mcve] instead of just snippets. Other users need to be able to assemble and run your code in order to help you. – fuz Jul 25 '20 at 11:42
  • @fuz That's not a snippet, that's the complete code. Or should I post the shell commands I'm using to compile and run it? – Jack M Jul 25 '20 at 12:23
  • @PeterCordes So how would an actual bootloader go about doing this? Or would real code just not use this trick of storing constants in the source code? [The PDF](https://www.cs.bham.ac.uk/~exr/lectures/opsys/10_11/lectures/os-dev.pdf) I was following does this kind of thing and seemed to imply that the code being loaded at `0x7c00` was some kind of x86 standard. – Jack M Jul 25 '20 at 12:31
  • @JackM If that's your full code, you are missing the `org` directive, causing your memory operands to go to the wrong addresses. Posting the commands used to assemble and run the code can't hurt, either. – fuz Jul 25 '20 at 12:55
  • @fuz I edited in my commands. I tried the `org` directive before and it had the same results as my attempt with the explicit addition of `0x7c00`. In fact, as I just edited into the question, using `org` produces a byte-per-byte identical binary to my explicit addition with no `org`. – Jack M Jul 25 '20 at 13:10
  • Ensure you set the DS register to 0x0000 if you use `org 0x7c00`. Each memory reference that doesn't involve `BP` register defaults to DS:. If you don't set DS you may not reference the right memory. See my [bootloader tips](https://stackoverflow.com/a/32705076/3857942). This can be a real problem on real hardware. If using USB using Floppy Drive Emulation (USB FDD) then you may need a [BIOS Parameter Block](https://stackoverflow.com/a/47320115/3857942). – Michael Petch Jul 25 '20 at 13:26
  • @JackM Use an `org` directive anyway instead of manually adding offsets. That said, yes, you might need to set up segment registers in your boot loader. – fuz Jul 25 '20 at 13:37
  • @MichaelPetch Thank you very much! Zeroing out DS worked (no need for a parameter block on my hardware it seems). I actually tried that earlier after noticing the `ds:` prefix in the disassembled binary, but NASM refused to assemble the line `mov ds, 0`, so I figured I must just be misunderstanding something. Thanks to some code posted in one of your links I realized you have to do `mov ax, 0` and `mov ds, ax`. You can post that as an answer if you like - if you want to pad it out a little, I'd also love an explanation as to why you can't just write a constant to DS. – Jack M Jul 25 '20 at 13:38
  • As for not loading a constant to a segment register directly, it is simply a restriction of the architecture. There is no MOV from an immediate value to a segment register: https://www.felixcloutier.com/x86/mov – Michael Petch Jul 25 '20 at 14:08
  • Aren't you supposed to put the text display page in `bh`? And the data segment appears to be the same as the code segment, so `mov ax, cs` and `mov ds, ax`. – Weather Vane Jul 25 '20 at 15:37
  • @WeatherVane I'm aware of the code page issue but it seems to work fine without touching BX, at least on my machine. Not sure what you mean about the CS register. Is CS -> DS more portable than 0 -> DS like I'm doing? – Jack M Jul 25 '20 at 15:49
  • Of course it is. What you did only 'works' if the `cs` register happens to be `0` and there is no guarantee of that otherwise there would be no point in having segment registers. And just because sometimes `bh` *happens* to be `0` does not mean you don't have to set it. With those kind of presumptions your code will be very fragile. – Weather Vane Jul 25 '20 at 15:52
  • CS isn't guaranteed to be a particular value (although most common values are 0x07c0 and 0x0000). You shouldn't copy it to DS. Set the segment registers to the value you need and don't rely on them being a particular value. – Michael Petch Jul 25 '20 at 15:53

2 Answers


The x86 features several segment registers containing memory offsets. In real mode (and other modes?), these registers are implicitly added to any memory reference you make. Which segment register is used depends on the context (in other words what kind of instruction the address is used in). In our case, when we try to get data from memory with

mov al, [character]

the processor will implicitly add the contents of the ds (for "data segment") register, multiplied by 16, to the memory offset character. Note that this happens at run time, not at assembly time, so the addition itself does not show up in your binary if you disassemble it.
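To make that arithmetic concrete, here is a sketch that is not part of my original fix. It assumes the BIOS loaded the sector at physical address 0x7C00, leaves the label offsets relative to the start of the file (no ORG, no manual addition), and puts the base into ds instead. The `mov bh, 0` line is an extra I added here to select display page 0.

```nasm
; Illustrative variant: DS supplies the 0x7C00 base.
; physical address = DS * 16 + offset
;                  = 0x07C0 * 16 + character
;                  = 0x7C00 + character

mov ax, 0x07c0        ; Segment whose base is 0x07C0 * 16 = 0x7C00
mov ds, ax            ; Segment registers must be loaded via a register

mov ah, 0x0e          ; BIOS teletype output
mov bh, 0             ; Display page 0
mov al, [character]   ; Reads DS:character, i.e. physical 0x7C00 + character
int 0x10

jmp $                 ; Loop forever

character:
db 0x41               ; "A"

times 510-($-$$) db 0 ; Pad to 510 bytes, then the boot signature
db 0x55
db 0xaa
```

With ds = 0x07C0 and an offset that also had 0x7c00 added, the same formula would give 0x7C00 + 0x7C00 + character, which is the wrong byte. The segment value and the offsets assumed by the assembler have to agree.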

The solution is to zero out ds at the top of the assembly program. However, note that you can't just write mov ds, 0, because x86 has no MOV form that loads an immediate value into a segment register. You have to go via a general-purpose register, as in

mov ax, 0
mov ds, ax

For completeness, this is the full updated code which works on both my laptop and QEMU. Differences from the code in the question are commented below.

mov ax, 0  ; Zero out the data segment register
mov ds, ax ;

mov ah, 0x0e

mov al, [0x7c00 + character] ; Add 0x7c00 to the offset
                             ; As mentioned in the question, putting ORG 0x7C00 at the top of the file
                             ; also works (and is better, but this is clearer for demonstration purposes)
                             ; and in fact produces an identical binary to this explicit addition.
int 0x10

jmp $

character:
db 0x41

times 510-($-$$) db 0
db 0x55
db 0xaa

Evidently the ds register happened to be zero when QEMU's BIOS jumped to my code, but not on my hardware. A robust bootloader explicitly sets every segment register it relies on rather than assuming the BIOS left the registers in any particular state before handing over control.
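As a rough sketch of what that more defensive setup could look like, the variant below uses ORG and initializes ds, es, and the stack before doing anything else. The stack location (growing down from 0x0000:0x7C00) is a common choice I'm assuming here, not something my laptop required, and setting bh follows the suggestion in the comments.

```nasm
org 0x7c00            ; Offsets assume the sector sits at 0x0000:0x7C00

xor ax, ax            ; AX = 0
mov ds, ax            ; DS = 0, so DS * 16 + offset equals the ORG-based offset
mov es, ax            ; ES = 0 as well, in case later code uses it
mov ss, ax            ; Stack segment 0 ...
mov sp, 0x7c00        ; ... with the stack growing down from below the sector

mov ah, 0x0e          ; BIOS teletype output
mov bh, 0             ; Display page 0
mov al, [character]   ; ORG makes this offset 0x7c00 + character
int 0x10

jmp $                 ; Loop forever

character:
db 0x41               ; "A"

times 510-($-$$) db 0 ; Pad to 510 bytes, then the boot signature
db 0x55
db 0xaa
```

It assembles the same way as before (nasm -f bin -o boot.bin main.s). Nothing in it depends on the value the BIOS left in cs, because the only jump is relative.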

If you've been reading "Writing a Simple Operating System - from Scratch" by Nick Blundell like me, he actually talks about this stuff a little later on in section 3.6.1 ("Extended memory access using segments"). Unfortunately I got stuck on this several pages before that and didn't read ahead.

Sep Roland
Jack M
  • Very well explained +1. I would have loved to see you add the `BH` DisplayPage parameter for the BIOS.Teletype call. – Sep Roland Jul 26 '20 at 21:08
  • @SepRoland True, probably best to set that (to zero, I suppose). Setting it on my machine didn't actually seem to have any effect (whether to zero or to some other value). – Jack M Jul 26 '20 at 22:03
  • "The x86 features several *segment registers* containing memory offsets. In real mode (and other modes?)," In Real/Virtual 86 Mode the segment registers hold values which are used to compute the segment bases directly (as per the original name, Real Address Mode). Otherwise, the segment registers hold selectors which are indices into the GDT or LDT, which hold descriptors. Loading a selector value into a segment register causes the base, limit, type, etc to be set from the descriptor as selected by the selector. – ecm Aug 26 '20 at 20:18

You may be missing some parameters and the ORG directive.

Try this:

org 0x7c00            ; Tell NASM that this program is loaded at address 0x7c00

xor ax, ax            ; ORG alone is not enough: DS must be 0 to match it,
mov ds, ax            ; and DS can only be loaded via another register

mov ah, 0x0e          ; Set up call to BIOS routine to print character
mov al, [character]   ; Load the byte at label "character"
mov bh, 0             ; Display page 0
mov bl, 0x0f          ; Foreground color white (only used in graphics modes)
int 0x10              ; Display character in al

jmp $                 ; Loop forever

character:
db 0x41               ; Put the byte "A" at this position

times 510-($-$$) db 0 ; Pad with zeros and end with the magic number for a bootloader
db 0x55
db 0xaa
AlanCui
  • If you don't use the `org` directive (org meaning origin), NASM will assume the program begins at address 0, and the address of `character` will be wrong. Tip: 0x7c00 is the standard load address, so do not change it. – AlanCui Jul 26 '20 at 06:41