I have a simple program which moves some null-terminated strings to the bx register:

[org 0x7c00]    ; Tells the assembler where the code will be loaded
    mov ah, 0x0e

    mov bx, HELLO_MSG   ; Moves string inside of HELLO_MSG to bx
    call print_string   ; Calls the print_string function inside of ./print_string.asm

    mov bx, GOODBYE_MSG ; Moves string inside of GOODBYE_MSG to bx
    call print_string

    jmp $   ; Hangs

%include "print_string.asm"

HELLO_MSG:
    db 'Hello, World!', 0

GOODBYE_MSG:
    db 'Goodbye!', 0

    times 510-($-$$) db 0
    dw 0xaa55

and a print_string function inside of ./print_string.asm:

print_string:
    mov ah, 0x0e
    int 0x10
    ret

The print_string function doesn't work. From my understanding, ah has the value 0x0e stored, so if al holds a value X and int 0x10 is run, it tells the BIOS to display the value of al on the screen. How would I replicate this for strings?

Michael Petch
ahsan-a
  • Example: https://hg.ulukai.org/ecm/ldosboot/file/eb6941a617b1/iniload.asm#l383 (Takes a string pointed to by `ds:si`, can replace `lodsb` by `mov al, [bx]` \ `inc bx` to use `ds:bx` though.) – ecm Nov 26 '20 at 11:58
  • Your `print_string` function overwrites `ax` with the contents of `bx`, overwriting what you set up `ah` with. Is this really what you want? – fuz Nov 26 '20 at 12:50
  • You can find a `print_string` implementation in this answer: https://stackoverflow.com/a/59431113/3857942 – Michael Petch Nov 26 '20 at 13:34

3 Answers

print_string:
  mov ah, 0x0e
  int 0x10
  ret

Your print_string routine uses the BIOS.Teletype function 0Eh. This function will display the single character held in the AL register. Since this BIOS function additionally expects you to supply the desired DisplayPage in BH and the desired GraphicsColor in BL (only for when the display is in a graphics video mode), it's perhaps not the best idea to use the BX register as an argument to this print_string routine.
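As a minimal sketch of that call for a single character (assuming the usual text video mode at boot, display page 0, and color 7; the character 'X' is only an example value):

```nasm
mov  ah, 0Eh      ; BIOS.Teletype function number
mov  al, 'X'      ; the character to display (example value)
mov  bx, 0007h    ; BH=0 display page, BL=7 color (the color matters only in graphics modes)
int  10h          ; BIOS video services: prints AL and advances the cursor
```

Loading BX with these inputs is what makes it awkward to also use BX as the string pointer.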

Your new routine will have to loop over the string and use the single character output function for every character contained in the string. Because your strings are zero-terminated, you stop looping as soon as you encounter that zero byte.

[org 7C00h]

cld                  ; This makes sure that below LODSB works fine
mov  si, HELLO_MSG
call print_string
mov  si, GOODBYE_MSG
call print_string

jmp  $

print_string:
    push bx          ; Preserve BX if you need to!
    mov  bx, 0007h   ; DisplayPage BH=0, GraphicsColor BL=7 (White)
    jmp  .fetch
  .print:
    mov  ah, 0Eh     ; BIOS.Teletype
    int  10h
  .fetch:
    lodsb            ; Reads 1 character and also advances the pointer
    test al, al      ; Test if this is the terminating zero
    jnz  .print      ; It's not, so go print the character
    pop  bx          ; Restore BX
    ret
Sep Roland

I am almost certain you are asking this question in the context of reading this document by Nick Blundell on how to write your own OS from scratch, and trying to figure out question 4 on page 21.

Sep Roland's answer is great, but it's more advanced than what the author was trying to teach with the exercise. At this point the text hasn't covered graphics modes, or the lodsb and test instructions. Blundell is going for something simpler that draws on the tools you've been taught so far in the reading: cmp, jmp, add, labels, and the various conditional jumps (je, jne, etc.).

My solution looks like this, which works well:

[org 0x7c00]     ; tell NASM what address this will be loaded at
; execution starts here
mov ax, 0
mov ds, ax       ; make segmentation agree with NASM about data addresses
; your code can start here, after the magic boilerplate

mov bx, HELLO_MSG
call print_string

mov bx, GOODBYE_MSG
call print_string

jmp $               ; infinite loop because there's nothing to exit to

print_string:
    pusha           ; preserve our general purpose registers on the stack
    mov ah, 0x0e    ; teletype function
.work:
    mov al, [bx]    ; move the value pointed at by bx to al
    cmp al, 0       ; check for null termination
    je .done        ; jump to finish if null
    int 0x10        ; fire our interrupt: int 10h / AH=0E
    add bx, 1       ; increment bx pointer by one
    jmp .work       ; loop back
.done:
    popa            ; pop our preserved register values back from stack
    ret

;; Data placed where execution won't fall into it
HELLO_MSG:
    db `Hello World!\n\r`, 0   ; use backticks to allow C style escape

GOODBYE_MSG:
    db 'Goodbye', 0

;; More boilerplate to make this a bootable MBR
times 510-($-$$) db 0    ; pad out to 510 bytes
dw 0xaa55                ; 2-byte signature so BIOS can recognize this as a bootable MBR

Again, this answer can obviously be done better; I am just answering in the way the author most likely intended you to learn here.

Peter Cordes
riptusk331
  • You forgot to set DS=0 (`xor ax,ax` / `mov ds,ax`) to match your `ORG 0x7c00` directive. A BIOS that runs the MBR with a different DS base will print garbage instead of your strings. See [Michael Petch's bootloader tips](https://stackoverflow.com/questions/32701854/boot-loader-doesnt-jump-to-kernel-code/32705076#32705076). – Peter Cordes Mar 11 '22 at 16:27
  • `[dx]` isn't a valid 16-bit addressing mode. `si` or `di` are valid choices that would let you set BL/BH as inputs for that `int 0x10` BIOS call without needing to `xchg bx, dx` twice per iteration if you were keeping a pointer in DX. – Peter Cordes Mar 11 '22 at 16:28
  • Terminology nitpick: backticks don't *escape* newlines, they tell NASM to *process* C-style escape codes. So a \n escape-sequence can *become* a newline. "Escaping a newline" would be an appropriate description if quotes started on one source line and ended on another, so there was an actual newline inside the quotes. So I'd comment that `; use backticks to allow C-style escape codes`. (In case any other future readers are wondering, `'\n'` in NASM (single or double quotes) is not a newline in the first place, it's just a literal backslash and n as separate ASCII codes.) – Peter Cordes Mar 11 '22 at 16:31
  • good point on the backticks/dx - i updated the answer. that said, i still believe my answer to be more in line with what the original author intended than the others. if you read the text, the techniques you are referring to (segmentation, index registers, xchg, lodsb, minutiae of the org 0x7c00 directive) had not been introduced when this question is asked. so i'm not sure how a student answering it would be expected to know those. – riptusk331 Mar 11 '22 at 16:46
  • A bootloader that fails on a significant number of real-world BIOSes is not a good example. `xor ax,ax` / `mov ds, ax` needs to be there, with a comment like "necessary boilerplate for static storage to be guaranteed to work / magic to be explained later". Just like in a Hello World in C before explaining what all the pieces are. Your code has an `org 0x7c00` and you aren't explaining that; those 2 instructions are just another necessary piece of the puzzle. I 100% get your point about avoiding tricky instructions and would like to upvote this answer if it wasn't broken. – Peter Cordes Mar 11 '22 at 16:53
  • Your answer already had a significant amount of boilerplate that's necessary for it to be a bootable MBR; I added some comments on those, as well as adding the necessary setup of DS to portably access static data. – Peter Cordes Mar 11 '22 at 17:07
  • fair enough. thanks for your input. i was going to update the answer but i see you did. i don't disagree with you - it's just not easy to know what to include/exclude in a pedagogical context. is someone reading this just trying to figure out how jumps work in this simple exercise, or actually creating an MBR that runs on every possible system? [Lying to children](https://en.wikipedia.org/wiki/Lie-to-children), as they say, is sometimes the easiest way to teach. that said, you're advocating for good practice. – riptusk331 Mar 11 '22 at 17:16
  • i would also say take up issue with the original author! as they teach putting the org directive in without the additional instructions. and it's the top document that comes up on any google search for teaching yourself assembly. seems like it's an unfinished draft. – riptusk331 Mar 11 '22 at 17:17
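
To illustrate the quoting point from the comments above, here is a small sketch of how NASM assembles the different string spellings. The label names are hypothetical and used only for this illustration. It also pairs the line feed with a carriage return, since the teletype call treats them separately: LF (10) moves the cursor down a row and CR (13) returns it to column 0.

```nasm
; Hypothetical labels; each line ends with an explicit zero terminator.
MSG_PLAIN:   db 'Hi\n', 0          ; 5 bytes: 'H', 'i', '\', 'n', 0  (quotes: no escape processing)
MSG_ESCAPED: db `Hi\r\n`, 0        ; 5 bytes: 'H', 'i', 13, 10, 0    (backticks: C-style escapes processed)
MSG_NUMERIC: db 'Hi', 13, 10, 0    ; same bytes as MSG_ESCAPED, written without backticks
```

Printing MSG_PLAIN with a teletype loop would show a literal backslash and n on screen instead of starting a new line.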

This question is quite old, but I'm working through the same document you are, so I thought I'd share the solution I found. The writer (Nick Blundell of the School of Computer Science, University of Birmingham, UK) wanted you to use the bx register to hold the memory address of the beginning of the string. After you print one character, you increment bx, and so on until you reach the zero byte. Here's my solution.

[org 0x7c00]

mov bx, HELLO_MSG
call print_string_mem

mov bx, GOODBYE_MSG
call print_string_mem

jmp $                   ; Hang
    
%include "print_string.asm"

; Data
HELLO_MSG:
    db 'Hello, World!', 0
    
GOODBYE_MSG:
    db 'Goodbye!', 0

times 510-($-$$) db 0
dw 0xaa55

and in print_string.asm:

print_string_mem:
    jmp test_mem
    
    test_mem:
        mov al, [bx]
        cmp al, 0
        je end_mem
        jmp print_mem
        
    print_mem:
        mov ah, 0x0e
        int 0x10
        add bx, 1
        jmp test_mem
        
    end_mem:
        ret

    ret

The answer above mine works, and is better suited to real development, but the author wanted you to use the tools you learned in the previous pages to formulate your answer. I racked my brain over how the hell I was supposed to get this answer until I read the part of the document where he mentions setting bx to the memory address of the message. Hope this helps someone who is in the same place I was.

MikeySasse
  • That's the same loop, just using a different register and written less efficiently, without putting the conditional branch at the bottom [where it belongs](https://stackoverflow.com/questions/47783926/why-are-loops-always-compiled-into-do-while-style-tail-jump). (It even includes two useless `jmp`-next-instruction jumps, where execution would fall through anyway: `jmp print_mem` and the first `jmp test_mem`.) – Peter Cordes May 23 '21 at 21:07
  • (You're also assuming that the BIOS has the screen in text mode, so BL and BH aren't inputs to int 0x10/AH=0x0E. That's normally fine, especially for toy programs / bootloaders.) – Peter Cordes May 23 '21 at 21:07
  • If you just wanted to follow the intended calling convention and use the pointer in BX, you only have to change `lodsb` into `mov al, [bx]` / `inc bx`, and remove the other code that messes with BX in Sep's answer. Thanks for adding context to what the question was about, though; that part is useful. – Peter Cordes May 23 '21 at 21:10
  • Thanks for the info, I'm still learning assembly so I wasn't aware of the fall-through, and I haven't read up on any conventions or looked at anything but what's in the doc itself. Thanks for the clarification though, I'll be sure to look into these things. – MikeySasse May 23 '21 at 21:25
  • Labels are just markers in the assembly, not real gaps in the machine code. The CPU always just runs the next instruction at the address after the one it previously executed. It can't even see whether there was a label or not in the asm source; it's just a way to give a symbolic name to an address. And BTW, goto labels work the same way in C, perl, and most other languages with labels and gotos. – Peter Cordes May 23 '21 at 21:36
  • Thanks, that looks very informative. It's hard finding good material these days. I appreciate the help. – MikeySasse May 27 '21 at 19:02
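
Pulling the comments above together, one possible sketch of a routine that keeps the exercise's calling convention (pointer passed in BX) while putting the conditional branch at the bottom of the loop and leaving BH/BL free for the BIOS. It is not taken from any of the answers as written; it assumes DS has already been set to match the ORG directive and a CPU that supports PUSHA/POPA (80186 or later).

```nasm
; print_string: print the zero-terminated string at DS:BX (sketch).
print_string:
    pusha                ; preserve the caller's registers, including BX
    cld                  ; clear the direction flag so LODSB advances SI upward
    mov  si, bx          ; move the pointer to SI so BX is free for the BIOS inputs
    mov  bx, 0007h       ; BH=0 display page, BL=7 color (color used in graphics modes only)
    jmp  .fetch          ; enter the loop at the test, so an empty string prints nothing
.print:
    mov  ah, 0Eh         ; BIOS.Teletype
    int  10h             ; print the character in AL
.fetch:                  ; no jump needed: execution falls through from .print
    lodsb                ; AL = byte at DS:SI, then SI = SI + 1
    test al, al          ; is this the terminating zero?
    jnz  .print          ; not yet, so print it and fetch the next byte
    popa                 ; restore the caller's registers
    ret
```

A caller would use it exactly as in the question: `mov bx, HELLO_MSG` followed by `call print_string`. The only jump taken on every iteration is the `jnz` at the bottom; the `.fetch` label itself costs nothing, since it only names an address.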