
Before someone marks this question as a duplicate, let me make clear that this is a debugging problem rather than a logic problem. As far as I can tell the logic is correct: if I print the value in the bx register after each operation, I get the correct output. The problem is that storing a result through bx should change the memory location bx points to, and that does not seem to happen.


I have been learning assembly language with NASM. I am following a PDF document which asks you to print a hexadecimal number (convert the number to a hex string and then print it).

I've written the code but it doesn't seem to print the correct hex number. On the other hand if I just print the variable FINAL_ST in the following code snippet without calling INIT (which is the start of the conversion of hex number to hex string), it works fine and prints 0x0000.

I've searched multiple times but to no avail.

I found out that gdb can be used to debug nasm programs but I could not understand how to use it when the output is a .bin file.

And I also tried constructing a Control Flow Graph for this code to understand execution flow but could not find an appropriate tool for it. :(

Code:

```nasm
[org 0x7c00]

mov ax, 0x19d4
mov bx, FINAL_ST + 5

; jmp PRINTER ; works :/
jmp INIT

NUM:
    add dx, 0x0030
    mov [bx], dx
    jmp CONT

ALPHA:
    add dx, 0x0037
    mov [bx], dx
    jmp CONT

CONT:
    dec bx
    shr ax, 4
    cmp ax, 0x0000
    jne INIT
    je PRINTER

INIT:
    mov dx, 0x000f
    and dx, ax
    cmp dx, 0x000a
    jl NUM
    jge ALPHA

;STRING PRINTER
PRINTER:
    mov bx, FINAL_ST
    mov ah, 0x0e
    jmp PRINT ; this doesn't work

PRINT:
    mov al, [bx]
    int 0x10
    inc bx
    cmp byte[bx], 0x00
    jne PRINT

FINAL_ST:
    db "0x0000", 0x00

END:

times 510 - ($ - $$) db 0
dw 0xaa55
```

Commands used:

```
nasm boot_hex1.asm -f bin -o boot_hex1.bin
qemu-system-x86_64 boot_hex1.bin
```

I get the output as 0x1 while the expected output is 0x19D4.

Michael Petch
vishal-wadhwa
  • Any particular reason why you are learning x86 assembly basics on a bootloader binary? (It would make more sense to me to first learn basic x86 assembly in 32b linux (you can build+run+debug elf32 binaries in 64b linux too), then learn about 16b specialities, limits and bootloaders.) And you need a debugger for qemu. Here is a Q about that, maybe it will help: https://stackoverflow.com/q/14242958/4271923 ... about your task: you are converting a binary value, not a hexadecimal one. `mov ax, 0x19d4` will load `ax` with the value `6612`, encoded in binary into the 16 bits of register `ax`. – Ped7g Dec 01 '17 at 09:12
  • Everything "hexadecimal" about that value is only your formatting in the source code; after it is assembled into machine code, that information is lost and irrelevant. The CPU operates on bits, which are two voltage levels, usually interpreted as 0 or 1 from the programmer's point of view. And `ax` has 16 of those "bits". There's nothing about format, just 16x zero or one. – Ped7g Dec 01 '17 at 09:15
  • @Ped7g . No, there's no specific reason for learning basics on bootloader. Actually I just googled OS development and started following [this](https://www.cs.bham.ac.uk/~exr/lectures/opsys/10_11/lectures/os-dev.pdf). I get your point that it is basically binary representation that we are converting to string(stored as hex representation). I guess it is a mistake on my part. What edits would you like me to make to the question? – vishal-wadhwa Dec 01 '17 at 09:21
  • And I tried executing the commands from the question you linked. It just opened another window titled `QEMU(Stopped)`. – vishal-wadhwa Dec 01 '17 at 09:23
  • I have some quick and dirty code that displays bytes and words in HEX from within a bootloader using NASM. It was part of some test code in this [Stackoverflow answer](https://stackoverflow.com/a/47320115/3857942) under the section _Test Code to See if Your BIOS is Overwriting the BPB_ . There is a function `print_byte_hex` and `print_byte_word` that you might be able to draw inspiration from. It was designed to print out the address and bytes of the bootloader itself. – Michael Petch Dec 01 '17 at 09:26
  • For bootloaders - BOCHS is a very good tool for debugging. – Michael Petch Dec 01 '17 at 09:31
  • @MichaelPetch. Thank you. I'll try to understand that code snippet. And since I'm a beginner right now, after a little searching I realized that BOCHS required some initial setup of config file, so I decided to go with `qemu`. Thanks for the suggestion tho. :) – vishal-wadhwa Dec 01 '17 at 09:36
  • QEMU is much harder to use for real mode debugging. – Michael Petch Dec 01 '17 at 09:37
  • I have this SO Answer that discusses some of the issues of GDB/QEMU bootloader debugging: https://stackoverflow.com/a/32960272/3857942 – Michael Petch Dec 01 '17 at 09:39
  • You can launch BOCHS without a configuration file to boot a bin file from floppy drive A. An example that boots from the file `boot.bin` could look like `bochs -q 'boot:a' 'floppya: 1_44=boot.bin, status=inserted'` – Michael Petch Dec 01 '17 at 09:44
  • Guess I'll continue with BOCHS then, to debug my program. Thank you :) – vishal-wadhwa Dec 01 '17 at 10:15
  • Possible duplicate of [Printing Hexadecimal Digits with Assembly](https://stackoverflow.com/questions/3853730/printing-hexadecimal-digits-with-assembly) – David Hoelzer Dec 01 '17 at 13:05
  • @DavidHoelzer. I've seen that post like over a 100 times but it does not address what I am looking for so I've made some edits to the question. – vishal-wadhwa Dec 01 '17 at 14:06
  • Related: [How to convert a binary integer number to a hex string?](https://stackoverflow.com/q/53823756) if you want a working version. – Peter Cordes Nov 14 '21 at 21:52

2 Answers

Your issue is on the two lines that look like this:

```nasm
mov [bx], dx
```

This moves the 16-bit value in DX to the address specified in BX. Since x86 is little endian this has the effect of moving DL to [BX] and DH to [BX+1] on each iteration of your loop. Since DH is always zero in your code this has the effect of NUL terminating the string after each character is written to the FINAL_ST buffer.
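To make the overwrite concrete, here is a hand trace of the FINAL_ST buffer for the value 0x19d4 from the question. The offsets and the pass-by-pass buffer states are my own annotation, written as comments around the two store forms; treat it as an illustration rather than debugger output.

```nasm
bits 16

; FINAL_ST initially holds: '0' 'x' '0' '0' '0' '0' 0x00   (offsets +0 .. +6)
; BX starts at FINAL_ST + 5 and is decremented after every store.

mov [bx], dx    ; 16-bit store: DL -> [bx], DH (always 0 here) -> [bx+1]

; pass 1, nibble 0x4: '4' -> +5, 0x00 -> +6   string reads "0x0004"
; pass 2, nibble 0xD: 'D' -> +4, 0x00 -> +5   string reads "0x00D"  ('4' is gone)
; pass 3, nibble 0x9: '9' -> +3, 0x00 -> +4   string reads "0x09"
; pass 4, nibble 0x1: '1' -> +2, 0x00 -> +3   string reads "0x1"

mov [bx], dl    ; 8-bit store: only [bx] changes, the byte at [bx+1] survives
```

The final state of the trace matches the `0x1` that the question reports.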

What you actually want is to update only the byte that BX points to, using the byte in DL. Change both lines to:

```nasm
mov [bx], dl
```

I have a Stackoverflow answer with bootloader tips. Tip #1 is:

> When the BIOS jumps to your code you can't rely on CS,DS,ES,SS,SP registers having valid or expected values. They should be set up appropriately when your bootloader starts. You can only be guaranteed that your bootloader will be loaded and run from physical address 0x00007c00 and that the boot drive number is loaded into the DL register.

At a minimum you should set DS to zero since you are using an ORG (origin point) of 0x7c00. You can't assume the BIOS will set DS to zero before transferring control to your bootloader. It works in QEMU since its BIOS happens to have the value 0x0000 in DS already. Not all hardware and emulators will guarantee this.
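Putting the byte-sized store and the DS setup together, a corrected boot sector might look like the sketch below. Several things in it are my own choices and not part of the original code: the label names `CONVERT`, `.digit` and `HANG`, the stack setup, using SI instead of BX as the print pointer (so BH can hold display page 0 for the teletype call), and the halt loop that keeps the CPU from running into the string data after printing.

```nasm
[org 0x7c00]
bits 16

    xor ax, ax
    mov ds, ax              ; DS = 0 so [bx] and [si] match the ORG 0x7c00 addresses
    mov ss, ax              ; assumption: put the stack just below the boot sector
    mov sp, 0x7c00

    mov ax, 0x19d4          ; value to print
    mov bx, FINAL_ST + 5    ; address of the last digit in "0x0000"

CONVERT:
    mov dx, ax
    and dx, 0x000f          ; isolate the lowest nibble (0..15)
    cmp dl, 10
    jb  .digit              ; 0..9 only need the '0' offset
    add dl, 'A' - 10 - '0'  ; 10..15: extra offset so the result lands on 'A'..'F'
.digit:
    add dl, '0'
    mov [bx], dl            ; byte store: the neighbouring character is untouched
    dec bx
    shr ax, 4               ; next nibble; SHR sets ZF when AX becomes 0
    jnz CONVERT

    mov si, FINAL_ST
    xor bx, bx              ; BH = display page 0 for INT 0x10, AH = 0x0e
PRINT:
    mov al, [si]
    test al, al             ; stop at the NUL terminator
    jz  HANG
    mov ah, 0x0e            ; BIOS teletype output
    int 0x10
    inc si
    jmp PRINT

HANG:
    hlt
    jmp HANG

FINAL_ST:
    db "0x0000", 0x00

times 510 - ($ - $$) db 0
dw 0xaa55
```

Assembled and run with the same `nasm` and `qemu-system-x86_64` commands as in the question, this should print 0x19D4. Values with fewer than four significant hex digits keep the leading zeros already present in the buffer.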

Michael Petch
  • Also, this statement `mov [bx], dx`, does it mean that contents of `dh` (msb half of `dx`) were being moved to `[bx]`, that is why a blank(one of the control characters actually) character was being printed? – vishal-wadhwa Dec 01 '17 at 19:49
  • @vishal-wadhwa Yes, that's correct. Since x86 processors are little-endian it was writing the byte in _DL_ to [BX] and the byte in _DH_ into [BX+1] on each loop. _DH_ was always zero so it was like writing the character in _DL_ followed by a NUL terminating character. This is why your number appears truncated in your version. – Michael Petch Dec 01 '17 at 19:51

Here is a proc that has a working solution if someone needs it...

```asm
; Use: convert a hex value into a string
; Input: hex value(+10), string pointer(+12)
; Output: None
HEX_BINARY_LEN equ 4
ALPHA_MIN equ 000ah
ALPHA_ASCII equ 55
DECIMAL_ASCII equ 48
HEX_VALUE equ 10
STRING_PTR equ 12
;----------------------------------------------------------------
proc hexToString
    push bp
    push bx
    push ax
    push dx
    mov bp, sp

    mov bx, [bp + STRING_PTR]
    add bx, 3 ; because we start from the end
    mov ax, [bp + HEX_VALUE]

digitLoop:
    mov dx, 000fh
    and dx, ax
    cmp dx, ALPHA_MIN
    ;----------------------------
    jge alphaDigit
    add dx, DECIMAL_ASCII
    mov [bx], dl
    jmp wasDecimalDigit
    ;--------------------
alphaDigit:
    add dx, ALPHA_ASCII
    mov [bx], dl

wasDecimalDigit:
    ;----------------------------

    dec bx
    shr ax, HEX_BINARY_LEN
    cmp ax, 0000h
    jne digitLoop

    mov sp, bp
    pop dx
    pop ax
    pop bx
    pop bp
    retn 6
endp hexToString
;----------------------------------------------------------------
```
  • [How to convert a binary integer number to a hex string?](https://stackoverflow.com/q/53823756) has links to a couple 16-bit versions, and has a couple 32-bit scalar versions. – Peter Cordes Nov 14 '21 at 21:56
  • The standard frame-pointer setup for BP is `push bp` / `mov bp,sp` *before* any more pushes, so stack args are at a fixed offset from BP regardless of what else you do with SP in the prologue. Your first comment on how to call this says (+10) and (+12) but those offsets aren't relative to the return address, they're only meaningful with this specific BP setup. And BTW, in most calling conventions it's normal to let AX and DX be clobbered by functions, instead of spending extra instructions saving/restoring them. Also, you `retn 6`, but your function only takes 4 bytes of args. – Peter Cordes Nov 14 '21 at 21:59
  • Your constants could be defined in more meaningful ways, like `ALPHA_ASCII equ 'A' - 10` for example, and `'0'` for decimal. Naming is hard, but for `HEX_BINARY_LEN` I would have called it `BITS_PER_DIGIT` or something. And instead of `and dx, 0Fh`, use `and dx, (1 << BITS_PER_DIGIT) - 1`, otherwise it's pointless to have this factored out as a named constant instead of hard-coded. – Peter Cordes Nov 14 '21 at 22:04
  • You don't need `mov [bx], dl` twice; put that instruction in the part at the end of the loop that runs, like right before `dec bx`. Also, instead of a whole instruction for `add bx, 3`, you could use `mov [bx+3], dl`. Also, `shr` sets FLAGS, so it's redundant to `cmp` with zero right after. – Peter Cordes Nov 14 '21 at 22:10
  • Also, if you swap your usage of AX and DX, you can save machine-code size: `add al, '0'` is only a 2-byte instruction, vs. 3 for `add dl, '0'` or `add dx, '0'`. The unconditional jmp could also be removed if you change the constants to `ALPHA_ASCII - DECIMAL_ASCII` or similar, so 10..15 go through two `add` instructions with no taken branch, but 0..9 have one taken branch and one add. (Or to keep your same cmp/jge, reverse it so it's two adds with the first one using a negative value, vs. a taken branch and one add ALPHA_ASCII.) – Peter Cordes Nov 14 '21 at 22:16
  • Thank you for all the comments I will update my code :) – reem_mikulsky Nov 14 '21 at 22:32
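
For readers who want to see the suggestions from these comments in one place, here is a rough sketch of how the routine could look with them applied. It is not the answerer's updated code. It is written in NASM syntax instead of the `proc`/`endp` syntax used above, the name `DIGIT_MASK` and the local labels are hypothetical, and it assumes the caller pushes the buffer address first and the value second, with the buffer already holding four characters such as "0000".

```nasm
bits 16

BITS_PER_DIGIT equ 4
DIGIT_MASK     equ (1 << BITS_PER_DIGIT) - 1   ; 0x0f
HEX_VALUE      equ 4        ; [bp+4]: 16-bit value to convert
STRING_PTR     equ 6        ; [bp+6]: address of a 4-character buffer

; Assumed calling sequence (hypothetical caller):
;     push word buffer
;     push word 0x19d4
;     call hexToString      ; buffer then holds "19D4"
; Clobbers AX and DX. The routine pops its own 4 bytes of arguments.
hexToString:
    push bp
    mov  bp, sp             ; frame pointer set before any other push
    push bx

    mov  bx, [bp + STRING_PTR]
    mov  dx, [bp + HEX_VALUE]

.digitLoop:
    mov  al, dl
    and  al, DIGIT_MASK     ; lowest nibble, 0..15
    add  al, '0'            ; correct for 0..9
    cmp  al, '9'
    jbe  .store
    add  al, 'A' - 10 - '0' ; second add moves 10..15 onto 'A'..'F'
.store:
    mov  [bx + 3], al       ; single store, written from the end of the buffer
    dec  bx
    shr  dx, BITS_PER_DIGIT ; SHR sets ZF, so no separate cmp is needed
    jnz  .digitLoop

    pop  bx
    pop  bp
    ret  4                  ; two word-sized arguments
```

Like the original, the loop stops as soon as the remaining value is zero, so higher digit positions keep whatever the caller put in the buffer.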