I am following a tutorial to write a hello world bootloader in assembly, using the NASM assembler for an x86 machine. This is the code I am using:

[BITS 16]   ;Tell the assembler that this is 16-bit code
[ORG 0x7C00]    ;Origin: tell the assembler where the code will
            ;be in memory after it has been loaded

MOV SI, HelloString ;Store string pointer to SI
CALL PrintString    ;Call print string procedure
JMP $       ;Infinite loop, hang it here.


PrintCharacter: ;Procedure to print character on screen
;Assume that ASCII value is in register AL
MOV AH, 0x0E    ;Tell BIOS that we need to print one character on screen.
MOV BH, 0x00    ;Page no.
MOV BL, 0x07    ;Text attribute 0x07 is lightgrey font on black background

INT 0x10    ;Call video interrupt
RET     ;Return to calling procedure



PrintString:    ;Procedure to print string on screen
;Assume that string starting pointer is in register SI

next_character: ;Label to fetch next character from string
MOV AL, [SI]    ;Get a byte from string and store in AL register
INC SI      ;Increment SI pointer
OR AL, AL   ;Check if value in AL is zero (end of string)
JZ exit_function ;If end then return
CALL PrintCharacter ;Else print the character which is in AL register
JMP next_character  ;Fetch next character from string
exit_function:  ;End label
RET     ;Return from procedure


;Data
HelloString db 'Hello World', 0 ;HelloWorld string ending with 0

TIMES 510 - ($ - $$) db 0   ;Fill the rest of sector with 0
DW 0xAA55           ;Add boot signature at the end of bootloader

I have some difficulty understanding how the complete 'Hello World' string can be placed into one byte using the db directive. As I understand it, db stands for "define byte" and places the given byte directly in the executable, but surely 'Hello World' is larger than a byte. What am I missing here?

Neeraj
  • When a string appears in a `db`, it is automatically broken up into individual characters, which are stored in successive bytes. `HelloString db 'Hello World', 0` is treated as `HelloString db 'H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd', 0` – Michael Petch Jan 14 '17 at 07:06
  • Related, in more detail for `dw` and `dd` for strings: [How are dw and dd different from db directives for strings?](https://stackoverflow.com/q/38860174) – Peter Cordes Feb 04 '21 at 02:04

1 Answer

The pseudo-instructions `db`, `dw`, `dd` and friends can define multiple items:

db 34h             ;Define byte 34h
db 34h, 12h        ;Define bytes 34h and 12h (i.e. word 1234h)
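
As a small sketch of the "word 1234h" remark (the labels are made up for illustration): x86 is little-endian, so a word is stored low byte first, and both lines below assemble to the same two bytes.

```nasm
; Hypothetical labels, for illustration only
two_bytes: db 34h, 12h     ; emits 34h, then 12h
one_word:  dw 1234h        ; also emits 34h, then 12h (low byte first)
```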

They accept character constants too:

db 'H', 'e', 'l', 'l', 'o', 0

but this syntax is awkward for strings, so the next logical step was to give explicit support:

db "Hello", 0         ;Equivalent of the above
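
One way to convince yourself, sketched here with made-up label and constant names: `$` is the current assembly position, so subtracting a label from it right after the data gives the number of bytes just emitted. Both definitions below should come out at 6 bytes.

```nasm
; Illustrative names, not taken from the question's code
chars:     db 'H', 'e', 'l', 'l', 'o', 0
chars_len  equ $ - chars       ; 6 bytes: five characters plus the terminator

string:    db "Hello", 0
string_len equ $ - string      ; also 6 bytes
```

Assembling this with `nasm -f bin` and looking at the output in a hex viewer should show the same byte sequence, 48 65 6C 6C 6F 00, twice in a row.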

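P.S. In general, prefer the user-level directives (written without square brackets, e.g. `BITS 16`) over the bracketed primitive forms, though for [BITS] and [ORG] the difference is irrelevant.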

Margaret Bloom
  • `message: db 'hello, world!', 10` It's unclear to me how a single variable `message` can be used to store so many independent bytes of the string 'hello world' (making `message` like an array in a high-level language). – Thang Nguyen Aug 21 '21 at 14:24
  • @ThangNguyen Variables are just names for addresses (or, before the program is loaded into memory, for offsets/location counters). So in `message: db "hello, world!", 10`, `message` is a name for the address of the first `db` element (`h` in this case). All other elements are laid out sequentially after `h` and, of course, they also take up space. So the effect is as if `message` were an array of chars/bytes. But really, there are no types in assembly, and any variable can be treated (or not treated) as an array. – Margaret Bloom Aug 21 '21 at 17:26
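
A short sketch of that idea in 16-bit NASM, assuming DS already points at the segment that holds the data (the register choices are arbitrary, and this is a fragment rather than a complete program):

```nasm
BITS 16

mov al, [message]        ; AL = 'h', the byte at the address named by message
mov al, [message + 1]    ; AL = 'e', the next byte in memory
mov bx, 7
mov al, [message + bx]   ; AL = 'w': "indexing" is just address arithmetic

; Data placed after the code so it is never executed as instructions
message: db "hello, world!", 10
```

Nothing marks `message` as an array; the label is only the address of the first byte, and the offsets do the rest.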