For example, I have the following code (MikeOS).

jmp short bootloader_start  ; Jump past disk description section
nop                         ; Pad out before disk description
...
...

OEMLabel            db "MIKEBOOT"   ; Disk label
BytesPerSector      dw 512          ; Bytes per sector
SectorsPerCluster   db 1            ; Sectors per cluster
ReservedForBoot     dw 1            ; Reserved sectors for boot record
NumberOfFats        db 2            ; Number of copies of the FAT

bootloader_start:
    mov ax, 07C0h           ; Set up 4K of stack space above buffer
    add ax, 544             ; 8k buffer = 512 paragraphs + 32 paragraphs (loader)   
    ...
    ...
....

Now, I know that jmp short bootloader_start means that it jumps past the OEMLabel... section and jumps to the label.

Since I am new to assembly, I have a couple of questions:

  • Does assembly allocate memory the moment you write the instructions? For example, in the last couple of lines, the code goes like:

      times 510-($-$$) db 0 ; Pad remainder of boot sector with zeros
      dw 0AA55h             ; Boot signature (DO NOT CHANGE!)
    
    buffer:                 ; Disk buffer begins (8k after this, stack starts)
    

    Does buffer: allocate memory?

  • In this code block:

    cli             ; Disable interrupts while changing stack
    mov ss, ax
    mov sp, 4096
    sti             ; Restore interrupts
    

    Why do we clear the interrupt flag here? If I am not wrong, this bit of code allocates 4096 bytes of stack.

    Finally, after the above block, we have this:

    mov ax, 07C0h           ; Set data segment to where we're loaded
    mov ds, ax
    

    Why do we do this? Is it done to tell the data segment where its origin is?

fuz
weirdpanda
  • `db` and so on put data in the data segment of an executable. http://stackoverflow.com/questions/31941830/how-are-arrays-initialised-to-zero-in-c-by-the-compiler/31942189#31942189. For stand-alone code like a bootable OS kernel, the sections just act to group data bytes separately from code bytes. Everything just ends up loaded into memory. I'm not sure how `bss` reserved space is handled for a bootable OS, as opposed to a Linux ELF binary (where the OS ELF loader code maps as much zeroed memory as the BSS says to). – Peter Cordes Aug 28 '15 at 21:59

2 Answers

6
times 510-($-$$) db 0

This is a NASM-specific construct. It fills the remaining free space up to offset 510 with zeroes, starting at the current offset in the binary (memory). The label itself does not create any bytes. Apart from actual CPU instructions, the only things that create/allocate bytes are data directives such as DB, DW, DD, DQ, etc. The times line is not a CPU instruction but a kind of macro interpreted by the assembler program.

Edit (What are labels?):

A label just represents an offset (an address in memory or in the binary file). Take the following as an example:

MyFirstLabel:
    db 1, 2, 3, 4
MySecondLabel:
    db 5, 6, 7, 8
Start:

If this is your assembler file and it is loaded into memory at offset 0, it will look like the following:

OFS   DATA
0000h: 01 02 03 04
0004h: 05 06 07 08

Notice that MyFirstLabel is only the offset where the data is stored, in this case offset 0. MySecondLabel is just another offset, but it starts after the previously emitted data, in this case at offset 4. The label Start represents offset 8 in the file, so the "address" of this label is 0008h (relative to the data/code segment).

So in your case, if you fill the remaining space up to 510 bytes with zeroes (this is what your times 510-($-$$) db 0 line does) and then emit one additional data word (DW 0AA55h), the offset of your buffer label is exactly 512 (0200h), which is the size of one boot sector (such as the master boot record).
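The offset arithmetic above can be checked with a small stand-alone file. This is only a sketch, assuming NASM with flat binary output; the start and hang labels, the buffer_offset constant, and the file names in the comment are made up for illustration and are not part of MikeOS.

```nasm
; Sketch only. Assemble with:  nasm -f bin boot.asm -o boot.bin
; (boot.asm and boot.bin are placeholder file names)
    bits 16

start:                          ; offset 0: a label, emits no bytes
    cli
hang:
    hlt                         ; real instructions do emit bytes
    jmp hang

    times 510-($-$$) db 0       ; emit zero bytes until the current offset is 510
    dw 0AA55h                   ; emit 2 bytes at offsets 510 and 511

buffer:                         ; emits nothing, just names offset 512 (0200h)

buffer_offset equ buffer - $$   ; assemble-time constant equal to 512
```

The resulting file is exactly 512 bytes long: nothing after dw 0AA55h emits a byte, so buffer names the first address past the end of the sector image.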

The cli instruction tells the processor to ignore maskable interrupts until sti is executed. This matters because the stack segment register (ss) and the stack pointer register (sp) are both changed, and two separate instructions are in general not guaranteed to run without an interrupt arriving in between. If an interrupt occurred after one of these registers had been changed but not the other, the stack would be undefined/invalid. As hinted in the comments on this post, however, cli/sti isn't actually needed for changing the stack segment and stack pointer. See the Intel documentation for the MOV instruction:

Loading the SS register with a MOV instruction inhibits all interrupts until after the execution of the next instruction. This operation allows a stack pointer to be loaded into the ESP register with the next instruction (MOV ESP, stack-pointer value) before an interrupt occurs.

So the following is correct without cli/sti:

mov bx, 4096
mov ss, ax 
mov sp, bx

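And the following is not correct, because the extra instruction between mov ss and mov sp leaves a window in which an interrupt can arrive with ss already changed but sp not yet set: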

mov ss, ax 
mov ax, 4096
mov sp, ax
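Combining the safe ordering with the values from the question gives the sketch below. It is only an illustration: the use of bx as a scratch register is an assumption, and the address arithmetic in the comments assumes real mode, where a physical address is segment * 16 + offset.

```nasm
    bits 16

    mov ax, 07C0h       ; segment the boot sector is loaded at (physical 7C00h)
    add ax, 544         ; 544 paragraphs = 32 (512-byte loader) + 512 (8K buffer)
                        ; ax = 07C0h + 0220h = 09E0h
    mov bx, 4096        ; prepare the new SP value before touching SS
    mov ss, ax          ; interrupts are inhibited until after the next instruction
    mov sp, bx          ; SS:SP = 09E0h:1000h, physical 9E00h + 1000h = AE00h
```

Nothing is allocated here. The two registers simply point at memory that already exists, and the stack then grows downward from physical AE00h toward 9E00h.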

You are right, mov ds, ax changes the data segment register. What this register means depends on whether the CPU is running in real mode, protected mode, etc. In real mode, which is where a boot sector starts, the physical address is segment * 16 + offset. For more detail, search for "x86 segmented memory model" in your favorite search engine.
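As a rough sketch of why the segment value matters in real mode: with no ORG directive, NASM counts label offsets from 0, so ds has to supply the 7C00h base. The start, hang and marker labels below are hypothetical names used only for illustration.

```nasm
    bits 16             ; no ORG directive, so label offsets start at 0

start:
    mov ax, 07C0h
    mov ds, ax          ; DS base = 07C0h * 16 = 7C00h
    mov al, [marker]    ; reads DS:marker = physical 7C00h + offset of marker
hang:
    hlt
    jmp hang

marker db 42h           ; hypothetical data byte at a small offset from 0
```

A common alternative, not used in the question's code, is to assemble with org 7C00h and load ds with 0, so that the label offsets themselves already include the 7C00h base.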

bkausbk
  • Sorry for being a little stupid, but `buffer is only a label past this allocated bytes.` What exactly do you mean? – weirdpanda Aug 27 '15 at 10:08
  • `mov` to `ss` implicitly disables interrupts until after the following instruction. So the `cli/sti` isn't actually needed in this case. http://stackoverflow.com/a/32086036/224132. Gotta love crazy x86 special-cases. – Peter Cordes Aug 28 '15 at 21:52
  • @PeterCordes: Yes, this is of course correct, but then the stack change has to be strictly in this order: `mov ss,...` then `mov sp, ...`, with nothing in between. `cli/sti` would be much safer here. But an interesting fact, I didn't know it :) – bkausbk Aug 28 '15 at 22:05
  • @bkausbk: `cli/sti` enables interrupts regardless of whether interrupts were enabled or not beforehand. If you have a function that might be called in either an interrupts-disabled context or not, you need to avoid unconditionally enabling interrupts. Taking advantage of the ISA's interrupt-delaying semantics is obviously the correct choice in that case, with a comment pointing this out to prevent future breakage from separating the instructions. If you know interrupts are supposed to be enabled after your code, it's just a waste of a couple insns and however many clock cycles. – Peter Cordes Aug 28 '15 at 22:11
4

The memory is there all the time. buffer: just gives it a symbolic name.

Interrupts are disabled to keep interrupt handlers from disturbing something that might take several instructions. Specifically for setting up the stack you don't have to do this, as setting the ss register disables interrupts until after the next instruction. The intention is that you should have time to set sp as well.

Enabling interrupts when done might be a good idea anyway, as they might not have been enabled initially.

The ds register must be set to the segment assumed by the code you want to execute. Otherwise the program will not find your variables.

Bo Persson