So I found the following tutorial for building a simple bootloader: http://mikeos.sourceforge.net/write-your-own-os.html#firstos

Here is the start of his example and the area that I'm having trouble understanding:

start:
    mov ax, 07C0h       ; Set up 4K stack space after this bootloader
    add ax, 288         ; (4096 + 512) / 16 bytes per paragraph
    mov ss, ax
    mov sp, 4096

    mov ax, 07C0h       ; Set data segment to where we're loaded
    mov ds, ax

I understand that Mike here is trying to build a stack for his bootloader. The start of the program is at 07C0h in memory. What I don't understand is the following line 'add ax, 288 ; (4096 + 512) / 16 bytes per paragraph'. Why is he taking the total amount of the stack and boot sector, then dividing it by the 16-bit registers for the start of the stack segment? Shouldn't the stack segment start at 20h, right after the bootsector? Lastly, shouldn't the stack pointer then be set at the end of (512 + 4096)? Thanks

Peter Cordes
AaronV77
  • Do you already know how segmentation works? BTW the code is at **7c00h** (double zero); 7c0h is the unique segment that starts at 7c00h. – Margaret Bloom Sep 02 '18 at 17:08
  • That code is broken. It's setting up a 4k hole which is then followed by a 4k stack. It's not setting up the stack directly after the loader as it says. PS: It's also silly calculating a constant at runtime. – Jester Sep 02 '18 at 17:19
  • I had a feeling that this example was bogus after doing further research into it. So would the correct way of doing this be the following: https://www.reinterpretcast.com/creating-a-bare-bones-bootloader – AaronV77 Sep 02 '18 at 17:26
  • Yes that code looks correct. Of course it's still quite silly calculating `7c0h+20h` with an `add` instead of just doing `mov ax, 7e0h`. – Jester Sep 02 '18 at 17:37
  • related: https://stackoverflow.com/q/52127149/4271923 ... yes, the stack segment may start at +20h (+512 bytes). The stack pointer should then still be 4096 if you want 4k for the stack; it would be 512+4096 if you used 07C0 as the stack segment (as then the zero offset would point to the start of the code). With +20h the zero offset points right after the boot signature `0xAA55`, and you can't create an offset that would hit the signature or anything below it. – Ped7g Sep 02 '18 at 17:37
  • @Jester I read somewhere that they do it for the sake of readability / understanding for newcomers like myself. – AaronV77 Sep 02 '18 at 17:50
  • Also thanks for the link @Ped7g, and that's what I was saying before: things were just not adding up with what they were doing. – AaronV77 Sep 02 '18 at 17:50
  • That's what you have comments for. But it's fine in a tutorial I guess as long as it doesn't mislead beginners into thinking it has to be this way. – Jester Sep 02 '18 at 17:53
  • The stack can be placed anywhere that won't interfere. In the early versions of MikeOS and his tutorial he had a 4k buffer after the bootloader for reading data into, and then placed the stack pointer 4k above that. In the actual OS the value is 544 instead of 288, to extend the data buffer to 8k. In MikeOS that buffer is used to read the FAT12 root directory into (14 sectors worth of data). The method used in MikeOS is valid. – Michael Petch Sep 02 '18 at 18:00
  • Since I have everyone here, can someone explain why the memory starts at 7C00h instead of zero? – AaronV77 Sep 02 '18 at 18:13
  • Because the real mode interrupt vector table and the BIOS data area are between 0x0000 and 0x0520. When the first BIOSes were created they assumed that DOS had a 32kb memory requirement, so they read the boot sector to near the top of the 32kb mark and then placed the stack the BIOS used at start up above that, growing down from 0x8000. As it happened, DOS had a 32kb minimum memory requirement but most PCs sold had 64KB or more. The original 0x7c00 location was never changed, for backwards compatibility. – Michael Petch Sep 02 '18 at 18:19
  • @MichaelPetch that makes sense that the BIOS and interrupt vector table would be loaded into memory before the bootsector. But I still don't understand how the above MikeOS tutorial that I presented is valid? – AaronV77 Sep 02 '18 at 18:24
  • I'm not entirely sure why you think it is invalid. That code sets up this situation: https://i.stack.imgur.com/TGxxg.png . Not sure why that isn't valid. The blue is the stack area and the orange is the data area (including the boot sector and the 4kb buffer after that). That diagram has had the [20 bit segment:offset addressing](https://thestarman.pcministry.com/asm/debug/Segments.html) converted to physical addresses. – Michael Petch Sep 02 '18 at 18:26
  • You can use memory pretty much any way you wish as long as the stack you set doesn't interfere with your own code, you don't load code below 0x520, you don't load it above 0xa0000, and you avoid the Extended BIOS Data Area just under 0xa0000 (usually 1KB on most machines; the size of the area can be queried by looking at data in the BIOS Data Area). – Michael Petch Sep 02 '18 at 18:32
  • Interesting, I understand it now. All the other examples that I was reading made it sound the opposite, but why does he need such a large data segment? Just for giggles? – AaronV77 Sep 02 '18 at 18:53
  • Large data segment? Do you mean why he's using a segment of 0x7c0 instead of a segment of 0x0000? If that is what you are asking, you may wish to look at the link in my earlier comment about 20 bit segment:offset addressing. There is more than one way to address the same memory location. The formula is **physical address = (segment<<4)+offset**. 0x07c0:0x0000 is phys address (0x07c0<<4)+0x0000 = 0x07c00. 0x0000:0x7c00 is phys address (0x0000<<4)+0x7c00 = 0x07c00. Notice that both are the same memory location. – Michael Petch Sep 02 '18 at 19:00
  • I'm wondering if your confusion may be about not understanding 20-bit segment:offset addressing. MikeOS happens to use nonzero segments (other tutorials use a segment of 0x0000). As long as the segments you use address the memory you want, it doesn't matter what you use. – Michael Petch Sep 02 '18 at 19:02
  • No, why does he allocate an additional 4096 on top of the 512-byte boot sector? We don't use the original 512 bytes. – AaronV77 Sep 02 '18 at 19:26
  • I told you above that he uses the 4k after the boot sector to store data (from disk etc). In the real MikeOS he actually changes it to 8k (he adds 544) and he uses that buffer to read the entire root directory of the FAT12 file system into. Then he places the stack after that. He's just decided that the 4k above the boot sector will be used for whatever purpose he wants, and in his OS he reads more disk sectors into it. It just happens that in the small tutorial he doesn't actually use that memory for anything. In the real MikeOS bootloader he does use it, and the comments are clearer on that point. – Michael Petch Sep 02 '18 at 19:38
  • @Michael Petch: Link appears to be dead. – ecm Apr 10 '21 at 11:25
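
The layout described in the comments above can be checked with a little arithmetic. This is a minimal Python sketch, not part of the tutorial or of MikeOS: the names `phys`, `layout`, `tutorial` and `mikeos` are made up for illustration, and the only thing taken from the thread is the real-mode formula physical address = (segment << 4) + offset together with the values 288, 544 and 4096. The stack pointer of 4096 for the 544 case is an assumption.

```python
# Hypothetical helper names; only the formula and the constants come from the thread.
BOOT_SEG = 0x07C0    # segment the tutorial loads into DS
BOOT_SIZE = 512      # one boot sector


def phys(segment, offset):
    """Physical address for a real-mode segment:offset pair."""
    return (segment << 4) + offset


def layout(paragraphs_added, sp):
    """Describe where the stack lands for `add ax, paragraphs_added` / `mov sp, sp`."""
    ss = BOOT_SEG + paragraphs_added
    boot_end = phys(BOOT_SEG, BOOT_SIZE)   # first byte after the boot sector
    stack_bottom = phys(ss, 0)             # SS:0000, the lowest address in the stack segment
    stack_top = phys(ss, sp)               # initial SS:SP; pushes grow downward from here
    return {
        "ss": ss,
        "boot_end": boot_end,
        "stack_bottom": stack_bottom,
        "stack_top": stack_top,
        "gap": stack_bottom - boot_end,    # bytes between the boot sector and the stack
    }


tutorial = layout(288, 4096)   # values from the question's code
mikeos = layout(544, 4096)     # 544 is from the comments; sp = 4096 is assumed

for name, info in (("tutorial", tutorial), ("mikeos", mikeos)):
    print(name, {key: hex(value) for key, value in info.items()})
```

With these inputs the sketch reports a 4096-byte gap for the tutorial's 288 and an 8192-byte gap for 544, which matches the 4k and 8k buffers described in the comments.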

1 Answer

You are right. With the current implementation, the code leaves a gap of 4k between the boot sector and the stack. To avoid the gap, do what you suggested: set ss to (end of boot sector) / 16 and load sp with 4k (not 4k + 512). The implementation would then be:

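mov ax, 07c0h  ; segment where the boot sector is loaded
add ax, 32     ; 512 / 16 = 32 paragraphs, i.e. the end of the boot sector
mov ss, ax     ; SS = 07E0h, so the stack segment starts right after the boot sector
mov sp, 4096   ; 4k of stack, growing down from 07E0h:1000h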
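
As a quick check of the arithmetic in the snippet above, here is a minimal Python sketch. The helper `phys` and the variable names are hypothetical; the sketch only applies the real-mode rule physical address = (segment << 4) + offset to the constants used in the assembly.

```python
def phys(segment, offset):
    """Physical address for a real-mode segment:offset pair."""
    return (segment << 4) + offset


BOOT_SEG = 0x07C0           # segment the boot sector is addressed through
ss = BOOT_SEG + 32          # what `add ax, 32` produces; same as the constant 07E0h
sp = 4096                   # value loaded by `mov sp, 4096`

boot_end = phys(BOOT_SEG, 512)   # first byte after the 512-byte boot sector
stack_bottom = phys(ss, 0)       # SS:0000, the lowest address in the stack segment
stack_top = phys(ss, sp)         # initial SS:SP; pushes grow downward from here

print(hex(ss), hex(boot_end), hex(stack_bottom), hex(stack_top))
```

Here `stack_bottom` equals `boot_end`, so this variant leaves no room between the boot sector and the stack. Whether that is what you want depends on whether you plan to load data directly after the boot sector.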

Aside from the question, I would also like to mention that this type of bootloader may not work on many recent machines, which boot via UEFI and often no longer offer legacy BIOS support, so you should look at standards like UEFI when building a bootloader. UEFI firmware also takes care of other tasks, such as switching the CPU modes.

If you wish to see a working UEFI-based bootloader with a minimalistic kernel, refer to my GitHub repo: https://github.com/RohitRTdev/RTOS

  • The 4k gap is intentional, and not a problem that needs to be "resolved". See MichaelPetch's comments under the question. In fact 4k is apparently *necessary* for the use-case for this code: loading more stuff into it. (But apparently the real bootloader actually leaves an 8k gap.) Unless I'm misreading the comments; there is discussion of multiple gaps... Also, you can write `mov ax, 07c0H + 32` instead of doing addition of constants at run-time. – Peter Cordes Apr 10 '21 at 10:01