I've been developing a small OS and managed to switch into protected mode so that I can write C code instead of assembly. Since this means I can no longer use interrupt 10h, I have to write characters directly to the video memory address. I tried creating a new print function to print whole strings instead of printing each char separately. That's where the problems started: printing single chars with the printChar function works, but the new print function doesn't, no matter what I try.

Here's my C code:

void print(char* message, int offset);
void printChar(char character, int offset);

void start() {
    printChar('M', 2);
    print("Test String", 4);

    while (1) {

    }
}

void print(char* msg, int offset) {
    for (int i = 0; msg[i] != '\0'; i++)
    {
        printChar(msg[i], (i * 2) + offset);
    }
}

void printChar(char character, int offset) {
    unsigned char* vidmem = (unsigned char*)0xB8000;
    
    *(vidmem + offset + 1) = character;
    *(vidmem + offset + 2) = 0x0f;
}

I then use these commands to convert my code to a flat binary, which I write to the second sector of a floppy disk with sectedit:

gcc -c test.c
objcopy -O binary -j .text test.o test.bin

Here's also the assembly generated by objdump -d test.o:

0000000000000000 <start>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   48 83 ec 20             sub    $0x20,%rsp
   8:   ba 02 00 00 00          mov    $0x2,%edx
   d:   b9 4d 00 00 00          mov    $0x4d,%ecx
  12:   e8 73 00 00 00          call   8a <printChar>
  17:   ba 04 00 00 00          mov    $0x4,%edx
  1c:   48 8d 05 00 00 00 00    lea    0x0(%rip),%rax        # 23 <start+0x23>
  23:   48 89 c1                mov    %rax,%rcx
  26:   e8 02 00 00 00          call   2d <print>
  2b:   eb fe                   jmp    2b <start+0x2b>

000000000000002d <print>:
  2d:   55                      push   %rbp
  2e:   48 89 e5                mov    %rsp,%rbp
  31:   48 83 ec 30             sub    $0x30,%rsp
  35:   48 89 4d 10             mov    %rcx,0x10(%rbp)
  39:   89 55 18                mov    %edx,0x18(%rbp)
  3c:   c7 45 fc 00 00 00 00    movl   $0x0,-0x4(%rbp)
  43:   eb 29                   jmp    6e <print+0x41>
  45:   8b 45 fc                mov    -0x4(%rbp),%eax
  48:   8d 14 00                lea    (%rax,%rax,1),%edx
  4b:   8b 45 18                mov    0x18(%rbp),%eax
  4e:   01 c2                   add    %eax,%edx
  50:   8b 45 fc                mov    -0x4(%rbp),%eax
  53:   48 63 c8                movslq %eax,%rcx
  56:   48 8b 45 10             mov    0x10(%rbp),%rax
  5a:   48 01 c8                add    %rcx,%rax
  5d:   0f b6 00                movzbl (%rax),%eax
  60:   0f be c0                movsbl %al,%eax
  63:   89 c1                   mov    %eax,%ecx
  65:   e8 20 00 00 00          call   8a <printChar>
  6a:   83 45 fc 01             addl   $0x1,-0x4(%rbp)
  6e:   8b 45 fc                mov    -0x4(%rbp),%eax
  71:   48 63 d0                movslq %eax,%rdx
  74:   48 8b 45 10             mov    0x10(%rbp),%rax
  78:   48 01 d0                add    %rdx,%rax
  7b:   0f b6 00                movzbl (%rax),%eax
  7e:   84 c0                   test   %al,%al
  80:   75 c3                   jne    45 <print+0x18>
  82:   90                      nop
  83:   90                      nop
  84:   48 83 c4 30             add    $0x30,%rsp
  88:   5d                      pop    %rbp
  89:   c3                      ret

000000000000008a <printChar>:
  8a:   55                      push   %rbp
  8b:   48 89 e5                mov    %rsp,%rbp
  8e:   48 83 ec 10             sub    $0x10,%rsp
  92:   89 c8                   mov    %ecx,%eax
  94:   89 55 18                mov    %edx,0x18(%rbp)
  97:   88 45 10                mov    %al,0x10(%rbp)
  9a:   48 c7 45 f8 00 80 0b    movq   $0xb8000,-0x8(%rbp)
  a1:   00
  a2:   8b 45 18                mov    0x18(%rbp),%eax
  a5:   48 98                   cltq
  a7:   48 8d 50 01             lea    0x1(%rax),%rdx
  ab:   48 8b 45 f8             mov    -0x8(%rbp),%rax
  af:   48 01 c2                add    %rax,%rdx
  b2:   0f b6 45 10             movzbl 0x10(%rbp),%eax
  b6:   88 02                   mov    %al,(%rdx)
  b8:   8b 45 18                mov    0x18(%rbp),%eax
  bb:   48 98                   cltq
  bd:   48 8d 50 02             lea    0x2(%rax),%rdx
  c1:   48 8b 45 f8             mov    -0x8(%rbp),%rax
  c5:   48 01 d0                add    %rdx,%rax
  c8:   c6 00 0f                movb   $0xf,(%rax)
  cb:   90                      nop
  cc:   48 83 c4 10             add    $0x10,%rsp
  d0:   5d                      pop    %rbp
  d1:   c3                      ret
  d2:   90                      nop
  d3:   90                      nop
  d4:   90                      nop
  d5:   90                      nop
  d6:   90                      nop
  d7:   90                      nop
  d8:   90                      nop
  d9:   90                      nop
  da:   90                      nop
  db:   90                      nop
  dc:   90                      nop
  dd:   90                      nop
  de:   90                      nop
  df:   90                      nop

Edit: The problem basically lay in me not doing this on a Linux distribution, and in not having properly set up everything I'd need to do it on Windows. Huge thanks to MichaelPetch, who explained the problems to me. I've now switched to a Linux VM and, after slightly correcting the code, it works. (As the comments pointed out, my offset was weird: I used it because it worked in the broken setup I had, but normally it shouldn't.)
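The corrected code itself isn't shown here. As a rough sketch only, the offset fix the comments describe might look like the following, assuming the standard VGA text layout where each cell is two bytes (character at byte 0, attribute at byte 1). The names `put_cell`, `print_at`, `VGA_TEXT_BASE` and `WHITE_ON_BLACK` are made up for illustration, and the buffer is passed as a parameter so the logic can also be tried against an ordinary array.

```c
#include <stdint.h>

/* Assumption: VGA colour text mode, buffer at 0xB8000, two bytes per cell. */
#define VGA_TEXT_BASE  ((volatile uint8_t *)0xB8000)
#define WHITE_ON_BLACK 0x0F

/* Hypothetical helper: write one character cell, addressed by cell index. */
static void put_cell(volatile uint8_t *buf, int cell, char c, uint8_t attr) {
    buf[cell * 2]     = (uint8_t)c;  /* byte 0 of the cell: character */
    buf[cell * 2 + 1] = attr;        /* byte 1 of the cell: attribute */
}

/* Hypothetical helper: write a NUL-terminated string starting at a cell. */
static void print_at(volatile uint8_t *buf, const char *msg, int cell) {
    for (int i = 0; msg[i] != '\0'; i++)
        put_cell(buf, cell + i, msg[i], WHITE_ON_BLACK);
}
```

In a 32-bit freestanding kernel this would be called as `print_at(VGA_TEXT_BASE, "Test String", 2);`. The string literal only ends up in the image if the read-only data section is linked into the binary along with `.text`.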

Michael Petch
Malormar
  • So, this particular code displays `M` and nothing else? – Weather Vane Aug 04 '22 at 15:09
  • @WeatherVane yes, when I run my OS, there is still text from the bootloader, but the first character on screen has now been replaced by an M and the rest of the prior text has not been changed in any way. – Malormar Aug 04 '22 at 15:13
  • Would it be possible that the indexing in the two assignments is off by `1`? i.e. `*(vidmem + offset + 1) = character;` --> `*(vidmem + offset + 0) = character;` and `*(vidmem + offset + 2) = 0x0f;` --> `*(vidmem + offset + 1) = 0x0f;` – ryyker Aug 04 '22 at 15:24
  • Likely the address of the string literal is wrong, or worse, the constant is entirely omitted from the binary e.g. due to linker script problems. Yeah well you used `objcopy -O binary -j .text ` so that clearly does not copy the string which lives in `.rodata` typically. – Jester Aug 04 '22 at 15:25
  • @ryyker I am not 100% sure, but I don't think so. As I stated in a prior comment, there already is text on screen from the bootloader, so it would be overwritten, even if just with the wrong information, but it is not modified at all (this prior text is also written in gray instead of white, which also doesn't change). – Malormar Aug 04 '22 at 15:28
  • I don't get why you are writing to offsets `1` and `2` either, (character and attribute) and not to offsets `0` and `1`. – Weather Vane Aug 04 '22 at 15:28
  • OS dev can be very tricky. Best skill you can learn is debugging (for example, `gdb` connecting to `QEMU`). If the issue is calculating offsets wrong in `printChar()` setting a breakpoint and examining the calculated values will probably cause the problem to jump right out at you. – sj95126 Aug 04 '22 at 15:31
  • @Jester I am sorry, I don't completely understand what I have to do to fix this, from what you've written, I'd have to modify the command I use, but as I've copied it from another tutorial, I don't quite understand how, as just replacing .text with .rodata results in an empty file. – Malormar Aug 04 '22 at 15:33
  • You need both, since `.text` contains the code :) That may or may not work while copying multiple sections from an object file since you are not even linking. – Jester Aug 04 '22 at 15:37
  • @Jester yes, but as I am not familiar with the command `objcopy`, I don't know how to modify it to get both. I've tried `objcopy -O binary -j .text .rodata test.o test.bin`, which doesn't work at all, and then I've tried `objcopy -O binary -j .text -j .rodata test.o test.bin`, which changes nothing from `objcopy -O binary -j .text test.o test.bin`. From what I've read in the help, `-j` means "only section", but I don't know how to select multiple sections. – Malormar Aug 04 '22 at 15:44
  • Your title says you are using 32-bit protected mode. But your ELF output is 64-bit code. You need to build your code as 32-bit code, as you can't run 64-bit code in 32-bit protected mode and expect it to work properly. To run 64-bit code you would need to enter long mode (64-bit). – Michael Petch Aug 04 '22 at 17:04
  • Your offset calculations are incorrect in `printChar()`. You need to have the offset multiplied by two, then applied to `*(vidmem + double_offset + 0) = character;` and `*(vidmem + double_offset + 1) = 0x0f;`, respectively. Otherwise, you are iterating by one-byte offsets, when the video text page is made up of two-byte cells. – Schol-R-LEA Aug 04 '22 at 17:10
  • I'd also recommend using a cross compiler, but another big issue you have is that you are compiling to an object file and then converting the object to a binary. For things to work properly you will need to link the objects to an executable and then convert that to binary. You should at a minimum be compiling as freestanding using `-ffreestanding`, and to generate 32-bit code with a 64-bit GCC compiler you'd use `-m32`. You will then need to run the linker (ld) of course. – Michael Petch Aug 04 '22 at 17:12
  • @MichaelPetch I am sorry, I don't know much about these commands, I've only managed to make cygwin and gcc work yesterday, so I don't know what exactly to do with your suggestion. I've tried `gcc -ffrestanding -m32 .c test.c`, which worked, but when I try `ld -s -o test.bin test.o` it gives back `ld: i386 architecture of input file 'test.o' is incompatible with i386:x86-64 output`, I've tried using `ld -s -m32 -o test.bin test.o`, but then I get `ld: unrecognised emulation mode: 32 Supported emulations: i386pep i386pe` and of those emulation modes neither go through, when I use them instead. – Malormar Aug 04 '22 at 17:25
  • Because you'd have to link with the `-melf_i386` option when using the linker (the linker would be trying to make a 64-bit app by default). You'd also have to specify the memory address that your code was loaded at in memory. Did you make your own bootloader or are you using GRUB? If you created your own you'd have to find out where the kernel is being loaded and then add something `-Ttext=0x#####` to the linker command line where ##### is the address in memory the kernel was loaded at. – Michael Petch Aug 04 '22 at 17:29
  • Oh you are using a Windows compiler. That makes things even worse. On Windows you should be using a cross compiler, but even easier might be running WSL2 with Linux in it. Linux is far easier to do OSDEV work in. – Michael Petch Aug 04 '22 at 17:31
  • @MichaelPetch I have created my own bootloader, which is only one sector in size and jumps to the beginning of the second sector after switching to 32-bit protected mode. I then use sectedit to write boot.bin into the first sector and test.bin into the second sector. This is a solution that is definitely less than optimal, but it has evolved through trial and error, as many tutorials didn't work for me. Also, `elf_i386pe` is unrecognized as well. – Malormar Aug 04 '22 at 17:35
  • Yes, you are getting the errors because you are using a Windows compiler (which doesn't understand ELF). If you want to save your sanity you should build an ELF GCC cross compiler or use Linux in WSL (Windows Subsystem for Linux). Linux is a whole lot easier and less painful for OS development. – Michael Petch Aug 04 '22 at 17:41
  • But on Windows with a MinGW or Cygwin gcc compiler you might be able to get away with `-mi386pe` instead of `-melf_i386`. – Michael Petch Aug 04 '22 at 17:43
  • @MichaelPetch when trying `ld -mi386pe -o test.bin test.o` I get `ld: i386:x86-64 architecture of input file 'test.o' is incompatible with i386 output`. Also, while I think it's too late to save my sanity, my best option would be to actually get to building a gcc cross compiler. So, I'll try following the osdev wiki article on that. Thank you for your help. – Malormar Aug 04 '22 at 18:00
  • Sounds like you didn't compile your .c file with `-m32` . The error suggests that your object file is still 64-bit. – Michael Petch Aug 04 '22 at 18:03
  • @MichaelPetch So I did, I don't know when or why I recompiled without `-m32`, but for some reason I did. But the resulting code starts with a lot of random and empty space as well as the message `This program cannot be run in DOS mode.` – Malormar Aug 04 '22 at 18:07
  • How are you running your OS? I assume you are using a virtualizer such as HyperV or QEMU, but not knowing which we cannot say just what you are doing. – Schol-R-LEA Aug 04 '22 at 18:12
  • @Schol-R-LEA I am running it on an old computer I have, which still has a floppy disk drive. – Malormar Aug 04 '22 at 18:14
  • Install WSL2 or VirtualBox, etc. Then create a VM and boot linux (eg Ubuntu). Do your development work there. Then create another VM and boot your OS there. This may seem like extra work but it's ultimately much easier as all the tools you're struggling with are already under linux by default – Craig Estey Aug 04 '22 at 18:24
  • @CraigEstey yeah that seems like the most sensible option right now, I got myself into this mess by following 50 different tutorials and sticking to the tools I already knew, instead of the tools that would be best. Which meant I had to implement workaround after workaround. – Malormar Aug 04 '22 at 18:31
  • Did you build with `-m32 -ffreestanding -c` when using GCC to compile the C file to an object file? I almost wonder if you left off the `-c` and compiled to a win32 executable directly, which resulted in a full executable (that you then converted to a binary). – Michael Petch Aug 04 '22 at 19:05
  • I also suspect that you will eventually (if you haven't already) run into problems with the binary being larger than you like if it contains a `.text`, `.data`, `.rodata` (or `.rdata` on Windows compilers). Your use of `-j` with `objcopy` is very problematic and you'll likely need a basic linker script to reduce/eliminate the page alignment of 4096 bytes. I am saying this because you have also mentioned you only read one sector into memory. A lot of these problems can be solved by using GRUB to load the kernel, and that is much easier in WSL2/Linux. – Michael Petch Aug 04 '22 at 19:15
  • If you wanted to look at using Cygwin and GRUB I have an answer on doing that (including a linker script that would work in that environment) https://stackoverflow.com/a/49307575/3857942 – Michael Petch Aug 04 '22 at 19:22
  • @MichaelPetch I've decided on trying it on a linux vm, which I am currently setting up, as I've accepted, trying to continue this in windows will lead only to more problems, especially since most os dev tutorials are written for linux. – Malormar Aug 04 '22 at 19:23
  • Yep that was one of the first things I recommended. If you go that way then I also recommend just forgetting about your own bootloader and going with GRUB. GRUB puts your machine in 32-bit protected mode and will read your entire kernel from disk with no hassles. Then you can focus on your kernel. – Michael Petch Aug 04 '22 at 19:24
  • @MichaelPetch But still thank you very much for the help you provided, at least I somewhat understand what the commands do now and should be able to at least formulate my google search query correctly. – Malormar Aug 04 '22 at 19:25
  • @MichaelPetch I'll probably stick to my own bootloader, it already loads the other sectors and switches to protected mode (also this whole thing is pretty much a challenge for myself, which the bootloader is a part of). – Malormar Aug 04 '22 at 19:27
  • @MichaelPetch Thank you so much, I am finally finished setting everything up: I wrote my own linker.ld, compiled my C code to binary, put it on my floppy disk and put it in my old computer. Seeing the words `Test String` was simply the best feeling, as I had been working on this problem on and off for a year. While I am sure I'll run into other problems along the way, I finally feel like I can make progress again, and this whole thing taught me more about the side of coding I try to avoid (setting up compilers and working in the cmdline/terminal a lot). – Malormar Aug 04 '22 at 22:53
  • No problem. I figured that the code you showed in the question was a version that attempted to fix the way the output was appearing on the screen incorrectly. Screen output being incorrect and people trying to resolve it by fiddling with the indexing is not uncommon to work around the problem when the underlying root cause is 64-bit code running in 32-bit mode. This question is similar in nature: https://stackoverflow.com/questions/39807710/unexpected-output-when-printing-directly-to-text-video-memory/ . I have an answer there which also addresses the 32-bit/64-bit problem. – Michael Petch Aug 05 '22 at 00:13
  • I am tempted to mark this question a duplicate of the other, although more complex because you were trying this with a native Windows GCC tool chain. – Michael Petch Aug 05 '22 at 00:17
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/247042/discussion-between-michael-petch-and-malormar). – Michael Petch Aug 05 '22 at 00:17
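Much of the thread above comes down to whether `test.o` is a 32-bit or a 64-bit object. As a rough sketch, that can be read from the first bytes of the file: ELF objects carry a class byte after the magic number, and COFF objects (as produced by MinGW/Cygwin gcc) start with a machine ID. `object_bits` below is a hypothetical helper, not part of binutils, and it only recognises ELF and x86 COFF headers. In practice `objdump -f test.o` reports the same information.

```c
#include <stddef.h>

/* Hypothetical helper: guess whether an object file header describes
   32-bit or 64-bit x86 code. Returns OBJ_UNKNOWN for anything else. */
enum obj_bits { OBJ_UNKNOWN = 0, OBJ_32 = 32, OBJ_64 = 64 };

static enum obj_bits object_bits(const unsigned char *hdr, size_t len) {
    /* ELF: magic 0x7F 'E' 'L' 'F', then EI_CLASS (1 = 32-bit, 2 = 64-bit). */
    if (len >= 5 && hdr[0] == 0x7F && hdr[1] == 'E' &&
        hdr[2] == 'L' && hdr[3] == 'F') {
        if (hdr[4] == 1) return OBJ_32;
        if (hdr[4] == 2) return OBJ_64;
        return OBJ_UNKNOWN;
    }
    /* COFF object: the first field is a little-endian 16-bit machine ID. */
    if (len >= 2) {
        unsigned machine = hdr[0] | ((unsigned)hdr[1] << 8);
        if (machine == 0x014C) return OBJ_32;  /* IMAGE_FILE_MACHINE_I386  */
        if (machine == 0x8664) return OBJ_64;  /* IMAGE_FILE_MACHINE_AMD64 */
    }
    return OBJ_UNKNOWN;
}
```

Feeding it the first few bytes of an object compiled without `-m32` on a 64-bit toolchain should yield `OBJ_64`, which matches the `%rbp`/`%rsp` registers visible in the disassembly in the question.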
