
I am working on an OS project, and I am trying to execute code that I have read from the disk into memory, from C using inline assembly.

So far I have tried reading the code into an array and jumping to it with the assembly call instruction, using inline assembly:

void driveLoop() {
    uint16_t sectors = 31;
    uint16_t sector = 0;
    uint16_t basesector = 40000;
    uint32_t i = 40031;
    uint16_t code[sectors][256];
    int x = 0;
    while(x==0) {
        read(i);
        for (int p=0; p < 256; p++) {
            if (readOut[p] == 0) {
            } else {
                x = 1;
                //kprint_int(i);
            }
        }
        i++;
    }
    kprint("Found sector!\n");
    kprint("Loading OS into memory...\n");
    for (sector=0; sector<sectors; sector++) {
        read(basesector+sector);
        for (int p=0; p<256; p++) {
            code[sector][p] = readOut[p];
        }
    }
    kprint("Done loading.\n");
    kprint("Attempting to call...\n");
    asm volatile("call (%0)" : : "r" (&code));
}

When the inline assembly runs, I expect it to execute the code from the sectors I read from the "disk" (this is in a VM, because it's a hobby OS). Instead, it just hangs.

I probably don't much understand how variables, arrays, and assembly work, so if you could fill me in, that would be nice.

EDIT: The data I am reading from the disk is a binary file that was added to the disk image file with

cat kernel.bin >> disk.img

and the kernel.bin is compiled with

i686-elf-ld -o kernel.bin -Ttext 0x4C4B40 *insert .o files here* --oformat binary
Menotdan
    I assume your data is actually in binary; hex is a text serialization format for binary. – Peter Cordes May 20 '19 at 02:48
    The first problem you want to fix is your exception handlers. They should gather useful information (which exception, which error code, which address was CPU executing when it occurs, etc) and display it, so that you can know more than "it hangs". I'd also recommend getting an emulator and debugger set up. Otherwise you'll be continually fumbling in the dark. – Brendan May 20 '19 at 02:53

1 Answer

What it does instead is it just hangs.

Run your OS inside BOCHS so you can use BOCHS's built-in debugger to see exactly where it's stuck.

Being able to debug lockups, including with interrupts disabled, is probably very useful...


asm volatile("call (%0)" : : "r" (&code)); is unsafe because of missing clobbers.

Even worse, it loads a new EIP value from the first 4 bytes of the array, instead of setting EIP to the array's address. (Unless the data you're loading is an array of pointers, not actual machine code?)

You have the %0 in parentheses, so it's an addressing mode. The assembler will warn you about an indirect call without *, but will assemble it like call *(%eax), with EAX = the address of code[0][0]. You actually want a call *%eax or whatever register the compiler chooses, register-indirect not memory-indirect.

&code and code are both just a pointer to the start of the array; &code doesn't create an anonymous pointer object storing the address of another address. &code takes the address of the array as a whole. code in this context "decays" to a pointer to the first object.
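
The point about &code versus code can be checked with a small, portable sketch. The function names below are made up for illustration, and the array dimensions are simply copied from the question. The three expressions name the same starting byte; only their types differ, which shows up in pointer arithmetic.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, only for illustrating array decay. */

/* True when &code, code, and &code[0][0] all point at the same byte. */
bool same_start_address(void)
{
    uint16_t code[31][256];
    void *whole   = (void *)&code;        /* type: uint16_t (*)[31][256] */
    void *decayed = (void *)code;         /* type: uint16_t (*)[256]     */
    void *first   = (void *)&code[0][0];  /* type: uint16_t *            */
    return whole == decayed && decayed == first;
}

/* Bytes skipped by (&code + 1): one whole array. */
size_t whole_array_stride(void)
{
    uint16_t code[31][256];
    return (size_t)((char *)(&code + 1) - (char *)&code);
}

/* Bytes skipped by (code + 1): one row of 256 elements. */
size_t row_stride(void)
{
    uint16_t code[31][256];
    return (size_t)((char *)(code + 1) - (char *)code);
}
```

So passing &code rather than code to the asm statement is not the bug; the parentheses around %0 are.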


https://gcc.gnu.org/wiki/DontUseInlineAsm (for this).

You can get the compiler to emit a call instruction by casting the pointer to a function pointer.

   __builtin___clear_cache((char *)code, (char *)code + sizeof(code));   // end is exclusive; don't optimize away stores into the buffer
   void (*fptr)(void) =  (void*)code;                     // casting to void* instead of the actual target type is simpler

   fptr();

That will compile (with optimization enabled) to something like lea 16(%esp), %eax / call *%eax, for 32-bit x86, because your code[][] buffer is an array on the stack.

Or to have it emit a jmp instead, do it at the end of a void function, or return fptr(); in a non-void function, so the compiler can optimize the call/ret into a jmp tailcall.

If it doesn't return, you can declare it with __attribute__((noreturn)).
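
As a rough, hosted-environment sketch of the tail-call form: an ordinary C function stands in for the loaded machine code, so this runs without an executable buffer. The names entry_fn, fake_entry, loaded_address and call_loaded are hypothetical. Converting between void* and a function pointer is not guaranteed by ISO C, but GNU C on x86 accepts it.

```c
/* Hypothetical names; an ordinary function stands in for loaded code. */
typedef int (*entry_fn)(void);

static int fake_entry(void)
{
    return 42;
}

/* Stand-in for "the address where the code was loaded". */
void *loaded_address(void)
{
    return (void *)fake_entry;
}

/* Cast the data pointer to a function pointer and call through it.
 * The call is in tail position, so an optimizing compiler is free to
 * emit a jmp instead of call/ret. */
int call_loaded(void *addr)
{
    entry_fn entry = (entry_fn)addr;
    return entry();
}
```

In the real loader, addr would be the buffer holding the code, and the __builtin___clear_cache call would still be needed before the call.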


Make sure the memory page / segment is executable. (Your uint16_t code[]; is a local, so gcc will allocate it on the stack. This might not be what you want. The size is a compile-time constant so you could make it static, but if you do that for other arrays in other sibling functions (not parent or child), then you lose out on the ability to reuse a big chunk of stack memory for different arrays.)

This is much better than your unsafe inline asm. (You forgot a "memory" clobber, so nothing tells the compiler that your asm actually reads the pointed-to memory). Also, you forgot to declare any register clobbers; presumably the block of code you loaded will have clobbered some registers if it returns, unless it's written to save/restore everything.

In GNU C you do need to use __builtin___clear_cache when casting a data pointer to a function pointer. On x86 it doesn't actually clear any cache; it tells the compiler that the stores to that memory are not dead, because the memory is going to be read by execution. See How does __builtin___clear_cache work?

Without that, gcc could optimize away the copying into uint16_t code[sectors][256]; because it looks like a dead store. (Just like with your current inline asm which only asks for the pointer in a register.)

As a bonus, this part of your OS becomes portable to other architectures, including ones like ARM without coherent instruction caches, where that builtin expands to actual instructions. (On x86 it purely affects the optimizer.)


read(basesector+sector);

It would probably be a good idea for your read function to take a destination pointer to read into, so you don't need to bounce data through your readOut buffer.

Also, I don't see why you'd want to declare your code as a 2D array; sectors are an artifact of how you're doing your disk I/O, not relevant to using the code after it's loaded. The sector-at-a-time thing should only be in the code for the loop that loads the data, not visible in other parts of your program.

char code[sectors * 512]; would be good.
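
A minimal sketch of that layout, assuming 512-byte sectors and a read routine that takes a destination pointer. read_sector, load_image and fake_disk are hypothetical names; an in-memory array stands in for the real disk driver so the loop can be tried on its own.

```c
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE  512u
#define DISK_SECTORS 8u

/* Stand-in for the real disk: assumption for illustration only. */
static uint8_t fake_disk[DISK_SECTORS * SECTOR_SIZE];

/* Hypothetical replacement for read(): copies one sector straight into
 * dst instead of into a global bounce buffer. Returns 0 on success. */
int read_sector(uint32_t lba, void *dst)
{
    if (lba >= DISK_SECTORS)
        return -1;
    memcpy(dst, &fake_disk[lba * SECTOR_SIZE], SECTOR_SIZE);
    return 0;
}

/* Load count consecutive sectors into one flat byte buffer. The sector
 * size only appears here, not in the code that later uses the buffer. */
int load_image(uint32_t first_lba, uint32_t count, uint8_t *dst)
{
    for (uint32_t s = 0; s < count; s++) {
        if (read_sector(first_lba + s, dst + s * SECTOR_SIZE) != 0)
            return -1;
    }
    return 0;
}
```

The caller would then declare something like uint8_t code[sectors * 512] and pass it as dst.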

Peter Cordes
  • Wow, this is a lot to read. I'm tired rn so ill read it later but it looks very helpful – Menotdan May 20 '19 at 10:45
  • I updated the question above. Also, I declare code as a 2D array because my code returns uint16_t arrays. Would defining it as a ```uint16_t code[sectors*256]``` Be fine? Otherwise, how would I split the uint16_t array so that it fits in a char array? – Menotdan May 20 '19 at 12:01
  • Also, I don't have the builtins available apparently. – Menotdan May 20 '19 at 12:36
  • @Menotdan: I had a typo in my code, I should have copy/pasted `__builtin___clear_cache` because it's tricky: it has three underscores after `__builtin`. – Peter Cordes May 20 '19 at 12:39
  • @Menotdan: at some level your `read` function must just pass an address to a BIOS I/O function or talk to hardware directly; at that point it doesn't matter how its declared in C. Different C types are just different ways of describing a contiguous block of memory. (but beware of strict-aliasing violations if you're casting pointers). You can always memcpy... – Peter Cordes May 20 '19 at 12:42
  • Well, it doesn't load but at least the builtin works now. It still hangs, however. Is there any way to debug with QEMU? You can also check out my code at http://github.com/Menotdan/DripOS – Menotdan May 20 '19 at 14:15
  • @Menotdan: yes, QEMU can be a GDB remote, but GDB doesn't know about segmentation. So BOCHS is *much* better if you need to debug real mode, or the process of entering protected mode, according to MichaelPetch. I have no idea what mode your code runs in or anything, and that's outside the scope of the question anyway; you were just asking about how to `call` an address from C, which this answer addressed. Presumably you have other bugs besides this; using a debugger is essential unless you want to add debug-prints everywhere. – Peter Cordes May 20 '19 at 14:22
  • Well I did some debugging with QEMU, and apparently once it calls the kernel, it just jumps to random addresses that don't appear to be valid instructions, I have decided to use GRUB instead, as you might have seen in my other question I asked. I did get it working though so now I have lots of room for code. Right now I'm trying to get my OS to work on real hardware because I have an older PC with IDE drives that would be fun to work with. – Menotdan Sep 06 '19 at 13:23