
I'm writing a small x86-64 kernel. While setting up the IDT, I ran into a linker error I don't understand. Here's a minimal example:

entry.s

extern InterruptHandler
global isr0
align 4

isr0:
jmp sharedisr

sharedisr:
push rax
push rbx
push rcx
push rdx
push rdi
push rsi
push r8
push r9
push r10
push r11
push r12
push r13
push r14
push r15
cld

call InterruptHandler

pop r15
pop r14
pop r13
pop r12
pop r11
pop r10
pop r9
pop r8
pop rsi
pop rdi
pop rdx
pop rcx
pop rbx
pop rax
iretq

idt.cpp

#include "idt.h"

struct IDTEntry{
    unsigned short offset0;     //bits 0-15
    unsigned short selector;    //0x08 = 1000b code selector
    unsigned char ist;      //0
    unsigned char attrib;       //
    unsigned short offset1;     //bits 16-31
    unsigned int offset2;   //bits 32-63
    unsigned int zero;      //0
}__attribute__((packed));

struct IDTR{
    unsigned short size;
    unsigned long address;
}__attribute__((packed));

unsigned long isrAddresses[1];
IDTEntry idt[1];

extern "C" void InterruptHandler(){
    
}

void SetupIDT(){
    isrAddresses[0] = {
        (unsigned long)&isr0
    };
    
    IDTEntry entry;
    for (unsigned int i = 0; i < 1; i++){
        entry.offset0 = (unsigned short)isrAddresses[i];
        entry.selector = 0x08;
        entry.ist = 0;
        entry.attrib = 0x8e;
        entry.offset1 = (unsigned short)(isrAddresses[i] >> 16);
        entry.offset2 = (unsigned int)(isrAddresses[i] >> 32);
        entry.zero = 0;
        idt[i] = entry;
    }
    
    unsigned long idtAddr = (unsigned long)idt;
    IDTR idtr = { sizeof(idt) - 1, idtAddr };   //limit = table size in bytes - 1
    unsigned long idtrAddr = (unsigned long)&idtr;
    asm volatile("lidt (%0)" : : "r"(idtrAddr));
}
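
As a cross-check on the descriptor layout above, here is a minimal hosted-C++ sketch of how a 64-bit handler address is split across the three offset fields, and how the IDTR limit follows from the table size. `Gate`, `MakeGate` and `IdtLimit` are hypothetical names introduced for illustration only, and the sample address is an assumption chosen to sit just above the link address used in this question.

```cpp
#include <cstdint>

// Same field layout as IDTEntry above; the name Gate is hypothetical.
struct Gate {
    std::uint16_t offset0;   // handler address bits 0-15
    std::uint16_t selector;  // code segment selector
    std::uint8_t  ist;       // interrupt stack table index (0 = unused)
    std::uint8_t  attrib;    // 0x8e = present, DPL 0, 64-bit interrupt gate
    std::uint16_t offset1;   // handler address bits 16-31
    std::uint32_t offset2;   // handler address bits 32-63
    std::uint32_t zero;      // reserved, must be 0
} __attribute__((packed));

static_assert(sizeof(Gate) == 16, "a long-mode IDT gate is 16 bytes");

// Hypothetical helper: split a handler address into the three offset fields.
constexpr Gate MakeGate(std::uint64_t handler, std::uint16_t selector,
                        std::uint8_t attrib) {
    return Gate{
        static_cast<std::uint16_t>(handler & 0xffff),
        selector,
        0,
        attrib,
        static_cast<std::uint16_t>((handler >> 16) & 0xffff),
        static_cast<std::uint32_t>(handler >> 32),
        0,
    };
}

// Hypothetical helper: the IDTR limit is the table size in bytes minus one.
constexpr std::uint16_t IdtLimit(unsigned entries) {
    return static_cast<std::uint16_t>(entries * sizeof(Gate) - 1);
}

// Assumed sample address, not taken from a real link map.
constexpr Gate kSample = MakeGate(0xffff800001000010ULL, 0x08, 0x8e);
static_assert(kSample.offset0 == 0x0010, "bits 0-15");
static_assert(kSample.offset1 == 0x0100, "bits 16-31");
static_assert(kSample.offset2 == 0xffff8000u, "bits 32-63");
static_assert(IdtLimit(1) == 15, "one 16-byte gate");
```

With a single entry the limit comes out as 15; a limit of 4095 would only describe a full table of 256 gates.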

idt.h

#ifndef IDTH
#define IDTH

extern "C" void isr0(void);
void SetupIDT();

#endif

main.cpp

#include "idt.h"

void main(){
    SetupIDT();
    asm volatile("hlt");
}

I compile and link with this bash script:

nasm entry.s -felf64 -oentry.o
g++ -static -ffreestanding -nostdlib -mgeneral-regs-only -mno-red-zone -c -m64 main.cpp -omain.o
g++ -static -ffreestanding -nostdlib -mgeneral-regs-only -mno-red-zone -c -m64 idt.cpp -oidt.o
ld -entry main --oformat elf64-x86-64 --no-dynamic-linker -static -nostdlib -Ttext-segment=ffff800001000000 entry.o main.o idt.o  -okernel.elf

I'm getting the error `ld: failed to convert GOTPCREL relocation; relink with --no-relax`. If I add the `--no-relax` option to ld, the link succeeds and the code runs correctly: I can print something in the InterruptHandler function and return to the code that triggered the exception (I tested this with a deliberate division by zero).

From the error I guess it has something to do with linker relaxation, but I don't know what that is. Can someone briefly explain what it is and why I am getting the error? Also, can anyone give further advice on building a proper ISR? For example, should I push RBP and then copy RSP into RBP, like gcc does when using __attribute__((interrupt))? Should I also have a leaveq at the end of the ISR?
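
One thing that can be checked with plain arithmetic is the stack alignment at the `call` in sharedisr. The sketch below assumes two facts not stated in this post: in 64-bit mode the CPU aligns RSP to 16 bytes before pushing its interrupt frame, and the System V AMD64 calling convention expects RSP to be 16-byte aligned at a `call`. `FrameBytes` and `RspMod16AtCall` are hypothetical names.

```cpp
#include <cstdint>

// Bytes the CPU pushes on interrupt entry in 64-bit mode:
// SS, RSP, RFLAGS, CS, RIP, plus an error code for some vectors.
constexpr unsigned FrameBytes(bool hasErrorCode) {
    return (hasErrorCode ? 6u : 5u) * 8u;
}

// RSP modulo 16 at the call instruction, assuming the CPU aligned RSP to
// 16 bytes before pushing its frame and the stub then pushed `pushes`
// 8-byte registers.
constexpr unsigned RspMod16AtCall(bool hasErrorCode, unsigned pushes) {
    return (16u - (FrameBytes(hasErrorCode) + pushes * 8u) % 16u) % 16u;
}

// Vector 0 (divide error) has no error code.
static_assert(RspMod16AtCall(false, 14) == 8, "14 pushes: off by 8");
static_assert(RspMod16AtCall(false, 15) == 0, "15 pushes: aligned");
```

Under those assumptions, the 14 pushes above leave RSP 8 bytes off at the `call`, and one more push (RBP, for instance) would realign it. This probably goes unnoticed with an empty handler built with `-mgeneral-regs-only`, but it may matter once the handler calls code that relies on the alignment.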

Thank you for any tip!

user123
  • `-fPIE` is probably the default for your g++, and you didn't use `-fno-pie`, so calls to unknown functions (not in the same file or elf visibility=hidden) are probably emitted as `call SetupIDT@plt`, which the linker then has to relax to just a direct call not through the PLT when it finds that symbol in another file being linked non-dynamically. (See the bottom of [this answer](https://stackoverflow.com/a/52131094/224132) for examples when building a non-freestanding Linux executable). IDK exactly why that would be a problem for the linker to relax, though. – Peter Cordes Aug 20 '21 at 14:20
  • Also note that you probably want `1: hlt; jmp 1b` in your asm, or `while(1) asm("hlt");`. hlt only sleeps until the next interrupt. – Peter Cordes Aug 20 '21 at 14:22
  • Does -fno-pie replace -ffreestanding? I get a lot of relocation truncated errors now like if the bss is too big to be reached using RIP relative addressing. – user123 Aug 20 '21 at 14:42
  • No, `-fno-pie` is orthogonal to `-ffreestanding`. With `-fno-pie`, the default code model assumes it will be linked to a virtual address in the low 4GiB. If that's not the case, you might need `-mcmodel=kernel` to I think assume the high 2GiB (e.g. so absolute addresses can be used as sign-extended 32-bit immediates, but not zero-extended. With just `-fno-pie`, GCC will use `mov $symbol, %edi` to put a symbol address into a register: [How to load address of function or label into register](https://stackoverflow.com/q/57212012)) – Peter Cordes Aug 20 '21 at 14:43
  • BTW, `asm volatile("lidt (%0)" : : "r"(idtrAddr));` would probably be better done as `asm volatile("lidt %0" : : "m"(idtr));` to let the compiler pick an addressing mode instead of asking it to load an address into a register. – Peter Cordes Aug 20 '21 at 14:44
  • Yes I'll do that instead. Thank you. Actually, the kernel is loaded above 0xffff800000000000 in the virtual address space. This address space is currently mapped from 0 to 4GB statically in the kernel. BSS is thus quite big. Does -fno-pie prevent gcc from using RIP relative addressing? – user123 Aug 20 '21 at 14:49
  • The code + BSS is definitely smaller than 2GB. I think the issue here is that, when I use -fno-pie, the linker stops using RIP relative addressing to address the static variables in the BSS/data segment. It uses absolute addresses that a 32 bits imm cannot contain completely. It thus creates a lot of relocation errors. – user123 Aug 20 '21 at 15:11
  • Ok, so near the bottom of the upper half of the canonical address space. So you still couldn't use `[array + rdi*4]` addressing modes which sign-extend a 32-bit absolute address. No, `-fno-pie` doesn't *prevent* GCC from using RIP-relative addressing, but with the default (small) code model GCC thinks it can use 32-bit absolute addresses when that would be more efficient. See the link in my previous comment re: putting an address into a register. – Peter Cordes Aug 20 '21 at 15:16
  • GCC will always use RIP-relative to load or store to a non-array global variable (unless you use a large or huge code model which would imply that data can be more than 2GiB away from code so RIP-relative might not reach. [How can I allocate more than 2GB of memory in gas assembler?](https://stackoverflow.com/q/47052002)) Look at compiler output from `gcc -S` or on https://godbolt.org/ for various combos of options. – Peter Cordes Aug 20 '21 at 15:17
  • Yes there are definitely static arrays like the page tables and stuff. Those would thus be accessed using 32 bits absolute addresses? If that's the case should I use a large model? What does Linux do actually? – user123 Aug 20 '21 at 15:28
  • I think Linux loads the kernel to ffff_ffff_8000_0000, which is basically a sign-extended 32-bit address. That would make sense if gcc supports sign-extending 32-bit absolute addresses to 64 bits. – user123 Aug 20 '21 at 15:34
  • It's not *GCC* that supports stuff, it's x86-64 machine code. `[disp32 + rdi*4]` is what you want for a run-time variable array index, but that's absolute, not PC-relative. For a constant array index, it would resolve to an address with no registers, so it could use RIP-relative. See [32-bit absolute addresses no longer allowed in x86-64 Linux?](https://stackoverflow.com/q/43367427). (In position-independent code, GCC needs to `lea rsi, [rel array]` / `mov eax, [rsi + rdi*4]` instead of just `mov eax, [array + rdi*4]`, so that's what it does with `-fpie` or maybe some mcmodel= options) – Peter Cordes Aug 20 '21 at 15:39
  • Ok I'll try to look into all that. It's a lot of information at once so I have a hard time to follow. Thank you again though. I think I'll stick to --no-relax for now as I actually looked in memory and I have the right addresses there and everything seems to work. I'll still look into all that and try to look into the actual implications of using --no-relax. I simply don't know what it is at all (but it works). – user123 Aug 20 '21 at 15:52
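
The point in the comments about 32-bit absolute addresses can be made concrete with a small check. This is a sketch, not compiler or linker logic: `FitsSignExtended32` and `FitsZeroExtended32` are hypothetical helpers that only test whether a 64-bit address can be encoded as a sign-extended or zero-extended 32-bit value, and the two kernel addresses are the ones mentioned in this thread.

```cpp
#include <cstdint>

// True if addr equals some 32-bit value sign-extended to 64 bits, i.e. it
// lies in the low 2GiB or the top 2GiB of the address space.
constexpr bool FitsSignExtended32(std::uint64_t addr) {
    return addr <= 0x7fffffffULL || addr >= 0xffffffff80000000ULL;
}

// True if addr equals some 32-bit value zero-extended to 64 bits,
// i.e. it lies in the low 4GiB.
constexpr bool FitsZeroExtended32(std::uint64_t addr) {
    return addr <= 0xffffffffULL;
}

// Address mentioned for Linux in the comments: top 2GiB, encodable.
static_assert(FitsSignExtended32(0xffffffff80000000ULL), "top 2GiB");
// Link address used in this question: not encodable in 32 bits either way.
static_assert(!FitsSignExtended32(0xffff800001000000ULL), "too far down");
static_assert(!FitsZeroExtended32(0xffff800001000000ULL), "above 4GiB");
```

This fits with the "relocation truncated" errors seen with `-fno-pie`: an address near 0xffff800000000000 cannot be stored in a 32-bit absolute relocation. A kernel linked into the top 2GiB can use sign-extended 32-bit absolute addresses, which is the range GCC's `-mcmodel=kernel` is documented to assume.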

0 Answers