How to do computations with addresses at compile/linking time?

Question

I wrote some code for initializing the IDT, which stores 32-bit addresses in two non-adjacent 16-bit halves. The IDT can be stored anywhere, and you tell the CPU where by running the LIDT instruction.

This is the code for initializing the table:

void idt_init(void) {
    /* Unfortunately, we can't write this as loops. The first option,
     * initializing the IDT with the addresses, here looping over it, and
     * reinitializing the descriptors didn't work because assigning a
     * a uintptr_t (from (uintptr_t) handler_func) to a descr (a.k.a.
     * uint64_t), according to the compiler, "isn't computable at load
     * time."
     * The second option, storing the addresses as a local array, simply is
     * inefficient (took 0.020ms more when profiling with the "time" command
     * line program!).
     * The third option, storing the addresses as a static local array,
     * consumes too much space (the array will probably never be used again
     * during the whole kernel runtime).
     * But IF my argument against the third option will be invalidated in
     * the future, THEN it's the best option I think. */

    /* Initialize descriptors of exception handlers. */
    idt[EX_DE_VEC] = idt_trap(ex_de);
    idt[EX_DB_VEC] = idt_trap(ex_db);
    idt[EX_NMI_VEC] = idt_trap(ex_nmi);
    idt[EX_BP_VEC] = idt_trap(ex_bp);
    idt[EX_OF_VEC] = idt_trap(ex_of);
    idt[EX_BR_VEC] = idt_trap(ex_br);
    idt[EX_UD_VEC] = idt_trap(ex_ud);
    idt[EX_NM_VEC] = idt_trap(ex_nm);
    idt[EX_DF_VEC] = idt_trap(ex_df);
    idt[9] = idt_trap(ex_res);  /* unused Coprocessor Segment Overrun */
    idt[EX_TS_VEC] = idt_trap(ex_ts);
    idt[EX_NP_VEC] = idt_trap(ex_np);
    idt[EX_SS_VEC] = idt_trap(ex_ss);
    idt[EX_GP_VEC] = idt_trap(ex_gp);
    idt[EX_PF_VEC] = idt_trap(ex_pf);
    idt[15] = idt_trap(ex_res);
    idt[EX_MF_VEC] = idt_trap(ex_mf);
    idt[EX_AC_VEC] = idt_trap(ex_ac);
    idt[EX_MC_VEC] = idt_trap(ex_mc);
    idt[EX_XM_VEC] = idt_trap(ex_xm);
    idt[EX_VE_VEC] = idt_trap(ex_ve);

    /* Initialize descriptors of reserved exceptions.
     * Thankfully we compile with -std=c11, so declarations within
     * for-loops are possible! */
    for (size_t i = 21; i < 32; ++i)
        idt[i] = idt_trap(ex_res);

    /* Initialize descriptors of hardware interrupt handlers (ISRs). */
    idt[INT_8253_VEC] = idt_int(int_8253);
    idt[INT_8042_VEC] = idt_int(int_8042);
    idt[INT_CASC_VEC] = idt_int(int_casc);
    idt[INT_SERIAL2_VEC] = idt_int(int_serial2);
    idt[INT_SERIAL1_VEC] = idt_int(int_serial1);
    idt[INT_PARALL2_VEC] = idt_int(int_parall2);
    idt[INT_FLOPPY_VEC] = idt_int(int_floppy);
    idt[INT_PARALL1_VEC] = idt_int(int_parall1);
    idt[INT_RTC_VEC] = idt_int(int_rtc);
    idt[INT_ACPI_VEC] = idt_int(int_acpi);
    idt[INT_OPEN2_VEC] = idt_int(int_open2);
    idt[INT_OPEN1_VEC] = idt_int(int_open1);
    idt[INT_MOUSE_VEC] = idt_int(int_mouse);
    idt[INT_FPU_VEC] = idt_int(int_fpu);
    idt[INT_PRIM_ATA_VEC] = idt_int(int_prim_ata);
    idt[INT_SEC_ATA_VEC] = idt_int(int_sec_ata);

    for (size_t i = 0x30; i < IDT_SIZE; ++i)
        idt[i] = idt_trap(ex_res);
}

The macros idt_trap and idt_int, and are defined as follows:

#define idt_entry(off, type, priv) \
    ((descr) (uintptr_t) (off) & 0xffff) | ((descr) (KERN_CODE & 0xff) << \
    0x10) | ((descr) ((type) & 0x0f) << 0x28) | ((descr) ((priv) & \
    0x03) << 0x2d) | (descr) 0x800000000000 | \
    ((descr) ((uintptr_t) (off) & 0xffff0000) << 0x30)

#define idt_int(off) idt_entry(off, 0x0e, 0x00)
#define idt_trap(off) idt_entry(off, 0x0f, 0x00)

idt is an array of uint64_t, so these macros are implicitly cast to that type. uintptr_t is the type guaranteed to be capable of holding pointer values as integers and on 32-bit systems usually 32 bits wide. (A 64-bit IDT has 16-byte entries; this code is for 32-bit).

I get the warning that the initializer element is not constant due to the address modification in play.
It is absolutely sure that the address is known at linking time.
Is there anything I can do to make this work? Making the idt array automatic would work but this would require the whole kernel to run in the context of one function and this would be some bad hassle, I think.

I could make this work by some additional work at runtime (as Linux 0.01 also does) but it just annoys me that something technically feasible at linking time is actually infeasible.

How often would such an operation actually give you a predictable and useful/safe result, especially if the address isn't known at compile time? — Dmitri, Jul 11 '15 at 19:26
And what would the linker do with a divided address that isn't a natural machine address? For example, in 32-bit architecture `4 / 2` isn't a valid aligned address. — Weather Vane, Jul 11 '15 at 19:31
@cad: Well, you could trivially achieve your goal with only the tiniest amount of indirection: load the address, then divide it by two. — Kerrek SB, Jul 11 '15 at 19:46
Since the assembler and linker are more or less independent, how can the assembler deal with such an expression without predicting what the linker will do? And if it were somehow passed on to the linker, there'd need to be some sort of standard for how to do that (as the linker doesn't necessarily have knowledge of the original source code or the language used). And what if the actual address isn't determined until *runtime*? — Dmitri, Jul 11 '15 at 19:51
@KerrekSB I set up the question newly, BTW. Could you take a look at it and tell me if it's better now, please? — cadaniluk, Dec 30 '15 at 16:21
@cad: Update the language tags. The code you have looks like perfectly valid C++, though it would be invalid C for the reason you mention. — Kerrek SB, Dec 30 '15 at 22:43
@KerrekSB Done. But why does this compile in C++? I even tried something similar in GNU as and it didn't work. How does C++ solve the problem? — cadaniluk, Dec 30 '15 at 22:47
@Dmitri I don't understand how it would not give you a predictable result at all. And yes, there should be some standard for that. The assembler would assemble it and store the operation on the address somehow in the object file. The linker determines the absolute address and performs the operation on that address. — cadaniluk, Dec 30 '15 at 22:51
@WeatherVane The result isn't used for addressing memory. It is stored as a value instead, so I don't think this will be problematic. — cadaniluk, Dec 30 '15 at 22:52
@cad: C++ has a more expressive notion of "compile-time constant expression", so you can say `const int a = 2; const int b = a / 2;`, and everything is determined statically. On top of that, C++ permits dynamic initialization of global variables. C allows neither. — Kerrek SB, Dec 31 '15 at 01:02
@KerrekSB: Remember, this binary is going to be loaded by a bootloader as a kernel. There won't be CRT startup functions to call constructors for globals unless the OP writes some code to call those hooks. I think this is still going to be a problem: the addresses won't be resolved until link time, and AFAIK, normal object-file formats don't support filling in "top half of 's address". Some custom processing would be needed to produce a binary with a static chunk of data whose address you can just pass to `LIDT`. This could happen at build-time, but still custom. — Peter Cordes, Dec 31 '15 at 01:49
My apologies for being dense, but *where* exactly is the compile time initialization happening? The code seems to be initializing the array at run time in the function `idt_init()`. No? — Ziffusion, Dec 31 '15 at 02:00
@Ziffusion: yup, I was confused at first, too. The OP is hoping for a way to write this to initialize a global, but as written it's inside a function. — Peter Cordes, Dec 31 '15 at 02:27
@Ziffusion It's supposed to be a small kernel, just for my own use. No RTOS, no Linux. Just an x86 kernel. — cadaniluk, Jan 02 '16 at 12:30
@cad I have an answer below which should allow you to use compile time constant addresses to initialize the `IDT`. But tell me, what is your argument against initializing this at the run time? After all, you have to use the `LIDT` instruction at run time anyways. What's the problem with initializing `IDT` right before that? — Ziffusion, Jan 02 '16 at 16:37
@Ziffusion Just that something theoretically feasible at compile time is done at runtime. I know that it wouldn't be drastically inefficient and all but that's just... disappointing. — cadaniluk, Jan 02 '16 at 16:45
Have you looked at my answer below? Isn't that what you are looking for? I thought that was what you were asking. — Ziffusion, Jan 02 '16 at 16:49
@Ziffusion Yes, yes, may be, I'll definitely look at all answers. I hadn't enough time for it yet. — cadaniluk, Jan 02 '16 at 16:51
Ah sure. Just curious if this is in fact what you were asking.Or otherwise why it would not work for you. — Ziffusion, Jan 02 '16 at 16:57
Howdy, I realize this question is a few years old, but you and this question seem to be active. May I ask, do you use a linker script when building your kernel/OS? — Michael Petch, Oct 02 '19 at 19:18
@MichaelPetch Yup, did use one. I'd have to dig pretty deep to find it, though, if you want further information. — cadaniluk, Oct 03 '19 at 12:50
I'm asking the questions to see how much of a fit this question is and the answer I have for this related question: https://stackoverflow.com/questions/58192042/solution-needed-for-building-a-static-idt-and-gdt-at-assemble-compile-link-time . I'm considering writing a solution here for your answer that demonstrates the technique used in the answer to the other question. I could show how that solution can easily be adapted for your GCC code.It is a solution that doesn't require runtime code to fix up an IDT or GDT. The solution I propose uses the linker to generate the tables. — Michael Petch, Oct 03 '19 at 12:57
One last question. What was the value for IDT_SIZE. Had me curious if it was necessary to set all the unused exceptions after 0x30 to `ex_res`. Usually if someone uses/access an interrupt outside the IDT a GPF will be raised and the error code will be nonzero which includes whether the descriptor was in the IDT/GDT/LDT and what selector number caused the fault. From that you can decide to do something telling the user about an invalid selector. — Michael Petch, Oct 03 '19 at 15:42
@MichaelPetch I did not use multiboot and `IDT_SIZE` was set to `256`. — cadaniluk, Oct 03 '19 at 16:18
So then I gather you wrote some kind of custom bootloader, read your kernel from disk, enabled A20, entered protected mode and then started executing the kernel's entry point in C? — Michael Petch, Oct 03 '19 at 16:28

Peter Cordes · Accepted Answer · 2023-02-20T18:28:10.950

Related: Solution needed for building a static IDT and GDT at assemble/compile/link time - a linker script for ld can shift and mask to break up link-time-constant addresses. No earlier step has the final addresses, and relocation entries are limited in what they can represent in a .o.

The main problem is that function addresses are link-time constants, not strictly compile time constants. The compiler can't just get 32b binary integers and stick that into the data segment in two separate pieces. Instead, it has to use the object file format to indicate to the linker where it should fill in the final value (+ offset) of which symbol when linking is done. The common cases are as an immediate operand to an instruction, a displacement in an effective address, or a value in the data section. (But in all those cases it's still just filling in 32-bit absolute address so all 3 use the same ELF relocation type. There's a different relocation for relative displacements for jump / call offsets.)

It would have been possible for ELF to have been designed to store a symbol reference to be substituted at link time with a complex function of an address (or at least high / low halves like on MIPS for lui $t0, %hi(symbol) / ori $t0, $t0, %lo(symbol) to build address constants from two 16-bit immediates). But in fact the only function allowed is addition/subtraction, for use in things like mov eax, [ext_symbol + 16].

It is of course possible for your OS kernel binary to have a static IDT with fully resolved addresses at build time, so all you need to do at runtime is execute a single lidt instruction. However, the standard build toolchain is an obstacle. You probably can't achieve this without post-processing your executable.

e.g. you could write it this way, to produce a table with the full padding in the final binary, so the data can be shuffled in-place:

#include <stdint.h>

#define PACKED __attribute__((packed))

// Note, this is the 32-bit format.  64-bit is larger    
typedef union idt_entry {

    // we will postprocess the linker output to have this format
    // (or convert at runtime)
    struct PACKED runtime {   // from OSdev wiki
       uint16_t offset_1; // offset bits 0..15
       uint16_t selector; // a code segment selector in GDT or LDT
       uint8_t zero;      // unused, set to 0
       uint8_t type_attr; // type and attributes, see below
       uint16_t offset_2; // offset bits 16..31
    } rt;

    // linker output will be in this format
    struct PACKED compiletime {
       void *ptr; // offset bits 0..31
       uint8_t zero;
       uint8_t type_attr;
       uint16_t selector; // to be swapped with the high16 of ptr
    } ct;
} idt_entry;

// #define idt_ct_entry(off, type, priv) { .ptr = off, .type_attr = type, .selector = priv }
#define idt_ct_trap(off) { .ct = { .ptr = off, .type_attr = 0x0f, .selector = 0x00 } }
// generate an entry in compile-time format

extern void ex_de();  // these are the raw interrupt handlers, written in ASM
extern void ex_db();  // they have to save/restore *all* registers, and end with  iret, rather than the usual C ABI.

// it might be easier to use asm macros to create this static data, 
// just so it can be in the same file and you don't need cross-file prototypes / declarations
// (but all the same limitations about link-time constants apply)
static idt_entry idt[] = {
    idt_ct_trap(ex_de),
    idt_ct_trap(ex_db),
    // ...
};

// having this static probably takes less space than instructions to write it on the fly
// but not much more.  It would be easy to make a lidt function that took a struct pointer.
static const struct PACKED  idt_ptr {
  uint16_t len;  // encoded as bytes - 1, so 0xffff means 65536
  void *ptr;
} idt_ptr = { sizeof(idt) - 1, idt };


/****** functions *********/

// inline
void load_static_idt(void) {
  asm volatile ("lidt  %0"
               : // no outputs
               : "m" (idt_ptr));
  // memory operand, instead of writing the addressing mode ourself, allows a RIP-relative addressing mode in 64bit mode
  // also allows it to work with -masm=intel or not.
}

// Do this once at at run-time
// **OR** run this to pre-process the binary, after link time, as part of your build
void idt_convert_to_runtime(void) {
#ifdef DEBUG
  static char already_done = 0;  // make sure this only runs once
  if (already_done)
    error;
  already_done = 1;
#endif
  const int count = sizeof idt / sizeof idt[0];
  for (int i=0 ; i<count ; i++) {
    uint16_t tmp1 = idt[i].rt.selector;
    uint16_t tmp2 = idt[i].rt.offset_2;
    idt[i].rt.offset_2 = tmp1;
    idt[i].rt.selector = tmp2;
    // or do this swap in fewer insns with SSE or MMX pshufw, but using vector instructions before setting up the IDT may be insane.
  }
}

This does compile. See a diff of the -m32 and -m64 asm output on the Godbolt compiler explorer. Look at the layout in the data section (note that .value is a synonym for .short, and is 16 bits.) (But note that the IDT table format is different for 64-bit mode.)

I think I have the size calculation correct (bytes - 1), as documented in http://wiki.osdev.org/Interrupt_Descriptor_Table. Minimum value 100h bytes long (encoded as 0x99). See also https://en.wikibooks.org/wiki/X86_Assembly/Global_Descriptor_Table. (lgdt size/pointer works the same way, although the table itself has a different format.)

The other option, instead of having the IDT static in the data section, is to have it in the bss section, with the data stored as immediate constants in a function that will initialize it (or in an array read by that function).

Either way, that function (and its data) can be in a .init section whose memory you re-use after it's done. (Linux does this to reclaim memory from code and data that's only needed once, at startup.) This would give you the optimal tradeoff of small binary size (since 32b addresses are smaller than 64b IDT entries), and no runtime memory wasted on code to set up the IDT. A small loop that runs once at startup is negligible CPU time. (The version on Godbolt fully unrolls because I only have 2 entries, and it embeds the address into each instruction as a 32-bit immediate, even with -Os. With a large enough table (just copy/paste to duplicate a line) you get a compact loop even at -O3. The threshold is lower for -Os.)

Without memory-reuse haxx, probably a tight loop to rewrite 64b entries in place is the way to go. Doing it at build time would be even better, but then you'd need a custom tool to run the tranformation on the kernel binary.

Having the data stored in immediates sounds good in theory, but the code for each entry would probably total more than 64b, because it couldn't loop. The code to split an address into two would have to be fully unrolled (or placed in a function and called). Even if you had a loop to store all the same-for-multiple-entries stuff, each pointer would need a mov r32, imm32 to get the address in a register, then mov word [idt+i + 0], ax / shr eax, 16 / mov word [idt+i + 6], ax. That's a lot of machine-code bytes.

I updated the question. Could you take a look at it again and update your answer according to the changes I made in my question, please? I started a tiny bounty, so... ;-) — cadaniluk, Dec 30 '15 at 16:36
Nice explanation, multiple approaches, with one very sweet `union` solution (dammit, I always forget about them)... have it all. — cadaniluk, Jan 06 '16 at 13:10
`sizeof idt / sizeof idt[0]` is not quite right. This would prevent the maximum number of descriptors being placed into the IDT (if it got that large). To avoid this problem it is assumed a true size 0 doesn't exist. The length should have 1 subtracted from it. This also applies to the structure loaded into the GDTR by LGDT as well. Size is the size minus 1 so that a maximum value of 65536 can be represented. — Michael Petch, Sep 02 '17 at 22:11
@MichaelPetch: thanks, updated it to `max_index` instead of `len`. — Peter Cordes, Sep 02 '17 at 22:27

score 4 · Answer 2 · answered Dec 31 '15 at 19:56

One way would be to use an intermediate jump table that is located at a fixed address. You could initialize the idt with addresses of the locations in this table (which will be compile time constant). The locations in the jump table will contain jump instructions to the actual isr routines.

The dispatch to an isr will be indirect as follows:

trap -> jump to intermediate address in the idt -> jump to isr

One way to create a jump table at a fixed address is as follows.

Step 1: Put jump table in a section

// this is a jump table at a fixed address
void jump(void) __attribute__((section(".si.idt")));

void jump(void) {
    asm("jmp isr0"); // can also be asm("call ...") depending on need
    asm("jmp isr1");
    asm("jmp isr2");
}

Step 2: Instruct the linker to locate the section at a fixed address

SECTIONS
{
  .so.idt 0x600000 :
  {
    *(.si.idt)
  }
}

Put this in the linker script right after the .text section. This will make sure that the executable code in the table will go into a executable memory region.

You can instruct the linker to use your script as follows using the --script option in the Makefile.

LDFLAGS += -Wl,--script=my_script.lds

The following macro gives you the address of the location which contains the jump (or call) instruction to the corresponding isr.

// initialize the idt at compile time with const values
// you can find a cleaner way to generate offsets
#define JUMP_ADDR(off)  ((char*)0x600000 + 4 + (off * 5))

You will then initialize the idt as follows using modified macros.

// your real idt will be initialized as follows

#define idt_entry(addr, type, priv) \
    ( \
        ((descr) (uintptr_t) (addr) & 0xffff) | \
        ((descr) (KERN_CODE & 0xff) << 0x10) | \
        ((descr) ((type) & 0x0f) << 0x28) | \
        ((descr) ((priv) & 0x03) << 0x2d) | \
        ((descr) 0x1 << 0x2F) | \
        ((descr) ((uintptr_t) (addr) & 0xffff0000) << 0x30) \
    )

#define idt_int(off)    idt_entry(JUMP_ADDR(off), 0x0e, 0x00)
#define idt_trap(off)   idt_entry(JUMP_ADDR(off), 0x0f, 0x00)

descr idt[] =
{
    ...
    idt_trap(ex_de),
    ...
    idt_int(int_casc),
    ...
};

A demo working example is below, which shows dispatch to a isr with a non-fixed address from a instruction at a fixed address.

#include <stdio.h>

// dummy isrs for demo
void isr0(void) {
    printf("==== isr0\n");
}

void isr1(void) {
    printf("==== isr1\n");
}

void isr2(void) {
    printf("==== isr2\n");
}

// this is a jump table at a fixed address
void jump(void) __attribute__((section(".si.idt")));

void jump(void) {
    asm("jmp isr0"); // can be asm("call ...")
    asm("jmp isr1");
    asm("jmp isr2");
}

// initialize the idt at compile time with const values
// you can find a cleaner way to generate offsets
#define JUMP_ADDR(off)  ((char*)0x600000 + 4 + (off * 5))

// dummy idt for demo
// see below for the real idt
char* idt[] =
{
    JUMP_ADDR(0),
    JUMP_ADDR(1),
    JUMP_ADDR(2),
};

int main(int argc, char* argv[]) {
    int trap;
    char* addr = idt[trap = argc - 1];
    printf("==== idt[%d]=%p\n", trap, addr);
    asm("jmp *%0\n" : :"m"(addr));
}

Nice idea, +1. Technically applicable, but I don't like this extra level of indirection. And I don't know if this mechanism will be good for future maintenance. — cadaniluk, Jan 06 '16 at 13:01
in `main`, you could just declare `addr` as a function pointer, and dereference it. Tail-call optimization should turn it into an indirect jump. (except it won't because `main` has to return 0 if there's no explicit return.) You could at least use an `"rm"` constraint, to allow a register-indirect jump. Interesting idea, though, to add a level of indirection through a jump table that lives at a constant address, so the IDT entries can truly be compile-time constants. — Peter Cordes, Jan 18 '16 at 04:35

How to do computations with addresses at compile/linking time?

2 Answers2

Linked

Related