
I am attempting to define a constant IDT (Interrupt Descriptor Table) entry in NASM, and to do so, I need to emit into a data table the high word of a double-word address that is not resolved until link time. Is there a way to do it?

Here's the interrupt handler:

;;; Interrupt 3 (breakpoint) handler.  For now, just poke the screen and halt.

        align   8
int3:
        mov     [0xb8000],dword '* * '
        hlt

And here's the IDT entry that references it. The most-significant and least-significant words of the offset need to be stored separately and non-contiguously:

        ;; Interrupt 3 - breakpoint
        dw      int3                    ; offset (low)    <---- WORKS
        dw      codesel                 ; code selector
        db      0                       ; unused
        db      0b10001111              ; present, ring 0, 32-bit trap gate
        dw      int3 >> 16              ; offset (high)   <---- ASSEMBLY ERROR

NASM correctly causes LD to emit the low word of int3's address, but the high word fails at assembly time with this error:

pgm.asm:240: error: shift operator may only be applied to scalar values

NASM won't do math with a value that isn't defined until link time. I understand, but I need a way to work around this. I could:

  • Locate int3 at an absolute address
  • Build the IDT at runtime instead of assembly time

I'll probably end up building the IDT at runtime, but it'd be good to know if there is a way to cause the assembler/linker to emit into a data table the high word of an address that is not resolved until link time.
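For reference, the runtime option can be sketched roughly as follows: leave both offset words of the gate zero at assembly time and fill them in before the IDT is loaded. This is only an illustration under stated assumptions: the names `set_gate_offset`, `fix_idt`, and `idt` are hypothetical, `codesel` is given a placeholder value, and the IDT is assumed to start with vector 0 so that vector 3 sits 3*8 bytes in.

```nasm
        bits    32

codesel equ     0x08            ; ASSUMPTION: placeholder code selector value

        section .text

;;; Interrupt 3 (breakpoint) handler, as in the question.
int3:
        mov     dword [0xb8000], '* * '
        hlt

;;; Hypothetical helper: write a handler address into an 8-byte gate.
;;; In:  EAX = handler address, EDI = address of the gate
;;; Clobbers EAX.
set_gate_offset:
        mov     [edi], ax               ; offset bits 15..0  -> gate bytes 0-1
        shr     eax, 16
        mov     [edi + 6], ax           ; offset bits 31..16 -> gate bytes 6-7
        ret

;;; Hypothetical fixup routine: call once before executing LIDT.
fix_idt:
        mov     eax, int3               ; full 32-bit address, resolved by the linker
        mov     edi, idt + 3*8          ; gate for vector 3
        call    set_gate_offset
        ret

        section .data

        align   8, db 0
idt:
        times 3 dq 0                    ; vectors 0-2, left empty in this sketch
        ;; Interrupt 3 - breakpoint
        dw      0                       ; offset (low), patched by fix_idt
        dw      codesel                 ; code selector
        db      0                       ; unused
        db      0b10001111              ; present, ring 0, 32-bit trap gate
        dw      0                       ; offset (high), patched by fix_idt
```

Only whole 32-bit addresses are handed to the linker here, so no link-time arithmetic is needed; the cost is a few instructions of setup at boot.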


Particulars:

  • NASM 2.20.01 [1]
  • NASM output format aout
  • LD version 2.22
  • 32-bit mode (NASM "bits 32" directive issued)

[1] This is probably a typo; the latest version in my distro today is 2.12.01. The latest version of NASM available at the time I wrote this question was 2.10.01.

Michael Petch
Wayne Conrad
  • 1
    See also: http://stackoverflow.com/questions/12861843/statically-defined-idt?rq=1 . Strictly speaking, this question is a duplicate of that one -- it's the same problem, with the same cause (ld + swizzled IDT) but different languages. – Wayne Conrad May 03 '13 at 12:43
  • The NASM version was probably 2.10.01 or such, there is no 2.20 series yet. The version 2.10.01 was released on at 2012-05-25 01:00 +0400, https://repo.or.cz/nasm.git/commitdiff/3d1d159e1c876308712fd5e21089dfddfbad1e69 – ecm Sep 30 '19 at 19:46
  • @ecm It must have been a typo. I'll add a note, thanks. – Wayne Conrad Sep 30 '19 at 20:48
  • 1
    Related: a C version of the same problem: [How to do computations with addresses at compile/linking time?](//stackoverflow.com/q/31360888). ELF doesn't have a relocation for this so you're basically screwed; you could have the OS do fixups after loading itself. – Peter Cordes Oct 01 '19 at 03:39
  • 1
    I know this question is over 6 years old, but I recently wrote an answer to a related question that offers up a solution by building IDT and GDTs in a linker script (and the C pre-processor): https://stackoverflow.com/a/58192043/3857942. This method has them built at link time. – Michael Petch Oct 01 '19 at 23:23
  • Yes, it would apply here; my answer there shows a NASM-specific example as well, so it is far more closely matched to this question. That answer can also be used for code generated by C/C++/Rust etc., since the work is being done by the linker script. If you provided a full piece of code in your question (making it an [mcve]), even if it involves NASM and GCC or another language, one could create a specific answer to solve your problem, but I feel like the other question/answer may be a very close duplicate of this. I realize after 6 years you may no longer have an example on hand. – Michael Petch Oct 02 '19 at 17:48
  • 1
    @MichaelPetch There's something particularly satisfying in voting to close your own question in favor of a superior one. Thanks! – Wayne Conrad Oct 02 '19 at 18:33
  • 1
    It isn't superior IMHO. What I did though is rather than ask about the problem, asked for actual solutions. I was hoping that initial question/answer might entice others to provide their own mechanisms and create new answers. It just so happened yours became a duplicate. It is good though that we can clean up the OSDev tag a bit at the same time. – Michael Petch Oct 02 '19 at 18:37
  • 1
    @MichaelPetch I know you did, and thank you. I wasn't being sarcastic--I think your q/a is very good, and I'm glad for mine to be a dup of it. – Wayne Conrad Oct 02 '19 at 19:05
  • 1
    Didn't think there was sarcasm at all! – Michael Petch Oct 02 '19 at 19:08

1 Answer


Well... as you probably know, Nasm will condescend to do a shift on the difference between two labels. The usual construct is something like:

dw (int3 - $$) >> 16

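where $$ refers to the beginning of the current section. This calculates the offset of int3 from the start of that section (the "file offset"), which is a plain number that Nasm is willing to shift. It is probably not the value you want to shift, though, since it does not include the address the section is loaded at.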

dw (int3 - $$ + ORIGIN) >> 16

may do what you want... where ORIGIN is... well, what we told Nasm for org, if we were using flat binary. I ASSume you're assembling to -f elf32 or -f elf64, telling ld --oformat=binary, and telling ld either in a linker script or on the command line where you want .text to be (?). This seems to work. I made an interesting discovery: if you tell ld -oformat=binary (one hyphen) instead of --oformat=binary (two hyphens), ld silently outputs nothing! Don't do this - you waste a lot of time!
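Put together, the gate from the question might look something like the sketch below. It rests on several assumptions that are not in the question: ORIGIN is given a made-up value and must match whatever address ld is told to use for .text, this object's .text is assumed to be the first (or only) .text contribution in the link so that $$ really does land at ORIGIN, the table lives in the same section as int3 so that int3 - $$ is a plain number, and `codesel` has a placeholder value. The label `idt_int3` is hypothetical.

```nasm
        bits    32

ORIGIN  equ     0x00100000      ; ASSUMPTION: address ld places this .text at
codesel equ     0x08            ; ASSUMPTION: placeholder code selector value

        section .text

;;; Interrupt 3 (breakpoint) handler, as in the question.
        align   8
int3:
        mov     dword [0xb8000], '* * '
        hlt

;;; The gate is kept in the same section as int3, so (int3 - $$) is a
;;; scalar and Nasm will do arithmetic on it at assembly time.
        align   8
idt_int3:
        ;; Interrupt 3 - breakpoint
        dw      (int3 - $$ + ORIGIN) & 0xffff   ; offset (low)
        dw      codesel                         ; code selector
        db      0                               ; unused
        db      0b10001111                      ; present, ring 0, 32-bit trap gate
        dw      (int3 - $$ + ORIGIN) >> 16      ; offset (high)
```

The catch is that the result is only right if ORIGIN and the linker agree; if the section is moved and ORIGIN is not updated, the gate silently points at the wrong address, because no relocation is involved.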

Frank Kotler
  • 1
    Thanks for the answer. I'm assembling to "-f aout", but "-f elf32" doesn't change anything: either way, ORG is rejected by nasm since I'm not generating a flat binary. I guess that points to the "absolutely locate (part of) my program" workaround, if I don't want to build the IDT at runtime. Re "Don't do this": :) – Wayne Conrad May 03 '13 at 12:32
  • `ld` has a lot of one-letter options, including `-o output_filename`. So `ld -o 'format=binary'` is how it parses with only one dash. – Peter Cordes Oct 01 '19 at 03:37