Assembly page fault handler cannot be called due to invalid stack pointer

Question

Before my page fault handler (which is supposed to hang the system) is called, the CPU pushes some values onto the stack. I have paging enabled, and when I set up an invalid stack pointer (esp) and the int 14 handler gets invoked, that push immediately causes another page fault, and so on. How should I resolve this situation?

My int14 code:

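isr14:
    ; interrupt handler for vector 14 (page fault)
    jmp $       ; hang the system
    iretd       ; never reached; a real handler must pop the error code (add esp, 4) before iretd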

The code that causes it to break:

mov esp, 0x1000 ; 0x1000 is not mapped in the VM directory
push dword 'A'
jmp $

Section of my IDT table:

irq14:
    dw isr14
    dw 0x0008
    db 0x00
    db 10101110b
    dw 0x0000

irq15:
........

I'm not super familiar with this, but I recall that the solution is to have a separate safe stack the interrupt descriptor switches to when handling page faults. — fuz, Mar 14 '20 at 21:23
I get that but how am I supposed to switch to that stack before the interrupt is called? — DrCarnivore, Mar 14 '20 at 21:24
I'm a bit fuzzy about the details but I recall that you can encode this somewhere in the interrupt descriptor. Let me check the documentation. — fuz, Mar 14 '20 at 21:26
Define a stack to switch to in the TSS. Then set up an interrupt gate with a nonzero ist field referring to the stack segment/pointer saved in your TSS. This should solve the problem. — fuz, Mar 14 '20 at 21:29
If you managed to solve your problem this way, please write up an answer! – fuz, Mar 14 '20 at 22:40
@fuz: I seem to recall Michael Petch posting an answer with that. Maybe just the fact that TSS is the mechanism that allows kernels to handle interrupts on a kernel stack, not letting user-space crash or own the machine. e.g. [Creating a proper Task State Segment (TSS) structure with and without an IO Bitmap?](https://stackoverflow.com/q/54876039) / [During an x86 software interrupt, when exactly is a context switch made?](https://stackoverflow.com/q/38064966). https://wiki.osdev.org/Getting_to_Ring_3 looks relevant. — Peter Cordes, Mar 14 '20 at 23:14

Brendan · Accepted Answer · 2020-03-15T07:25:40.463

5

How should I resolve this situation?

I'd resolve the situation by avoidance: don't let the kernel have a dodgy stack pointer in the first place (and don't let the kernel stack be sent to swap space, don't use page faults for an "auto-growing kernel stack", etc). Note that the CPU will automatically switch to the kernel stack if a page fault happens in user space (at CPL=3), so it doesn't matter if user space has a dodgy stack pointer.
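
To illustrate the automatic switch: on a CPL=3 to CPL=0 transition in protected mode, the CPU loads SS:ESP from the ss0/esp0 fields of the current 32-bit TSS. The following is a minimal C sketch of that structure, not a complete setup. The helper name tss32_set_kernel_stack and the selector and stack values in the usage note are hypothetical and do not come from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural layout of the 32-bit TSS (104 bytes).
   For the automatic stack switch only ss0/esp0 matter; the saved
   register fields are used by hardware task switching. */
struct tss32 {
    uint16_t prev_task, reserved0;
    uint32_t esp0;                 /* stack pointer loaded on entry to CPL=0 */
    uint16_t ss0, reserved1;       /* stack segment loaded on entry to CPL=0 */
    uint32_t esp1;
    uint16_t ss1, reserved2;
    uint32_t esp2;
    uint16_t ss2, reserved3;
    uint32_t cr3, eip, eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint16_t es, reserved4;
    uint16_t cs, reserved5;
    uint16_t ss, reserved6;
    uint16_t ds, reserved7;
    uint16_t fs, reserved8;
    uint16_t gs, reserved9;
    uint16_t ldt, reserved10;
    uint16_t trap;                 /* bit 0: raise a debug trap on task switch */
    uint16_t iomap_base;           /* offset of the I/O permission bitmap */
};

/* Hypothetical helper: record the kernel stack the CPU should switch to.
   Pointing iomap_base at the end of the structure means "no I/O bitmap"
   when the descriptor limit is sizeof(struct tss32) - 1. */
static void tss32_set_kernel_stack(struct tss32 *t, uint16_t ss0, uint32_t esp0)
{
    t->ss0 = ss0;
    t->esp0 = esp0;
    t->iomap_base = (uint16_t)sizeof(struct tss32);
}
```

A call might look like tss32_set_kernel_stack(&tss, 0x10, kernel_stack_top), where 0x10 is an assumed kernel data selector. The TSS still has to be described in the GDT and loaded with ltr before the CPU will use it.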

Alternatives are:

- force a kernel stack switch when kernel code (CPL=0) causes a page fault. This can be done using hardware task switch (protected mode) or the IST mechanism (long mode) for the page fault exception handler. This would be the best option for recovery (e.g. makes it easier to figure out what the problem was, fix it, then return).
- force a kernel stack switch when kernel code (CPL=0) causes a double fault. This can be done using hardware task switch (protected mode) or the IST mechanism (long mode) for the double fault exception handler. This would be the best option for performance (no added overhead for normal page faults).
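
The protected-mode variant of either alternative can be sketched as follows, assuming a dedicated TSS for the fault handler already exists. An ordinary interrupt gate keeps pushing onto the faulting stack, whereas a task gate names a TSS selector and makes the CPU load a complete register state, including SS:ESP, from that TSS. The function names and selector values here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* One 8-byte entry of a 32-bit protected-mode IDT. */
struct idt_gate32 {
    uint16_t offset_low;   /* handler address bits 15:0 (unused for task gates) */
    uint16_t selector;     /* code selector, or TSS selector for a task gate */
    uint8_t  reserved;     /* must be zero */
    uint8_t  type_attr;    /* P, DPL, gate type */
    uint16_t offset_high;  /* handler address bits 31:16 (unused for task gates) */
};

enum {
    GATE_PRESENT = 0x80,   /* P bit */
    GATE_TASK    = 0x05,   /* task gate */
    GATE_INT32   = 0x0E    /* 32-bit interrupt gate */
};

/* Normal case: the handler runs on whatever stack was active (or on
   ss0:esp0 from the TSS if the fault came from CPL=3). */
static struct idt_gate32 make_interrupt_gate(uint32_t handler, uint16_t code_sel)
{
    struct idt_gate32 g;
    g.offset_low  = (uint16_t)(handler & 0xFFFF);
    g.selector    = code_sel;
    g.reserved    = 0;
    g.type_attr   = GATE_PRESENT | GATE_INT32;   /* 0x8E, DPL=0 */
    g.offset_high = (uint16_t)(handler >> 16);
    return g;
}

/* Forced switch: the CPU performs a hardware task switch to the TSS
   named by tss_sel, so the handler starts on that TSS's own stack. */
static struct idt_gate32 make_task_gate(uint16_t tss_sel)
{
    struct idt_gate32 g = {0, tss_sel, 0, GATE_PRESENT | GATE_TASK, 0};  /* 0x85 */
    return g;
}
```

With this layout, vector 14 (or vector 8 for the double fault) would hold the result of make_task_gate, and the target TSS needs its eip, esp, ss, cs, cr3 and other register fields filled in beforehand, because the CPU loads all of them during the switch.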

Note 1: Be warned that neither hardware task switching/task gates nor IST are re-entrant. For hardware task switching, if a second page fault occurs while you're handling the first page fault you'll get a general protection fault (because the "page fault task" is busy); and for IST, if a second page fault occurs while you're handling the first page fault the second page fault will trash/overwrite the first page fault's stack and make it impossible to recover. In theory, you can mitigate these problems by switching to a different task or different stack as soon as possible, but that's complicated/messy and likely to cause even more problems.

Note 2: You'll probably end up with a combination of avoidance and double fault using hardware task switch or IST; with the double fault handler doing "freeze system and dump info/panic" as a generic fallback for catastrophic kernel failures (that were supposed to be avoided but weren't).

Note 3: If you want to support "auto-growing kernel stacks", you can use "stack probes" instead: basically, just do dummy reads (in function prologues) from the "future stack" before using the memory for stack, so that the page fault occurs while there's still enough kernel stack left for the page fault handler.
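
The stack probe idea can be sketched in C as below. This is only an illustration of the access pattern, assuming 4 KiB pages and a page-aligned low address; the function name probe_stack_range is hypothetical, and a real kernel would typically have the compiler or a small assembly stub emit the probes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* assumed page size */

/* Read one byte from every page of [low, low + len), highest page first,
   which is the order a downward-growing stack would reach them.
   `low` is assumed to be page-aligned. Returns the number of pages read.
   Each read is a dummy access: its only purpose is to trigger any page
   fault now, while the current stack is still usable. */
static size_t probe_stack_range(const volatile uint8_t *low, size_t len)
{
    size_t touched = 0;
    size_t off = len;
    while (off > 0) {
        /* Offset of the start of the page that contains byte off - 1. */
        off = (off - 1) & ~(size_t)(PAGE_SIZE - 1);
        uint8_t sink = low[off];   /* volatile read, value is discarded */
        (void)sink;
        touched++;
    }
    return touched;
}
```

In a kernel, low would be the lowest address the upcoming stack frame is going to use, so that the handler for any resulting page fault still runs on already-mapped stack.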

edited Mar 15 '20 at 07:25

answered Mar 15 '20 at 07:08

Brendan


Thanks! I read that having a TSS (which I haven't set up yet, I do task switching without it) would make the processor switch to a kernel stack when an interrupt happens. I will try to do that. – DrCarnivore Mar 15 '20 at 09:50
Or do I have to have a TSS for that? Is it enough if I just switch rings before entering a faulty process? – DrCarnivore Mar 15 '20 at 09:50
Note that the IST mechanism can also be used in 16 and 32 bit protected mode. It might even be easier to use than a task gate. – fuz Mar 15 '20 at 10:07
How should I set up my TSS? I created an entry but how do I add it to the GDT and then load it so that the CPU knows esp0 when switching rings? – DrCarnivore Mar 15 '20 at 10:18
@DrCarnivore: For protected mode; if you use other privilege levels (e.g. CPL=3) you need at least one TSS to handle privilege level changes (because that's where CPU will load "SS:ESP" from when changing to a "more privileged" privilege level - e.g. from CPL=3 to CPL=0 because of an interrupt or system call). Typically you end up with several hardware tasks per CPU - one main hardware task (that uses software task switching to switch between normal tasks), then more special purpose hardware tasks that don't do software task switching (for double fault, NMI and/or for machine check exception). – Brendan Mar 16 '20 at 08:54
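
As a rough illustration of how a TSS is made known to the CPU in protected mode: it needs an 8-byte system descriptor in the GDT, and the selector of that descriptor is then loaded with ltr. The C sketch below only encodes the descriptor; the function name make_tss_descriptor and the example base address are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Encode a GDT descriptor for an available 32-bit TSS.
   base  = linear address of the TSS
   limit = size of the TSS in bytes minus 1 (103 for a bare 104-byte TSS) */
static uint64_t make_tss_descriptor(uint32_t base, uint32_t limit)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFF);              /* limit bits 15:0 */
    d |= (uint64_t)(base & 0xFFFFFF) << 16;       /* base bits 23:0 */
    d |= (uint64_t)0x89 << 40;                    /* P=1, DPL=0, type 9: available 32-bit TSS */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;   /* limit bits 19:16, G=0 (byte granularity) */
    d |= (uint64_t)(base >> 24) << 56;            /* base bits 31:24 */
    return d;
}
```

If the result is stored in GDT slot n, the matching selector is n * 8, and executing ltr with that selector (after lgdt) tells the CPU where to find ss0 and esp0.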
@fuz: No, you can't use IST in protected mode. You can run 32-bit code in long mode (and can use IST in long mode), but it's awkward for a kernel to do that (the design of long mode expects kernel to be 64-bit so you'd end up with some "thunking" at kernel entry points). – Brendan Mar 16 '20 at 08:57
@Brendan Huh? But the same feature is available for 32 bit interrupt descriptors and a similar table exists in the 32 bit TSS. Or have I misunderstood how this works? – fuz Mar 16 '20 at 08:59
@fuz: The closest thing to IST (in long mode) is hardware task switching (in protected mode). Specifically, when AMD decided hardware task switching won't be supported in long mode they had to invent a viable replacement for use in critical exception handlers (e.g. double fault) to ensure a switch to a "known good" stack, so they invented IST. It wasn't "back-ported" from long mode to protected mode (there's simply no space in TSS for a new table, and no reason to justify the effort given that protected mode is mostly for backward compatibility with old stuff that won't support new features). – Brendan Mar 16 '20 at 09:24
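
For comparison, here is a sketch of where the IST index lives in long mode: each IDT entry is 16 bytes, and a 3-bit field selects one of the IST1 to IST7 stack pointers in the 64-bit TSS (0 means no forced switch). The function name make_gate64 and the example values are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* One 16-byte entry of a long-mode IDT. */
struct idt_gate64 {
    uint16_t offset_low;    /* handler address bits 15:0 */
    uint16_t selector;      /* 64-bit code segment selector */
    uint8_t  ist;           /* bits 2:0 = IST index, other bits reserved */
    uint8_t  type_attr;     /* P, DPL, gate type */
    uint16_t offset_mid;    /* handler address bits 31:16 */
    uint32_t offset_high;   /* handler address bits 63:32 */
    uint32_t reserved;      /* must be zero */
};

/* Build a present, DPL=0, 64-bit interrupt gate. A nonzero ist_index
   (1..7) makes the CPU load RSP from the matching IST slot of the TSS
   on every delivery of this vector, regardless of the current CPL. */
static struct idt_gate64 make_gate64(uint64_t handler, uint16_t code_sel,
                                     uint8_t ist_index)
{
    struct idt_gate64 g;
    g.offset_low  = (uint16_t)(handler & 0xFFFF);
    g.selector    = code_sel;
    g.ist         = (uint8_t)(ist_index & 0x7);
    g.type_attr   = 0x8E;
    g.offset_mid  = (uint16_t)((handler >> 16) & 0xFFFF);
    g.offset_high = (uint32_t)(handler >> 32);
    g.reserved    = 0;
    return g;
}
```

Because the same IST slot is reloaded on each delivery, a nested fault on that vector reuses the same stack top, which is the non-reentrancy described in Note 1 of the answer.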
