
I have a small hobby OS that I boot with UEFI. I set up Intel's xHC to trigger interrupts using MSI-X, and then I reset all root hub ports, which triggers 2 Port Status Change Events and one interrupt. The interrupt handler does nothing for now; it simply attempts to return to the normal flow of execution using iretq in inline assembly.

This was failing until I had the idea of inspecting the stack to see what was making it fail. I found that something seems to be pushed on the stack by the MSI-X functionality of the xHC. I simply had to do one pop operation, or increment RSP by 8, to make it work.

My questions are:

  1. What is it that the xHC pushes on the stack?

  2. Where is it documented in the specification? Is it in the xHCI spec or the PCI spec? (I don't have access to the latter).

  3. Is it a convention that all PCI devices must follow, or is it specific to Intel's xHC?

Peter Cordes
user123
  • To answer your question more specifically: neither the PCIe spec nor the xHCI spec has any say about what the CPU does when it receives an interrupt. That is completely described by the SDM. – prl Jun 30 '21 at 20:51
  • I do use a different handler for each vector. I have a printf function which prints the type of handler that has been called, like gpf for a gp. The handler called in this case is handler 48, which has nothing to do with exceptions. It prints "hci" on the screen. It then increments rsp by 8 and uses iretq to return to the normal flow of execution. There doesn't seem to be a fault, since nothing other than hci gets printed on the screen. At the "root" of the kernel, where the stack is equal to what I set it to, the word ready is also printed just before halting. – user123 Jun 30 '21 at 20:56
  • The word ready is printed properly. It means that handler 48 is called and then the flow of execution comes back through a chain of function calls and then ready is printed like I would expect at the root of the kernel. – user123 Jun 30 '21 at 20:58
  • Yes, I also thought I should look at the developer's manual to know what is pushed on the stack. – user123 Jun 30 '21 at 21:00
  • What value is pushed exactly? As prl said, there should not be any extra QWORD. MSI-Xs are just translated to APIC messages by the PCI root complex/system agent, so they are equivalent to normal IPIs (which don't push extra data). Are you sure it's not `printf` that leaves something on the stack? – Margaret Bloom Jun 30 '21 at 21:26
  • It is definitely not printf: when I remove it, execution still jumps to the gpf handler. I get a general protection fault on the iretq instruction afterwards, because what is on top of the stack isn't the right return address. – user123 Jul 01 '21 at 05:31
  • In the handler I have a stack which is 160 bytes deep, as if a lot of functions had been called. Actually, there are only 2. The ports are reset at a depth of maybe 2 from the root of the kernel. The stack doesn't seem consistent with the execution flow, especially since the functions don't have arguments. – user123 Jul 01 '21 at 05:34

1 Answer


I found the culprit: I simply hadn't marked my interrupt handlers with __attribute__((interrupt)). G++ was pushing rbp at function entry as part of the normal function prologue, and that was messing up my interrupt stack. I removed the iretq, and I now let g++ do the job of returning from the interrupt correctly.
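To illustrate why one extra pop made the original handler work, here is a minimal userspace model of the stack. It is only a sketch: `ModelStack`, `Frame`, and `iretq_after_prologue` are names made up for this example, the values are arbitrary, and the prologue is assumed to be a single `push rbp`, as observed in the question.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Userspace model only: a stack of 64-bit slots, where back() is the top.
struct ModelStack {
    std::vector<std::uint64_t> slots;
    void push(std::uint64_t v) { slots.push_back(v); }
    std::uint64_t pop() {
        assert(!slots.empty());
        std::uint64_t v = slots.back();
        slots.pop_back();
        return v;
    }
};

// What the CPU pushes in 64-bit mode for an interrupt without an error code.
struct Frame {
    std::uint64_t rip, cs, rflags, rsp, ss;
};

// Hypothetical helper: models interrupt delivery, a compiler-generated
// `push rbp`, an optional manual pop, and then iretq.
Frame iretq_after_prologue(const Frame& interrupted,
                           std::uint64_t saved_rbp,
                           bool drop_saved_rbp) {
    ModelStack s;

    // CPU pushes SS, RSP, RFLAGS, CS, RIP (RIP ends up on top).
    s.push(interrupted.ss);
    s.push(interrupted.rsp);
    s.push(interrupted.rflags);
    s.push(interrupted.cs);
    s.push(interrupted.rip);

    // Prologue of an ordinary (non-interrupt) function: push rbp.
    s.push(saved_rbp);

    // The workaround from the question: one pop, or add 8 to RSP.
    if (drop_saved_rbp) {
        s.pop();
    }

    // iretq pops RIP, CS, RFLAGS, RSP, SS in that order.
    Frame restored;
    restored.rip = s.pop();
    restored.cs = s.pop();
    restored.rflags = s.pop();
    restored.rsp = s.pop();
    restored.ss = s.pop();
    return restored;
}
```

Calling the helper with `drop_saved_rbp` set to `false` returns a frame whose `rip` is the saved rbp value and whose `cs` is the interrupted RIP, which is consistent with the general protection fault on iretq described in the comments. With `true`, the original frame comes back unchanged.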

Also, I needed the g++ options -mgeneral-regs-only -mno-red-zone to avoid some compiler errors.
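For reference, a handler of this kind might look roughly like the sketch below. The names `interrupt_frame`, `xhci_interrupt_handler`, `xhci_interrupt_count`, and the guard macro `KERNEL_HANDLER_SKETCH` are assumptions made for this example and are not taken from my kernel. The guarded part is meant to be built only for the kernel, with something like `g++ -ffreestanding -mgeneral-regs-only -mno-red-zone -DKERNEL_HANDLER_SKETCH`; without the macro, only the layout checks are compiled.

```cpp
#include <cstddef>
#include <cstdint>

// Layout of what the CPU pushes in 64-bit mode for an interrupt
// without an error code, lowest address (top of stack) first.
struct interrupt_frame {
    std::uint64_t rip;
    std::uint64_t cs;
    std::uint64_t rflags;
    std::uint64_t rsp;
    std::uint64_t ss;
};

static_assert(sizeof(interrupt_frame) == 40, "five 8-byte slots");
static_assert(offsetof(interrupt_frame, rip) == 0, "RIP is on top at entry");

#if defined(KERNEL_HANDLER_SKETCH)  // hypothetical macro, kernel build only

// Hypothetical counter so that the handler has an observable effect.
volatile std::uint64_t xhci_interrupt_count = 0;

// With this attribute the compiler saves the registers it uses and
// returns with iretq itself, so no inline-asm iretq is needed.
__attribute__((interrupt))
void xhci_interrupt_handler(interrupt_frame* frame) {
    (void)frame;  // unused in this sketch
    xhci_interrupt_count = xhci_interrupt_count + 1;
}

#endif
```

The address of `xhci_interrupt_handler` would then go into the IDT entry for the vector programmed into the MSI-X table (48 in my case).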

user123
  • The other option (to manually use `iretq` in inline asm) is to use `__attribute__((naked))` on your function, but then the only supported thing is to write the entire function body as a Basic Asm statement (no constraints, no plain C outside the asm). And yes, you need -mno-red-zone for code that uses a stack which handles HW interrupts, and if you don't want to enable SSE and handle context-switching the FPU/SIMD state, also disable using MMX or XMM registers. That's not a "compiler error", although one might wish `-ffreestanding` would imply those by default. – Peter Cordes Jul 01 '21 at 05:58
  • Related: [Why can't kernel code use a Red Zone](https://stackoverflow.com/q/25787408) – Peter Cordes Jul 01 '21 at 06:19
  • Re: pushing RBP: `-fno-omit-frame-pointer` is on by default at `-O0`, and applies to *all* functions, except ones that use `__attribute__((naked))`. – Peter Cordes Jul 01 '21 at 06:20
  • Thank you @Peter Cordes for clarification. – user123 Jul 01 '21 at 14:26