I'm writing an emulator and I'm having a bear of a time figuring out exactly what the CPU should do. I have a system that works almost all the time: I can run the NOMMU RV32 Linux kernel, but only if I very carefully pick my CLINT:CPU timer ratio and get really lucky.
I am mimicking what this Stack Overflow answer said as best as I can: RISC-V Interrupt Handling Flow
And, honestly, it seems to work out well. I can run user space programs and have my timer run at 250 Hz, except sometimes the kernel crashes, and more specifically, it's almost always on a `__stack_chk_fail`. After digging, I still have no idea how the kernel's handling of interrupts doesn't crash everywhere, since it looks like the following happens on an interrupt:
- Exchange register `tp` for CSR `CSR_SCRATCH`; if zero (happens first time) set them equal.
- Read `sp`, `s0`, etc... from `tp`'s descriptor.
- Operate on the interrupt.
BUT, `tp` appears to be the same as that of the currently running (interrupted) task. And so it looks like changes to `sp`, and more specifically the stack protector, aren't understood in the interrupt, so it can overwrite memory that the interrupted task was currently operating in. In fact, I've stepped through and observed exactly this.
But my kernel seems to boot in QEMU, so I think the issue is with my code.
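Since the entry sequence above hinges on the `tp`/`CSR_SCRATCH` exchange, here is a minimal C sketch of how an emulator might implement that `csrrw` so that using the same register as source and destination really swaps. The `hart_t` struct and `csrrw_mscratch` function are hypothetical names for illustration, and it assumes `CSR_SCRATCH` maps to `mscratch`, as it does for an M-mode (NOMMU) kernel.

```c
#include <stdint.h>

/* Hypothetical minimal hart state: 32 integer registers plus one CSR. */
typedef struct {
    uint32_t x[32];
    uint32_t mscratch; /* assumed to be what the kernel calls CSR_SCRATCH */
} hart_t;

/* csrrw rd, mscratch, rs1: rd receives the old CSR value and the CSR
 * receives the old rs1 value. Both are sampled before either is written,
 * so rd == rs1 (e.g. tp, which is x4) performs a true exchange. */
static void csrrw_mscratch(hart_t *h, unsigned rd, unsigned rs1)
{
    uint32_t old_csr = h->mscratch;
    uint32_t src = h->x[rs1];

    h->mscratch = src;
    if (rd != 0)
        h->x[rd] = old_csr; /* x0 stays hard-wired to zero */
}
```

If the emulator wrote `rd` before reading `rs1`, the exchange would silently turn into a copy, so this ordering seems worth ruling out first.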
For reference, this is my setup. My timer ordering is:
- If `timer > timer_match` AND `timer_match != 0`, set `mip.mtip`; otherwise clear it.
- If `mie.mtie == 1` AND `mip.mtip == 1` AND `mstatus.mie == 1`, then fire the interrupt.
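As a rough C sketch of the two timer steps above: the `hart_t` layout, the 64-bit timer width, and the function names are assumptions made for illustration, not the actual emulator code. The bit positions are the standard ones for `mip.MTIP`, `mie.MTIE`, and `mstatus.MIE`.

```c
#include <stdbool.h>
#include <stdint.h>

#define MIP_MTIP    (1u << 7) /* machine timer interrupt pending */
#define MIE_MTIE    (1u << 7) /* machine timer interrupt enable */
#define MSTATUS_MIE (1u << 3) /* global machine interrupt enable */

/* Hypothetical hart state holding only what the timer logic needs. */
typedef struct {
    uint32_t mstatus, mie, mip;
    uint64_t timer, timer_match; /* 64-bit width is an assumption */
} hart_t;

/* Step 1: recompute mip.MTIP, mirroring the `>` comparison in the list. */
static void update_timer_pending(hart_t *h)
{
    if (h->timer_match != 0 && h->timer > h->timer_match)
        h->mip |= MIP_MTIP;
    else
        h->mip &= ~MIP_MTIP;
}

/* Step 2: true only when enable, pending, and global enable are all set. */
static bool timer_interrupt_ready(const hart_t *h)
{
    return (h->mie & MIE_MTIE) != 0
        && (h->mip & MIP_MTIP) != 0
        && (h->mstatus & MSTATUS_MIE) != 0;
}
```

Calling `update_timer_pending` before `timer_interrupt_ready` on every step keeps `mip.mtip` level-sensitive, so it drops as soon as the kernel reprograms `timer_match`.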
When firing timer interrupt (atomically):
- Set `mstatus.mpie = mstatus.mie`
- Set `mstatus.mie = 0`
- Set `mepc = pc`
- Set `mtval = 0`
- Set `mcause = 0x80000007`
- Set next PC = `mtvec` (Linux uses all-in-one interrupt handler)
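The interrupt-entry steps above could be sketched in C like this. The `hart_t` struct and `take_timer_interrupt` name are hypothetical, and the sketch assumes `mtvec` is in direct mode, so its low two MODE bits are masked off before jumping. It models only the fields listed above; `mstatus.MPP` and privilege changes are left out because the list does not mention them.

```c
#include <stdint.h>

#define MSTATUS_MIE  (1u << 3)
#define MSTATUS_MPIE (1u << 7)

/* Hypothetical hart state with just the registers the steps touch. */
typedef struct {
    uint32_t pc, mstatus, mepc, mtval, mcause, mtvec;
} hart_t;

static void take_timer_interrupt(hart_t *h)
{
    /* mstatus.MPIE = mstatus.MIE, then mstatus.MIE = 0 */
    if (h->mstatus & MSTATUS_MIE)
        h->mstatus |= MSTATUS_MPIE;
    else
        h->mstatus &= ~MSTATUS_MPIE;
    h->mstatus &= ~MSTATUS_MIE;

    h->mepc = h->pc;         /* address to resume at after mret */
    h->mtval = 0;
    h->mcause = 0x80000007u; /* interrupt bit | 7 (machine timer) */
    h->pc = h->mtvec & ~3u;  /* assumes direct mode (MODE bits ignored) */
}
```

Here `h->pc` is taken to be the address of the next instruction that has not yet executed, which is what `mepc` needs to hold for an interrupt.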
When handling traps, I (atomically):
- Set `mepc = pc` << Note: It seems that this is correct; the spec says so and the kernel advances `mepc`.
- Set `mtval = pc`
- Set `mstatus.mpie = mstatus.mie`
- Set `mstatus.mie = 0`
- Set next PC = `mtvec`
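A comparable C sketch for the synchronous trap path above, again with hypothetical names (`hart_t`, `take_trap`). The `cause` parameter is an addition that is not in the list: the privileged spec has the hart write `mcause` on every trap, so the sketch takes it as an argument. Setting `mtval = pc` simply mirrors the list; `mstatus.MPP` is not modeled.

```c
#include <stdint.h>

#define MSTATUS_MIE  (1u << 3)
#define MSTATUS_MPIE (1u << 7)

/* Hypothetical hart state with just the registers the steps touch. */
typedef struct {
    uint32_t pc, mstatus, mepc, mtval, mcause, mtvec;
} hart_t;

/* `cause` is an assumption of this sketch, e.g. 11 for ecall from M-mode. */
static void take_trap(hart_t *h, uint32_t cause)
{
    h->mepc = h->pc;   /* the trapping instruction itself, not pc + 4 */
    h->mtval = h->pc;  /* mirrors the list above */
    h->mcause = cause; /* interrupt bit clear for synchronous traps */

    /* mstatus.MPIE = mstatus.MIE, then mstatus.MIE = 0 */
    if (h->mstatus & MSTATUS_MIE)
        h->mstatus |= MSTATUS_MPIE;
    else
        h->mstatus &= ~MSTATUS_MPIE;
    h->mstatus &= ~MSTATUS_MIE;

    h->pc = h->mtvec & ~3u; /* assumes direct mode */
}
```

Because `mepc` points at the trapping instruction, a handler that wants to skip it (as with `ecall`) has to advance `mepc` itself before `mret`.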
When calling `mret` I (atomically):
- Set `mstatus.mie = mstatus.mpie`
- Set `mstatus.mpie = 1`
- Set next PC = `mepc`
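The `mret` steps above, as a minimal C sketch with hypothetical names (`hart_t`, `do_mret`). It covers only the three listed updates; the privileged spec also has `mret` restore the privilege mode from `mstatus.MPP`, which this sketch does not model.

```c
#include <stdint.h>

#define MSTATUS_MIE  (1u << 3)
#define MSTATUS_MPIE (1u << 7)

/* Hypothetical hart state with just the registers mret touches here. */
typedef struct {
    uint32_t pc, mstatus, mepc;
} hart_t;

static void do_mret(hart_t *h)
{
    /* mstatus.MIE = mstatus.MPIE */
    if (h->mstatus & MSTATUS_MPIE)
        h->mstatus |= MSTATUS_MIE;
    else
        h->mstatus &= ~MSTATUS_MIE;

    h->mstatus |= MSTATUS_MPIE; /* mstatus.MPIE = 1 */
    h->pc = h->mepc;            /* resume at the saved address */
}
```

The order matters: `MPIE` has to be copied into `MIE` before it is forced to 1, otherwise interrupts would always come back enabled.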
Any ideas what's going on?