I'm writing an emulator and I'm having a bear of a time figuring out exactly what the CPU should do. I have a system that works almost all the time: I can run the NOMMU RV32 Linux kernel, but only if I very carefully pick my CLINT:CPU timer ratio and get really lucky.

I am mimicking what this Stack Overflow answer said as best as I can: RISC-V Interrupt Handling Flow

And, honestly, it seems to work out well. I can run user-space programs and have my timer run at 250 Hz, except that sometimes the kernel crashes, almost always in __stack_chk_fail. After digging, I still have no idea how the kernel's interrupt handling doesn't crash everywhere, since it looks like the following happens on an interrupt:

  • Exchange register tp with CSR_SCRATCH; if the result is zero (which happens the first time), set them equal.
  • Read sp s0 etc... from tp's descriptor.
  • Operate on the interrupt.

BUT, tp appears to be the same as that of the currently running (interrupted) task. So it looks like the interrupted task's changes to sp, and more specifically to the stack protector, aren't seen by the interrupt handler, which can then overwrite memory that the interrupted task was operating in. In fact, I've stepped through and observed exactly this.

But my kernel seems to boot in QEMU, so I think the issue is with my code.


For reference, this is my setup. My timer ordering is:

  • If timer > timer_match AND timer_match != 0, set mip.mtip; otherwise clear it.
  • If mie.mtie == 1 AND mip.mtip == 1 AND mstatus.mie == 1 then fire interrupt.
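A minimal C sketch of the two timer steps above, for concreteness. The struct `hart_timer`, its field names, and the function names are hypothetical and not taken from the emulator's source; the bit positions (MTIP and MTIE at bit 7, mstatus.MIE at bit 3) are the standard ones.

```c
#include <stdbool.h>
#include <stdint.h>

#define MIP_MTIP    (1u << 7)  /* machine timer interrupt pending */
#define MIE_MTIE    (1u << 7)  /* machine timer interrupt enable  */
#define MSTATUS_MIE (1u << 3)  /* global machine interrupt enable */

/* Hypothetical slice of emulator state; names are illustrative only. */
struct hart_timer {
    uint32_t mstatus, mie, mip;
    uint64_t timer, timer_match;
};

/* Step 1: recompute mip.MTIP from the timer comparison described above. */
static void update_timer_pending(struct hart_timer *h)
{
    if (h->timer_match != 0 && h->timer > h->timer_match)
        h->mip |= MIP_MTIP;
    else
        h->mip &= ~MIP_MTIP;
}

/* Step 2: fire only if the interrupt is pending, enabled, and globally enabled. */
static bool timer_interrupt_should_fire(const struct hart_timer *h)
{
    return (h->mip & MIP_MTIP) &&
           (h->mie & MIE_MTIE) &&
           (h->mstatus & MSTATUS_MIE);
}
```

Keeping the "pending" update separate from the "should fire" check means mip.mtip stays readable by the guest even while mstatus.mie is 0.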

When firing timer interrupt (atomically):

  • Set mstatus.mpie = mstatus.mie
  • Set mstatus.mie = 0
  • Set mepc = pc
  • Set mtval = 0
  • Set mcause = 0x80000007
  • Set next PC = mtvec (Linux uses all-in-one interrupt handler)

When handling traps, I (atomically):

  • Set mepc = pc << Note: this seems to be correct; the spec says so, and the kernel advances mepc itself.
  • Set mtval = pc
  • Set mstatus.mpie = mstatus.mie
  • Set mstatus.mie = 0
  • Set next PC = mtvec

When calling mret I (atomically):

  • Set mstatus.mie = mstatus.mpie
  • Set mstatus.mpie = 1
  • Set next PC = mepc
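The entry and mret steps listed above can be sketched in C as follows. This is a literal transcription of those lists and nothing more; `struct hart_trap`, `take_trap`, and `do_mret` are made-up names, and mtvec is assumed to be in direct mode (a single handler address).

```c
#include <stdint.h>

#define MSTATUS_MIE  (1u << 3)  /* machine interrupt enable          */
#define MSTATUS_MPIE (1u << 7)  /* previous machine interrupt enable */

/* Hypothetical slice of emulator state; names are illustrative only. */
struct hart_trap {
    uint32_t pc, mstatus, mepc, mtval, mcause, mtvec;
};

/* Trap or interrupt entry: save the interrupt-enable bit, then jump to mtvec. */
static void take_trap(struct hart_trap *h, uint32_t cause, uint32_t tval)
{
    if (h->mstatus & MSTATUS_MIE)      /* mpie = mie */
        h->mstatus |= MSTATUS_MPIE;
    else
        h->mstatus &= ~MSTATUS_MPIE;
    h->mstatus &= ~MSTATUS_MIE;        /* mie = 0 */
    h->mepc = h->pc;
    h->mtval = tval;
    h->mcause = cause;
    h->pc = h->mtvec;                  /* assumes direct mode */
}

/* mret: restore the interrupt-enable bit and return to mepc. */
static void do_mret(struct hart_trap *h)
{
    if (h->mstatus & MSTATUS_MPIE)     /* mie = mpie */
        h->mstatus |= MSTATUS_MIE;
    else
        h->mstatus &= ~MSTATUS_MIE;
    h->mstatus |= MSTATUS_MPIE;        /* mpie = 1 */
    h->pc = h->mepc;
}
```

For the timer case this would be called as `take_trap(&h, 0x80000007u, 0)`.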

Any ideas what's going on?

Charles Lohr

1 Answer

This probably isn't 100% correct, but it seems to make the system stable.

From PiMaker's rvc, it turns out that for the kernel to run properly it is crucial to set the mstatus.mpp bits; otherwise it will overwrite memory.

The correct solution seems to be here: https://github.com/PiMaker/rvc/blob/master/src/trap.h#L48

But, for testing, I was able to get my project working by the following means. In addition to the operations described above...

If timer interrupt:

  • store last privilege into mstatus.mpp
  • Set new privilege to 3.

If a trap like ebreak:

  • store last privilege into mstatus.mpp
  • Set new privilege to 3.

If mret:

  • store last privilege into mstatus.mpp
  • Set new privilege to old mstatus.mpp

I don't use the privilege bits for anything else. (Note: this is probably incorrect, because the selected CSRs should change depending on privilege level, but this seems to work, so ¯\_(ツ)_/¯)
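The privilege bookkeeping in the three cases above could look roughly like this in C. It is a sketch of the workaround exactly as listed, not of rvc's implementation; `struct hart_priv` and the function names are hypothetical. mstatus.MPP is the two-bit field at bits 12:11. Note that the privileged spec has mret set MPP to the least-privileged supported mode rather than to the outgoing privilege, so the mret case here follows this answer, not the spec.

```c
#include <stdint.h>

#define MSTATUS_MPP_SHIFT 11
#define MSTATUS_MPP_MASK  (3u << MSTATUS_MPP_SHIFT)
#define PRIV_MACHINE      3u

/* Hypothetical slice of emulator state; names are illustrative only. */
struct hart_priv {
    uint32_t mstatus;
    uint32_t priv;   /* current privilege level: 0 = U, 3 = M */
};

static uint32_t get_mpp(const struct hart_priv *h)
{
    return (h->mstatus & MSTATUS_MPP_MASK) >> MSTATUS_MPP_SHIFT;
}

static void set_mpp(struct hart_priv *h, uint32_t p)
{
    h->mstatus = (h->mstatus & ~MSTATUS_MPP_MASK) |
                 ((p & 3u) << MSTATUS_MPP_SHIFT);
}

/* Timer interrupt or trap (e.g. ebreak): remember where we came from, enter M. */
static void priv_on_trap(struct hart_priv *h)
{
    set_mpp(h, h->priv);
    h->priv = PRIV_MACHINE;
}

/* mret as listed above: swap the current privilege with the saved MPP. */
static void priv_on_mret(struct hart_priv *h)
{
    uint32_t old_mpp = get_mpp(h);
    set_mpp(h, h->priv);
    h->priv = old_mpp;
}
```

These would be called alongside the existing mie/mpie and mepc updates, in the same atomic step.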

Charles Lohr