How does a CPU handle asynchronous interrupts?

Question

CPUs can split one instruction into several micro-ops; this applies to x86 and ARM cores and perhaps several other architectures. Micro-ops can execute out of order and are tracked in a reorder buffer (ROB), from which they retire in order.

I can think of two possible implementations:

1. The CPU continues executing all micro-ops that are already in the ROB, temporarily ignoring the incoming interrupt.
2. The CPU flushes its pipeline. But if the pipeline is flushed, could we face a situation where some micro-ops of an instruction have retired while other micro-ops of the same instruction are flushed from the ROB? And what resources will remain in the pipeline after the interrupt is raised?

Understand that ARM is not microcoded like x86... ARM will use terms like microarchitecture, but that can apply to pretty much every processor, as they all have state machines to manage various things like fetching and loads and stores. Yes, I know you did not say microcode but microarch; just clarifying that they are more different than similar in their implementation (x86 vs ARM). — old_timer, Jan 06 '22 at 18:39
@old_timer ARM CPUs can split an instruction into "internal micro-ops" (as stated in [2.1 Pipeline Overview of ARM Optimisation Guides](https://developer.arm.com/documentation/uan0016/a/)). Yes, it's not as fancy as x86 and it doesn't have a microcode sequencer, but an ARM CPU can have hardwired decoders. AFAIK, loads and stores of multiple registers are split into several loads/stores; maybe this also applies to some NEON instructions. — nishima, Jan 06 '22 at 19:08
I would not call x86 fancy; instead I would use archaic or ancient. Loads and stores are part of all CPUs and you just use a state machine: nothing fancy needed, nothing from the last few decades needed to implement them. Granted, they might use something newer, but nothing special is needed to implement the instructions, even load/store multiple. I would still be very careful in trying to lump the two into the same category, that is all. — old_timer, Jan 06 '22 at 22:32
particularly with respect to how the implementation of the execution of an instruction is done and as a side effect, how one interrupts and restarts... — old_timer, Jan 06 '22 at 22:38
In the end, though, I think Peter covered it in his answer well enough. If more detail is needed, there are many different x86 implementations and the answer would be different for each; the same goes for ARM, MIPS and others. So a too-broad question, resulting in a pretty good, somewhat broad answer. — old_timer, Jan 06 '22 at 22:41

Peter Cordes · Accepted Answer · 2022-01-06T17:17:11.160

Interrupts are definitely always taken on instruction boundaries, even if that means discarding partial progress and restarting execution after interrupt return, at least on x86 and ARM microarchitectures. (Some instructions are interruptible: `rep movsb`, for example, has a way to update its registers to record partial progress. AVX2 gathers are also interruptible, or at least could be; the mask-updating rules might only ever get applied for synchronous exceptions encountered on one element.)
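
As a rough illustration of what "interruptible" means for `rep movsb`, here is a toy Python model of its architectural effect only. The `Regs` class, the `interrupt_after` parameter and the memory layout are hypothetical names invented for this sketch, and it assumes DF=0 (addresses increment). Real hardware copies in larger chunks via microcode, so this shows the visible register state rather than how any CPU implements it.

```python
from dataclasses import dataclass

REP_MOVSB_LEN = 2  # `rep movsb` encodes as F3 A4


@dataclass
class Regs:
    rsi: int  # source address
    rdi: int  # destination address
    rcx: int  # bytes still to copy
    rip: int  # address of the instruction being executed


def rep_movsb(regs, mem, interrupt_after=None):
    """Copy RCX bytes from [RSI] to [RDI]; return True when complete.

    interrupt_after is a hypothetical knob for this sketch: the number of
    bytes copied before an interrupt is taken. On interruption RIP is left
    pointing at the instruction, so re-executing it resumes the copy.
    """
    copied = 0
    while regs.rcx > 0:
        if interrupt_after is not None and copied == interrupt_after:
            return False  # registers already describe the remaining work
        mem[regs.rdi] = mem[regs.rsi]
        regs.rsi += 1
        regs.rdi += 1
        regs.rcx -= 1
        copied += 1
    regs.rip += REP_MOVSB_LEN  # only now does execution move past it
    return True


mem = bytearray(b"hello world" + bytes(11))
regs = Regs(rsi=0, rdi=11, rcx=11, rip=0x1000)
first = rep_movsb(regs, mem, interrupt_after=4)  # interrupted part-way
second = rep_movsb(regs, mem)                    # after interrupt return
```

Because the progress lives in RSI, RDI and RCX, every point between iterations is a consistent architectural state, which is why such an instruction can be interrupted without being restarted from scratch.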

There's some evidence that Intel CPUs let one more instruction retire before taking an interrupt, at least for profiling interrupts (from the PMU); those are semi-synchronous but for some events don't have a fixed spot in the program where they must be taken, unlike page faults which have to fault on the faulting instruction.

A multi-uop instruction that's already partially retired would have to be allowed to finish executing and retire the whole instruction, to reach the next consistent architectural state where an interrupt could possibly be taken.

(Another possible reason for letting an instruction finish executing before taking an interrupt is to avoid starvation.)

Otherwise, yes: the ROB and RS (the scheduler's reservation stations) are discarded and execution is rolled back to the retirement state. Keeping interrupt latency low is generally desirable, and a large ROB could hold many cache-miss and TLB-miss loads. Waiting for all of them to complete before taking the interrupt would make the worst-case interrupt latency really bad, so a malicious process could hurt the capabilities of a real-time OS.
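
The policy described above (let a partially retired instruction finish, then discard everything younger) can be sketched as a toy model. This is only an illustration under simplifying assumptions: the `Uop` class, the `take_interrupt` function and the integer instruction ids are hypothetical, and the model ignores execution timing, the RS, and the possibility that some CPUs let one more instruction retire.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Uop:
    insn: int  # hypothetical id of the instruction this uop belongs to


def take_interrupt(rob, partially_retired=None):
    """Model interrupt delivery against a ROB of not-yet-retired uops.

    rob holds Uops in program order, oldest first. partially_retired is the
    id of an instruction whose earlier uops have already retired, or None if
    retirement currently sits on an instruction boundary.
    Returns (finished, flushed, resume_at).
    """
    finished = []
    # A partially retired instruction must complete so that the
    # architectural state lands on an instruction boundary.
    while partially_retired is not None and rob and rob[0].insn == partially_retired:
        finished.append(rob.popleft())
    # Everything younger is discarded; none of it has changed
    # architectural state yet.
    flushed = list(rob)
    rob.clear()
    # The interrupt handler returns to the oldest discarded instruction.
    resume_at = flushed[0].insn if flushed else None
    return finished, flushed, resume_at


rob = deque(Uop(i) for i in (7, 7, 8, 9, 9))
finished, flushed, resume_at = take_interrupt(rob, partially_retired=7)
```

In this run the two remaining uops of instruction 7 retire, the uops of instructions 8 and 9 are thrown away, and execution would resume at instruction 8 after the handler returns.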

When an interrupt occurs, what happens to instructions in the pipeline?
Estimating of interrupt latency on the x86 CPUs
(maybe) Reliability of Xcode Instrument's disassembly time profiling mentions performance event sampling.

Many ARM processors have options to let an instruction complete **OR** restart. One is compute friendly, the other latency friendly. There is no **answer**. If a memory access restarts and the access is to a non-memory device, all sorts of issues can happen. For instance, a hardware device FIFO count read which resets the count to zero. This cannot be answered for **ARM** in general. It does not use PIO and it can be difficult to tell a memory access from a peripheral access, unless an MMU is active to denote a memory type. — artless noise, Jan 06 '22 at 19:06
`rep movsb` is like `ldm rx, {rx, ry, rz, ....}`, which specifically has an option to restart or not, set either at boot or via a configuration of the CPU done at implementation (synthesizing the actual gate layout). Each has a read-only register to describe which mode is active. — artless noise, Jan 06 '22 at 19:11
An MMIO access cannot be allowed to proceed until it is senior (that is, all preceding instructions have retired). Once the MMIO access has begun, the access and the instruction that initiated it must retire before the interrupt can be recognized, to avoid exactly the issue @artless described. Any processor that has out-of-order execution and MMIO must necessarily behave this way. If it cannot distinguish MMIO from memory, then it cannot perform any accesses out of order. — prl, Jan 06 '22 at 21:50
In x86, "MMIO" is identified by memory type UC, rather than by whether it is actually routed to MMIO. — prl, Jan 06 '22 at 21:56
