I know that modern CPUs do out-of-order (OoO) execution and have advanced branch predictors that can mispredict. How does the debugger deal with that? If the CPU mispredicts a branch, how does the debugger know? I don't know whether debuggers execute instructions in a simulated environment or something like that.
1 Answer
Debuggers don't have to deal with it; those effects aren't architecturally visible, so everything (including debug breakpoints triggering) happens as if instructions had executed one at a time, in program order. Anything else would break single-threaded code; CPUs don't just arbitrarily shuffle your program!
CPUs support precise exceptions, so they can always recover the correct, consistent architectural state whenever they hit a breakpoint or an unexpected fault.
See also Modern Microprocessors: A 90-Minute Guide!
If you want to know how often the CPU mispredicted branches, you need to use its own hardware performance counters, which can see and record the internals of execution. (Software programs these counters, and can later read back the count, or have them record an event or fire an interrupt when a counter overflows.) For example, Linux `perf stat` counts `branches` and `branch-misses` by default.
(On Skylake, for example, that generic event probably maps to `br_misp_retired.all_branches`, which counts how many branch instructions that eventually retired had been mispredicted at some point. So it doesn't count cases where the CPU detected mis-prediction of a branch that was itself only reached in the shadow of some other mis-speculation (of a branch, or of a fault), because such a branch never makes it to retirement. Events like `int_misc.clear_resteer_cycles` or `int_misc.recovery_cycles` can count front-end cycles lost to such things.)
For more about OoO exec, see:

- Out-of-order execution vs. speculative execution (including in the context of the Meltdown vulnerability, which suddenly made a lot more people care about the details of OoO exec). A modern OoO exec CPU treats everything as speculative until it reaches retirement (which happens in program order to support precise exceptions).
- Difference between In-order and Out-of-order execution in ARM architecture
- Why memory reordering is not a problem on single core/processor machines? OoO exec preserves the illusion (for the local core) of instructions running in program order.

- I read in the book *Coders at Work* that Jamie Zawinski once faced a problem in GDB due to branch prediction, because it was on a machine with speculative execution and GDB only supported branch-always-taken. I wanted to know how modern debuggers fixed that. You may assume that I know computer architecture well (OoO exec, multi-core, speculative execution, etc.) – Ahmed Ehab Aug 17 '22 at 04:53
- @AhmedEhab: That doesn't make a lot of sense to me. Possibly there was some ancient machine without precise exceptions, where debug traps could happen spuriously if you had a breakpoint that wasn't along the true path of execution? If you know computer architecture, it should be obvious that precise exceptions make it a non-issue that debuggers don't have to worry about. (Neither do OS implementations of the APIs debuggers use, like `ptrace`.) Just like page faults, debug traps only happen when they would have if the machine executed 1 instruction at a time in program order. – Peter Cordes Aug 17 '22 at 05:12
- See also [How does a breakpoint in debugger work?](https://stackoverflow.com/q/14598524) / [How does a debugger work?](https://stackoverflow.com/q/216819) re: how debuggers place breakpoints, e.g. by rewriting (the first byte of) an instruction to a trap like x86 `int3` (0xcc). – Peter Cordes Aug 17 '22 at 05:13
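As an illustration of that trap mechanism (not how GDB itself is implemented; a real debugger is notified via `ptrace`/`waitpid` in a separate tracer process), here is a minimal in-process sketch for x86-64 Linux: executing the `int3` opcode delivers `SIGTRAP` only when the instruction actually, architecturally, executes.

```c
// Minimal sketch (assumes x86-64 Linux, GCC/Clang inline asm): int3 (opcode 0xcc)
// raises SIGTRAP, the same trap a debugger arms by overwriting the first byte
// of an instruction. The trap only fires once the int3 instruction retires.
#include <signal.h>
#include <unistd.h>

static void on_trap(int sig) {
    (void)sig;
    // write() is async-signal-safe, unlike printf.
    const char msg[] = "caught SIGTRAP from int3\n";
    write(STDOUT_FILENO, msg, sizeof msg - 1);
}

int main(void) {
    signal(SIGTRAP, on_trap);
    __asm__ volatile("int3");             // trap taken here, handler runs
    write(STDOUT_FILENO, "resumed\n", 8); // execution resumes after the trap
    return 0;
}
```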
- @AhmedEhab: Perhaps if you ask on retrocomputing about what kind of system that could have made sense on, someone might be able to identify what kind of CPU existed where GDB would have to understand its branch prediction. That makes no sense to me, since a debugger can only observe the architectural state, not mis-speculations. (Unless the machine has some very significant differences from normal machines.) – Peter Cordes Aug 17 '22 at 05:18
- Okay, now that explains it. So it enforces sequential execution semantics on processors. I may now see why it failed before: maybe early processors with speculative execution didn't impose this sequential behavior. – Ahmed Ehab Aug 17 '22 at 05:21
- @AhmedEhab: All CPUs with speculative exec still have to produce the correct non-speculative (architectural) state. Like my answer initially said, the cardinal rule of out-of-order / speculative exec is not to break single-threaded code. The whole point of out-of-order exec is to avoid *actually* requiring sequential execution while still maintaining the *illusion* of it. Very much like how a C++ compiler makes code that does what your program would do, but can get there in a different way. For a CPU running machine code, taking a debug breakpoint is an observable side effect it preserves. – Peter Cordes Aug 17 '22 at 05:30
- I don't think you got what I meant. CPUs with speculative execution may take the wrong branch, and after they discover that, they need to flush the instruction path they took. If that happens while debugging, the debugger would take the same wrong path as well before recovering from it. What I meant is different from the reordering of instructions that compilers do statically. I think what answered my question is that you said a debugger breakpoint enforces sequential processing. What I mentioned is on page 8 of *Coders at Work*. – Ahmed Ehab Aug 17 '22 at 05:41
- @AhmedEhab: Any side effects can't be made visible until the CPU knows they're on the correct path of execution. That includes storing to memory (handled by the store buffer) and traps to exception handlers (handled by in-order retirement for precise exceptions). Just like the kernel's page-fault handler can't run with an address resulting from mis-speculation, its debug-exception handler can't run with the wrong program counter or other architectural state due to mis-speculation. A CPU won't act on an `int3` instruction unless/until it reaches retirement, i.e. becomes non-speculative. – Peter Cordes Aug 17 '22 at 05:47
- @AhmedEhab: So no, a *debugger* can't see mis-speculation, whether you're single-stepping (e.g. with x86's TF) or running to a debug-breakpoint instruction. The debugger doesn't "take a path of execution"; it just regains control of the CPU upon an exception (or of the process, under a multi-tasking OS). **So it doesn't make sense to say that *a debugger* took a mis-speculated path of execution.** Unless you actually mean a *simulator* for a pipeline, one that tries to be a cycle-accurate model of some OoO exec CPU, not just emulate the ISA. – Peter Cordes Aug 17 '22 at 05:51
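To illustrate that last point, here is a hedged sketch (Linux x86-64, using the real `ptrace` API with `PTRACE_SINGLESTEP`, much like x86's TF) of how a debugger only regains control at architectural instruction boundaries and only ever sees retired, program-order register state; mis-speculated work is never visible through this interface. The traced loop and step count are just illustrative.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t child = fork();
    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);   // let the parent trace us
        raise(SIGSTOP);                          // hand control to the tracer
        volatile int x = 0;
        for (int i = 0; i < 3; i++)              // a few branchy instructions
            x += i;
        _exit(0);
    }

    int status;
    waitpid(child, &status, 0);                  // child is stopped at SIGSTOP
    for (int step = 0; step < 10 && !WIFEXITED(status); step++) {
        struct user_regs_struct regs;
        ptrace(PTRACE_GETREGS, child, NULL, &regs);
        // Only architectural (retired) state is visible: rip advances in
        // program order, never down a mis-speculated path.
        printf("step %d: rip = %#llx\n", step, (unsigned long long)regs.rip);
        ptrace(PTRACE_SINGLESTEP, child, NULL, NULL);  // like setting x86 TF
        waitpid(child, &status, 0);
    }
    if (!WIFEXITED(status)) {
        ptrace(PTRACE_CONT, child, NULL, NULL);  // let the child finish
        waitpid(child, &status, 0);
    }
    return 0;
}
```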