0

When a processor executes a single instruction, this can be assumed to be an atomic operation. But how does that work when the processor uses pipelining? The instruction is executed in a number of steps, in parallel with many other instructions, all at different steps. But what if one of those other instructions interferes with ours? How can the processor "roll back" the effects of the instruction, or avoid interference altogether?

Koen Van Damme
  • 400
  • 2
  • 12
  • Atomicity normally refers to what's visible to other threads, as in [Can num++ be atomic for 'int num'?](https://stackoverflow.com/q/39393850) . A single core can track what it's doing to maintain the illusion of running one instruction at a time, in program order, including loads and stores (e.g. via a store buffer: [Can a speculatively executed CPU branch contain opcodes that access RAM?](https://stackoverflow.com/q/64141366)) – Peter Cordes May 13 '23 at 19:04

2 Answers

3

There are many strategies employed by various processors, I'm sure. I once worked on a project where I added pipelining to a simulated processor. The techniques I employed were:

  1. Bubbling. For certain operations that have a chance of interfering with later instructions, you know how far back the interference can occur. For example, if a conditional jump's evaluation does not complete until the following instruction has already passed through one stage of the pipeline, I place what is effectively a NOP into the pipeline directly behind the conditional jump.

  2. Forwarding. Under certain conditions I could see that an instruction needed a value produced by an instruction one or two stages ahead of it, before that value had been written back into the register it would normally be read from. In that case, the value is taken directly from the later pipeline stage.

  3. Branch prediction and correction. The prediction part isn't so much about avoiding collisions, but it is important to note. In the case of conditional jumps, you want to predict the outcome and load the next instruction into the pipeline as early as possible. I always assumed the condition would evaluate such that the jump is NOT taken, because then I could immediately load the next instruction into the pipeline without first evaluating the jump address, thereby avoiding the need for a bubble.
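The forwarding idea above can be sketched in a few lines. This is a toy model, not any real ISA: a made-up two-instruction set (`li`, `add`) and a one-deep result latch standing in for the pipeline stage that forwarding reads from.

```python
# Toy sketch of forwarding (hypothetical ISA, names invented for
# illustration). The result of the previous instruction sits in a
# latch ("pending") before being written back to the register file;
# a dependent instruction reads it straight from the latch.

def run(program):
    regs = {}                          # architectural register file
    pending = None                     # (dest, value) in the result latch

    for op in program:
        def read(r):
            # Forwarding: if the previous instruction's result has not
            # been written back yet, take it directly from the latch.
            if pending is not None and pending[0] == r:
                return pending[1]
            return regs.get(r, 0)

        if op[0] == "li":              # load immediate
            _, dest, imm = op
            value = imm
        elif op[0] == "add":
            _, dest, s1, s2 = op
            value = read(s1) + read(s2)

        if pending is not None:        # previous instruction writes back now
            regs[pending[0]] = pending[1]
        pending = (dest, value)

    if pending is not None:            # drain the final writeback
        regs[pending[0]] = pending[1]
    return regs

prog = [("li", "r1", 5),
        ("add", "r2", "r1", "r1"),     # needs r1 before its writeback
        ("add", "r3", "r2", "r1")]     # needs r2 before its writeback
```

With forwarding, `run(prog)` produces `r2 = 10` and `r3 = 15`; if `read` consulted only the register file, the second instruction would see a stale `r1` of 0.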

When the prediction comes true, yay, I am happy. If the prediction does not come true, then I need to negate the effect of the next instruction that we optimistically started early. I did this by switching a signal to the NAND gates within the previous couple of pipeline stages to effectively NOP out the instruction currently executing there.
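The squashing step can be sketched as follows. This is a hypothetical model: the instructions fetched behind a branch are simply overwritten with NOPs when the "not taken" prediction turns out to be wrong, which is the software analogue of the gate-level signal described above.

```python
# Sketch of squashing wrongly fetched instructions after a mispredicted
# branch (illustrative only). Slots fetched behind the branch under a
# "not taken" prediction are turned into NOPs if the branch was taken.

NOP = ("nop",)

def squash_after_branch(pipeline_slots, taken):
    """pipeline_slots: instructions fetched after the branch, oldest first.
    If the branch was actually taken, the prediction was wrong and every
    in-flight instruction behind it is replaced with a NOP."""
    if not taken:
        return pipeline_slots          # prediction was right: keep them
    return [NOP for _ in pipeline_slots]
```

A correctly predicted branch costs nothing; a misprediction wastes exactly as many cycles as there were slots to squash.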

This is what I remember from my only personal experience. I took a look at the Wikipedia page on instruction pipelining and see some of the same ideas present, with far better explanations, I'm sure :) http://en.wikipedia.org/wiki/Instruction_pipeline

DannyMeister
  • 1,281
  • 1
  • 12
  • 21
  • So basically, there is no single strategy, but a large number of different techniques from which processor designers can choose. Thanks for the real-life examples. – Koen Van Damme May 01 '12 at 19:38
1

This is defined by the designer of the processor, and it can differ for each particular processor. For example, in the typical Intel/AMD x86/x64 processor family, a single instruction is not always atomic.
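One classic case is a memory read-modify-write such as x86 `inc` on a memory operand without a `lock` prefix: it is a load, an add, and a store, and another core can slip in between. The interleaving can be sketched deterministically (the "cores" here are just labels; this models the hardware behaviour, it is not real concurrent code):

```python
# Sketch of why a read-modify-write instruction is not atomic between
# cores: two simulated "cores" each perform load / add / store on the
# same location, and this interleaving loses one of the increments.

def interleave_lost_update():
    mem = {"counter": 0}
    a = mem["counter"]         # core A: load  (reads 0)
    b = mem["counter"]         # core B: load  (also reads 0)
    mem["counter"] = a + 1     # core A: store 1
    mem["counter"] = b + 1     # core B: store 1, overwriting A's update
    return mem["counter"]      # 1, even though two increments ran
```

On x86 the `lock` prefix makes the whole read-modify-write atomic, which is why `lock inc` exists; within a single core, the processor itself preserves the illusion of one instruction at a time, as the comment under the question notes.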

You must always say what processor type you are talking about. And if it is a platform other than x86/x64, you can probably get a better answer at the Electronics forum, not here.

Al Kepp
  • 5,831
  • 2
  • 28
  • 48
  • 1
    Might be the difference between atomic with respect to the relevant core, or the whole processor. And then there is cache and the memory subsystem... – Marco van de Voort Apr 27 '12 at 12:20