What difference does it make to use a NOP instead of a stall? Both seem to do the same job in a pipeline. I can't understand the difference.
Will the brainiac that answers this question take a stab at a second question from me: (generically) Does NOP imply context is irrelevant. So would a NOP cause a context change? Likewise Does stall imply context is relevant to me? – Sql Surfer Oct 12 '15 at 03:10
No. Context is OS level. We're talking micro-architecture. – Konrad Lindenbach Nov 04 '15 at 05:10
The accepted answer is fantastic, but one thing I want to add/clarify is that because each `nop` instruction takes up all 5 stages, adding 1 `nop` instruction between two dependent instructions will essentially delay the second instruction by one cycle. So if you have the example above using `add` and `sub`, you would put 2 `nop` instructions in between to stall for two stages. – nschmeller Mar 12 '19 at 21:55
2 Answers
I think you've got your terminology confused.
A stall is injected into the pipeline by the processor to resolve data hazards (situations where the data required to process an instruction is not yet available). A NOP is just an instruction with no side effects.
Stalls
Recall the classic five-stage RISC pipeline:
- IF - Instruction Fetch (Fetch the next instruction from memory)
- ID - Instruction Decode (Figure out which instruction this is and what the operands are)
- EX - Execute (Perform the action)
- MEM - Memory Access (Store or read from memory)
- WB - Write back (Write a result back to a register)
Consider the code snippet:
add $t0, $t1, $t1
sub $t2, $t0, $t0
From here it is obvious that the second instruction relies on the result of the first. This is a data hazard: Read After Write (RAW); a true dependency.
The `sub` requires the value of the `add` during its EX phase, but at that point the `add` will only be in its MEM phase; the value will not be available until the WB phase:
+---+-------------------+-----------------------------------------+
|   |                   |               CPU Cycles                |
+---+-------------------+----+----+----+-----+----+---+---+---+---+
| # | Instruction       | 1  | 2  | 3  | 4   | 5  | 6 | 7 | 8 | 9 |
+---+-------------------+----+----+----+-----+----+---+---+---+---+
| 0 | add $t0, $t1, $t1 | IF | ID | EX | MEM | WB |   |   |   |   |
| 1 | sub $t2, $t0, $t0 |    | IF | ID | EX  |    |   |   |   |   |
+---+-------------------+----+----+----+-----+----+---+---+---+---+
One solution to this problem is for the processor to insert stalls or bubble the pipeline until the data is available.
+---+-------------------+---------------------------------------------+
|   |                   |                 CPU Cycles                  |
+---+-------------------+----+----+----+-----+----+----+-----+----+---+
| # | Instruction       | 1  | 2  | 3  | 4   | 5  | 6  | 7   | 8  | 9 |
+---+-------------------+----+----+----+-----+----+----+-----+----+---+
| 0 | add $t0, $t1, $t1 | IF | ID | EX | MEM | WB |    |     |    |   |
| 1 | sub $t2, $t0, $t0 |    | IF | ID | S   | S  | EX | MEM | WB |   |
+---+-------------------+----+----+----+-----+----+----+-----+----+---+
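For comparison, here is a sketch of what the same two-cycle gap looks like when it is created in software rather than by the processor. It assumes a hypothetical MIPS-like core with no hardware interlocks and no forwarding, where the programmer or compiler has to space dependent instructions apart by hand:

```
add $t0, $t1, $t1
nop                  # occupies the slot the first stall cycle would have taken
nop                  # occupies the slot the second stall cycle would have taken
sub $t2, $t0, $t0    # now reaches EX in cycle 6, as in the table above
```

The timing is the same, but the two `nop`s are real instructions that are fetched and occupy program memory, whereas stalls are generated by the hardware and never appear in the code.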
NOPs
A NOP is an instruction that does nothing (has no side effects). MIPS assemblers often support a `nop` instruction, but in MIPS this is equivalent to `sll $zero, $zero, 0`.
This instruction will take up all 5 stages of pipeline. It is most commonly used to fill the branch delay slot of jumps or branches when there is nothing else useful that can be done in that slot.
j label
nop # nothing useful to put here
If you are using a MIPS simulator you may need to enable branch delay slot simulation to see this. (For example, in `spim` use the `-delayed_branches` argument.)
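As a sketch of the alternative, when an independent instruction is available it can be moved into the delay slot in place of the `nop`. The `addi` and the `label` target below are purely illustrative, and the example assumes the assembler is not reordering instructions or filling delay slots itself (with GNU `as` that is what `.set noreorder` requests):

```
.set noreorder           # assumption: GNU as; the assembler leaves delay slots to us
addi $t0, $t0, 1         # illustrative instruction that the jump does not depend on
j    label
nop                      # delay slot wasted

# can be rewritten as:
j    label
addi $t0, $t0, 1         # delay slot: still executes before control reaches label
```

Both versions leave `$t0` incremented when execution arrives at `label`; the second is simply one instruction shorter.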

But how does `NOP` differ from stalls in terms of effectiveness? In both cases every instruction after `inst 0` is going to be delayed by 2 CPU cycles, so why not just leave it to stalls and wait for the hazards to clear, rather than using `NOP`, which just inserts a gap in the pipeline doing nothing? I'm still confused. – np_complete Apr 18 '17 at 11:22
The example is the branch delay slot. The ISA defines that you have to put an instruction there, so if you have nothing that you would like to do then you should fill it with an instruction that does nothing. – Konrad Lindenbach Apr 18 '17 at 16:00
A NOP should not be used in place of a stall, or vice versa.
A stall is used when a dependency causes a hazard: the affected pipeline stage waits until the data it needs is available. If a NOP were used there instead, the instruction would simply pass through that stage without doing anything. By the time the required data became available, the stage would already be over, so the instruction would have to be restarted from the beginning. That raises the processor's average CPI and reduces performance. In some cases the data the instruction needs might also be modified by another instruction before the restart, which would result in faulty execution.
Conversely, a stall cannot stand in for a NOP. When a non-maskable interrupt (such as a divide by zero) occurs in the execution stage, the instruction must pass through the remaining stages without changing the state of the processor. A NOP is used here to carry it through the rest of the pipeline without, for example, writing a false value generated by the exception into a register or memory.
A stall cannot be used in this case, because the next instruction would wait for the stall to complete, and it never would: the interrupt is non-maskable (the user cannot control these events), so the pipeline would deadlock.