
Why does the first add need forwarding?

                    # stage:
add $1, $2, $3      # WB
add $4, $5, $6      # MEM
nop                 # EX
beq $1, $4, target  # ID

Since beq needs $1 and the first add is about to complete its WB stage, isn't forwarding unnecessary, given that beq is in the ID stage and is about to read the register file? My book says that the second and third instructions before the beq both need forwarding to avoid a data hazard.


Edit: I found exactly what I meant on this link, slide page 11. Another slide (page 58) resolves my other point of confusion: the first add needs no forwarding when another technique, special hardware, is used.

IRTFM
Kindred
  • Note that even with forwarding, a `nop` (or a stall in a later MIPS that detected the hazard) is needed after the 2nd add, because `beq` also needs `$4` *in the ID stage*. (MIPS branch instructions "execute" in the ID stage, which is why the branch-delay slot is only 1 instruction long, not 2.) – Peter Cordes Jan 10 '19 at 06:20
  • @PeterCordes: I realized this morning that I should not always assume that there is a concurrent register file in the ID stage. Your comments are always helpful to me, thank you so much. – Kindred Jan 10 '19 at 11:32

1 Answer


In a synchronous digital system, each clock cycle has two phases. During the first phase, operands are read and transformed by operators. During the second phase, the resulting data are written to registers. Depending on the implementation, these phases can correspond to the first and second half-periods, or to the complete period and the rising edge of the clock.

In either case, the important point is that the same register can be read (during the first phase) and modified (at the end) within one cycle. This is why it is possible to perform actions like

pc <= pc+4

in a single cycle.
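To make the two phases concrete, here is a minimal Python sketch of an edge-triggered register. The `Register` class and its `read`/`write`/`tick` methods are hypothetical names invented for this illustration; they model the behaviour described above and do not correspond to any real hardware description.

```python
class Register:
    """Edge-triggered register: reads return the old value until tick()."""

    def __init__(self, value=0):
        self.value = value   # value visible to readers during the cycle
        self._next = value   # value staged for the next clock edge

    def read(self):
        return self.value

    def write(self, value):
        self._next = value   # staged only; not visible to readers yet

    def tick(self):
        self.value = self._next  # the clock edge commits the staged value


pc = Register(0)
seen = []
for _ in range(3):
    seen.append(pc.read())    # first phase: read the current pc
    pc.write(pc.read() + 4)   # compute pc + 4 from the old value
    pc.tick()                 # end of cycle: pc is updated
```

Within each loop iteration the read and the write target the same register, yet the read always returns the value from before the update, which is the property that lets `pc <= pc+4` work in one cycle.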

This is exactly what happens in the problem you raise.

The action

  add $1, $2, $3      # WB

will read the result from the pipeline register during the first phase and write it to $1 at the end of the cycle, while

 beq $1, $4, target  # ID

will read $1 and $4 during the first phase and write the result to the pipeline registers at the end of the cycle. Hence, without forwarding, it is the previous value of $1 that beq reads and passes on.
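The same-cycle read and write can be sketched in Python as follows. The function `beq_operands` and its parameters are hypothetical, chosen only for this illustration. The sketch assumes a register file whose write lands at the end of the cycle, and it assumes `$4` is already up to date, which is a simplification of the original sequence.

```python
def beq_operands(regs, wb_dest, wb_value, rs, rt, forward):
    """Return the (rs, rt) values beq sees in ID during the cycle in which
    another instruction is in WB, writing wb_value to register wb_dest."""

    def read(r):
        if forward and r == wb_dest:
            return wb_value   # bypass: take the value from the WB pipeline register
        return regs[r]        # register file still holds the old value

    operands = (read(rs), read(rt))
    regs[wb_dest] = wb_value  # the write only lands at the end of the cycle
    return operands


initial = {1: 0, 4: 7}        # old $1 is 0; the add in WB is about to write 7
stale = beq_operands(dict(initial), 1, 7, 1, 4, forward=False)
fresh = beq_operands(dict(initial), 1, 7, 1, 4, forward=True)
```

Without forwarding, `stale` holds the old value of `$1`, so the comparison is made on outdated data; with the bypass, `fresh` holds the value the add is writing.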

(edited according to comment below)

All of these explanations hold if the branch is handled by the standard hardware. In that case, the comparison is done by the ALU in the "EX" stage and the PC is updated at the end of this stage.

But this leads to a branch penalty of two cycles. To reduce this penalty, one can add hardware to perform the comparison in the ID stage. In that case, if the comparison requires a value that is still being computed (such as $1 in your example), a stall will be required.
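The effect of moving the comparison from EX to ID can be estimated with a small timing model. This is only a sketch under stated assumptions: a classic 5-stage pipeline, an ALU producer whose result is available at the end of EX, and a register file with no same-cycle write-then-read bypass. The function `nops_needed` and the constants are hypothetical names for this illustration.

```python
# Stage indices in a classic 5-stage pipeline (an assumption of this sketch).
IF, ID, EX, MEM, WB = range(5)


def nops_needed(first_usable_cycle, read_stage):
    """Bubbles required between a producer and a dependent consumer.

    The producer is fetched in cycle 0, so it occupies stage s in cycle s.
    A consumer fetched `distance` cycles later reads its operands in cycle
    distance + read_stage, which must not be earlier than first_usable_cycle.
    """
    distance = max(1, first_usable_cycle - read_stage)
    return distance - 1  # distance 1 means back-to-back, i.e. no bubble


# ALU result forwarded from the EX/MEM pipeline register: usable the cycle after EX.
FORWARDED = EX + 1
# No forwarding, register file written at the end of WB: usable the cycle after WB.
REGFILE_ONLY = WB + 1

print(nops_needed(FORWARDED, EX))      # branch compared in EX, with forwarding
print(nops_needed(FORWARDED, ID))      # branch compared in ID, with forwarding
print(nops_needed(REGFILE_ONLY, ID))   # branch compared in ID, no forwarding
```

Under these assumptions, comparing in ID costs one bubble after an adjacent `add` even with forwarding, and three bubbles without any forwarding, while comparing in EX with forwarding needs none.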

Alain Merigot
  • But the same situation is described as having no data hazard (thus no forwarding needed) in the data hazard section... I understand your answer and it's clear, but I still cannot reconcile these ideas... – Kindred Jan 09 '19 at 03:36
  • I just found that the book I am reading is confusing: it is probably using two different techniques (one is concurrent hardware, the other is forwarding as in your great answer) without saying so explicitly, which made me think that the first `add` in my example would not need forwarding if this kind of special hardware is used. – Kindred Jan 09 '19 at 03:49
  • I added some comments at the end of my answer. I do not know if the book is confusing, but branching is a very complex problem. Actually, data hazards depend on the actual hardware, and two different hardware designs are considered. – Alain Merigot Jan 09 '19 at 07:21
  • Yes, original MIPS handled branches in the ID stage, as part of reading registers. This is why MIPS's 1 branch delay slot is sufficient to fully hide the pipeline bubble of a taken branch in a [classic 5-stage RISC](https://en.wikipedia.org/wiki/Classic_RISC_pipeline#Control_hazards), by resteering the IF stage. So yes, @ptr_NE, the first `add` also needs forwarding in this case. Without forwarding the 2nd add would have to have left the WB stage in the cycle before the `beq` could go through the ID stage, so it would take 3 `nop`s instead of the usual two for `add` feeding an `add`. – Peter Cordes Jan 10 '19 at 06:26