The image above shows my solution for the optimal pipeline schedule for the sequence of five instructions on the left-hand side.
There is a single stall before the branch instruction is fetched. I inserted it because the branch comparison is performed during the ID stage, so the branch instruction needs the correct value of $s2 before then. With the stall, the add instruction's WB is aligned with the branch instruction's ID, meaning the branch instruction will correctly read $s2 from the register file in the second half of clock cycle 7.
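To double-check the cycle alignment, here is a minimal Python sketch of the timing. It assumes a classic five-stage pipeline (IF, ID, EX, MEM, WB) that issues one instruction per cycle with no other stalls. It also assumes that the add is the third instruction and the branch the fifth; those positions are my inference from the cycle-7 figure, and the names `ADD_FETCH` and `BRANCH_FETCH_NO_STALL` are just illustrative.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def stage_cycles(fetch_cycle):
    """Map each stage name to the clock cycle in which it occurs."""
    return {stage: fetch_cycle + i for i, stage in enumerate(STAGES)}

def stage_at(cycles, cycle):
    """Return the stage an instruction occupies in a given cycle, or None."""
    for stage, c in cycles.items():
        if c == cycle:
            return stage
    return None

# Assumed positions (illustrative): add is instruction 3, branch is instruction 5.
ADD_FETCH = 3
BRANCH_FETCH_NO_STALL = 5

def producer_stage_during_branch_id(stalls):
    """Return (cycle of the branch's ID, stage the add occupies in that cycle)."""
    add = stage_cycles(ADD_FETCH)
    branch = stage_cycles(BRANCH_FETCH_NO_STALL + stalls)
    return branch["ID"], stage_at(add, branch["ID"])

for stalls in (0, 1):
    cycle, stage = producer_stage_during_branch_id(stalls)
    print(f"stalls={stalls}: branch ID in cycle {cycle}, add is in {stage}")
```

Under these assumptions, one stall puts the branch's ID in cycle 7 alongside the add's WB, which matches my schedule. With no stall, the branch's ID falls in cycle 6 while the add is in MEM, so that is the overlap any forwarding path into ID would have to cover.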
That said, a colleague claims that the stall is unnecessary and that $s2 can be forwarded directly from the add instruction's EX stage to the branch instruction's ID stage. My confusion with this claim is that the branch instruction's dependency on $s2 would not be detected until midway through its ID stage, by which time the desired value of $s2 has already shifted to the ME/WB buffer, rendering a forward from EX impossible.
Which solution is correct and why?