My question is: how do they coexist and work together in a modern CPU architecture?
-
Why should they interfere with each other? The first tells you whether or not to jump on conditional branches, the second tells you where to jump (on indirect ones) – Leeor Dec 02 '13 at 19:22
-
@Leeor But I was thinking the BTB is used for every instruction fetched from the I$, and is indexed by the PC. Once there's a hit, there's no need for branch *prediction*, and we can go ahead and fetch the instruction at the PC stored in the BTB. And if it's a miss, the branch predictor comes into play and predicts the outcome of the branch. Given that the BTB has a hit rate of more than 90%, the branch predictor is rarely used then... Where am I wrong? – fyang29 Dec 02 '13 at 19:43
-
You only want to use the value in the BTB if the branch predictor says that you should predict the branch as taken. For instance, the branch may only be predicted taken for certain values in the branch history table (for a two-level adaptive predictor). – Danny Dec 02 '13 at 22:41
-
@Danny Thanks! I think it makes more sense now. – fyang29 Dec 03 '13 at 00:36
-
Related: [Branch target prediction in conjunction with branch prediction?](https://stackoverflow.com/q/21787457) – Peter Cordes Dec 14 '21 at 04:10
1 Answer
You've got it slightly reversed. On every fetch you index into your branch predictor, which tells you whether the instruction you have just fetched will be decoded into a taken branch. If not, you fetch the next sequential address. But if your branch predictor says it will be a taken branch, you don't know which instruction to fetch next, since you haven't decoded this instruction yet. So, in order not to waste cycles waiting for the branch to resolve, you use a Branch Target Buffer (or BTB).

A BTB stores the addresses to which branches previously redirected control flow; using this mechanism, you try to predict where control flow will be redirected this time. This technique has a 100% success rate for direct unconditional branches and function calls, and for returns when paired with a Return Address Stack. On conditional branches the success rate is slightly lower, but it is still very good given the high temporal locality of branch targets. As an example, consider the backward branch of a loop, which will always branch to the same location.
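To make the interplay concrete, here is a minimal sketch of fetch-stage next-PC selection. It assumes a simple 2-bit-counter direction predictor and a direct-mapped, partially-tagged BTB; the table sizes, tag bits, and names are made up for illustration and are not how any particular CPU does it:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical sizes and layouts -- real front ends differ widely.
constexpr std::size_t kBtbEntries  = 4096;
constexpr std::size_t kPredEntries = 4096;

struct BtbEntry {
    bool     valid  = false;
    uint64_t tag    = 0;   // high PC bits, to reduce aliasing
    uint64_t target = 0;   // last observed branch target
};

std::array<BtbEntry, kBtbEntries> btb;
std::array<uint8_t, kPredEntries> counters{};  // 2-bit saturating counters

// Fetch-stage next-PC selection: ask the direction predictor first,
// and only consume the BTB target if the branch is predicted taken.
uint64_t predict_next_pc(uint64_t pc) {
    std::size_t pidx = (pc >> 2) % kPredEntries;
    bool predicted_taken = counters[pidx] >= 2;      // states 2,3 = taken

    std::size_t bidx = (pc >> 2) % kBtbEntries;
    const BtbEntry& e = btb[bidx];
    bool btb_hit = e.valid && e.tag == (pc >> 14);   // tag = bits above the index

    if (predicted_taken && btb_hit)
        return e.target;      // redirect fetch to the predicted target
    return pc + 4;            // fall through: next sequential instruction
}
```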
When the branch instruction is actually resolved (usually in the Decode or Execute stage of the pipeline, depending on the implementation), you adjust the values in both the branch predictor and the BTB so that future predictions are based on more up-to-date information.
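Continuing the same sketch, the resolution-time training step might look like this (again, just an illustration reusing the made-up structures above, not a description of real hardware):

```cpp
// Resolution-time training (Decode/Execute, depending on the design):
// update the direction counters and refresh the BTB target.
void train(uint64_t pc, bool taken, uint64_t actual_target) {
    std::size_t pidx = (pc >> 2) % kPredEntries;
    if (taken) {
        if (counters[pidx] < 3) ++counters[pidx];    // saturate at strongly taken
    } else {
        if (counters[pidx] > 0) --counters[pidx];    // saturate at strongly not-taken
    }

    if (taken) {                 // only taken branches need a stored target
        std::size_t bidx = (pc >> 2) % kBtbEntries;
        btb[bidx].valid  = true;
        btb[bidx].tag    = pc >> 14;
        btb[bidx].target = actual_target;
    }
}
```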
Here is a pictorial explanation of how BTB lookup and update happen:
http://www-ee.eng.hawaii.edu/~tep/EE461/Notes/ILP/buffer.html

-
+1 Of course, real implementations can be more complex. E.g., the BTB may be untagged or use partial tags, so there is a chance of target prediction applying to a different branch (so even an unconditional direct jump can have its target mispredicted). As a cache of target predictions, even with full tags there is the potential for capacity and conflict (assuming it is not fully associative) misses. Indirect non-return jumps with multiple targets may also introduce mispredictions. Even the RAS can fail on overflow, context switch, branch misspeculation, etc. – Jul 29 '14 at 16:14
-
I have a few Qs. 1) How does the predictor vary between predicting for conditional branches and indirect branches, since an indirect branch has more possible targets? 2) Isn't the BTB looked at first to see if the instruction is even a branch (to decide whether to go to the predictor)? 3) What type of branch misprediction can be detected at the decode stage? (I know about the execute-stage realisation) – intrigued_66 Jan 19 '15 at 23:43
-
Nice answer! I think there might be a typo: "On unconditional branches" -> "On conditional branches". – Carl Cook Oct 27 '17 at 23:14