https://blog.cloudflare.com/branch-predictor/ contains an excellent analysis of the performance of branches on modern hardware.
One thing that surprised me was the finding that unconditional jumps take up space in the branch target buffer (BTB). Why?
Conditional branches need the BTB because, at the point where the CPU has just decoded the branch instruction and wants to fetch the next one, it does not yet know the value of the condition. An unconditional jump has no condition to wait for. There is an offset that has to be added to the IP at which the jump instruction was found, but that offset is a constant encoded in the instruction itself, so it seems to me that you already have it by the time you have the opcode. What am I missing?
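
For concreteness, here is a minimal sketch of the computation I have in mind, assuming x86-64 and only the two direct relative encodings `EB rel8` and `E9 rel32` with no prefixes. The function name `direct_jump_target` is made up for illustration. On x86 the displacement is relative to the end of the jump instruction, so the instruction length enters the sum as well.

```python
import struct


def direct_jump_target(code: bytes, address: int) -> int:
    """Target of an x86-64 direct relative JMP whose first byte sits at `address`.

    Assumption: only the unprefixed short (EB rel8) and near (E9 rel32) forms
    are handled; anything else is rejected.
    """
    opcode = code[0]
    if opcode == 0xEB:
        # JMP rel8: 1 opcode byte + 1 signed displacement byte
        length = 2
        (disp,) = struct.unpack_from("<b", code, 1)
    elif opcode == 0xE9:
        # JMP rel32: 1 opcode byte + 4 signed little-endian displacement bytes
        length = 5
        (disp,) = struct.unpack_from("<i", code, 1)
    else:
        raise ValueError(f"not a direct relative JMP: opcode {opcode:#04x}")

    # The displacement is added to the address of the *next* instruction,
    # then wrapped to 64 bits.
    return (address + length + disp) & 0xFFFFFFFFFFFFFFFF


# EB FE is the classic "jump to self" loop.
print(hex(direct_jump_target(bytes([0xEB, 0xFE]), 0x1000)))
# E9 with displacement 0x100, located at 0x401000.
print(hex(direct_jump_target(bytes([0xE9, 0x00, 0x01, 0x00, 0x00]), 0x401000)))
```

Everything the function needs is the instruction's own bytes and its address, which is why the need for a BTB entry puzzles me.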