Why are bgezal & bltzal basic instructions and not pseudo-instructions in MIPS?

Question

According to this MIPS instruction reference, there are two instructions (bgezal and bltzal) which perform a relative jump and link instead of just a relative jump if the branch is taken.

These instructions can be simulated with a bgez or bltz respectively, followed by a jal, which means that both bgezal and bltzalshould be classified as pseudo-instructions. However, both have opcodes assigned to them, hence they are classified as basic instructions.

What is the rationale for adding them to the basic instruction set and not making them pseudo-instructions? Also, why are only bgezal and bltzal included in the instruction set and not, for example blezal, bgzal etc?

It seems that you want to know the reason the designers had to design it that way. While I respect that kind of question, it's often regarded as off-topic, and the answer is often hard (or impossible) to find anyway. — harold, Dec 21 '13 at 13:09

Peter Cordes · Accepted Answer · 2019-10-29T01:43:28.973

jal uses a semi-absolute target encoding (replacing the low 28 bits of PC), while bgezal / bltzal are relative (adding an 18-bit signed displacement, imm16<<2). How to Calculate Jump Target Address and Branch Target Address?

They are classic MIPS's only branch-and-link (instead of jump-and-link), so are important for position-independent relocatable code. (You can even use one to get the current PC into a register and find out where you're executing from, unlike with jal).

You can encode bal (unconditional relative function call) as bgezal $zero, target.

You can get $ra=PC with a not-taken bltzal $zero, anywhere without needing any other setup. Doing that with bgezal would need a less-than-zero input register that would take an insn to create. b...al instructions always write $ra even if the branch is not taken. You want this for PC-relative code, until MIPS32r6 gave us addiupc for better PC-relative address generation.

Since they use an I-type instruction format like other branch instructions, there's room in the encoding for one register, so it made sense to make it optionally conditional instead of just having a bal instruction. The hardware logic for doing the "and link" was already there, and all the other relative branch instructions are conditional. Plus, having a condition that's not-taken for $zero could be convenient for reading pc.

Remember that MIPS instruction encodings were used directly as internal control signals in early MIPS hardware, so the one bit in the encoding that differs between them probably gets wired to an XOR gate that inverts (or not) the check on the sign bit. (As Konrad's answer points out, these branch conditions depend only on the MSB of the register because it's always against zero, so there's no latency of waiting for a 32-bit adder to produce a compare result.)

From http://www.mrc.uidaho.edu/mrc/people/jff/digital/MIPSir.html

0000 01ss sss1 0001 iiii iiii iiii iiii   BGEZAL
0000 01ss sss1 0000 iiii iiii iiii iiii   BLTZAL

This lack of flexibility in instruction-encoding (because it directly drove internal control signals instead of needing much transformation in decoding) is perhaps why there isn't just a single bal with a 28-bit range (from a 26-bit relative displacement). The hardware for relative branches was set up for I-type instructions with 16-bit immediates.

TL:DR: there are 2 conditional branch-and-link instructions because it was natural to implement an unconditional bal in terms of one of them, and the other came along nearly for free.

MIPS b (unconditional relative branch without link) is also a pseudo-instruction for beq $zero, $zero, target, or at the assembler's choice, for bgez $zero, target. (What is the difference between unconditional branch and unconditional jump (instructions in MIPS)?). The MIPS R3000 manual suggests beq $zero,$zero. (And more clearly documents that $ra=PC happens regardless of branching; That wasn't clear from the quick-reference sheets I was looking at while writing this answer originally.)

The compare-to-zero encodings only have one 5-bit register field, so they consume less coding space than beq / bne. That's a likely reason for choosing bgezal rather than beqal as one of the pair of conditional branches to provide.

I think your saying the two instructions exist because the design of the original MIPS chips made it the most natural way to implement a BAL instruction, and if so I agree with you. I don't have the same faith the other answers do that the full capability of the two instructions actually get used often enough to otherwise justify their existence. — Ross Ridge, Oct 06 '18 at 21:58
A not-taken branch-and-link still writes the link register, but it would actually have been possible to use a *taken* `bal` to read PC, with the relative target address being where fall-through would go: the instruction after the branch delay slot. That might be less efficient, although if it had become a common idiom of MIPS before `addiupc` existed, microarchitectures could maybe have special-cased that branch displacement some. (Like how x86 [avoids the equivalent 32-bit PIC idiom unbalancing return-address prediction](//blog.stuffedcow.net/2018/04/ras-microbenchmarks/#call0).) — Peter Cordes, May 27 '21 at 06:58

score 1 · Answer 2 · answered Dec 19 '13 at 11:38

bgez and bltz are not pseudo instructions.

bgezal and bltzal are the same, this is not strange.

Though it's RISC, not all instruction should be basic. Two instructioins need more memory and maybe more time if frequently used, and there are much space in opcodes, why not combine them to one?

score 1 · Answer 3 · edited May 23 '17 at 12:29

The main reason is efficiency.

Your initial assumption (that executing a bgez followed by a jal can be simulated by or is functionally equivalent to executing beqzal) is true, but it is probably less efficient to do so.

Why do pseudoinstructions exist in the first place? The University of Maryland's notes on pseudoinstructions and branching explain this. The answer lies in the way MIPS was designed. MIPS is a reduced instruction set. Instructions could stay in the ISA if there was a noticeable improvement in speed. If it could be written using two or more instructions, without a noticeable delay (because they weren't used too often), then those instructions were not included in the MIPS ISA. Rather, they became pseudoinstructions.

Lets take another pseudoinstruction, such as la, or load address. la is a pseudoinstruction that can be broken down into a lui instruction and an ori instruction. On a 32-bit MIPS architecture, each instruction as well as the size of each register is 32 bits. So in order to store a 32 bit address, two instructions are used. More information about the la instruction can be found here.

The bgezal and bltzal instructions are not psudoinstructions most likely because it is more efficient to perform the full operation in one instruction. The hardware must be able to perform the conditional check, jump to the branch address, and store the return address in one instruction. If the jal instruction was separated out, not only is this an unnecessary use of space, but on some hardware implementations this extra instruction could use up an execution cycle through the processor's data path, effectively slowing program execution.

Additional Sources:

@KonradLindenbach The original MIPS [R2000](http://en.wikipedia.org/wiki/R2000_(microprocessor)) and [R3000](http://en.wikipedia.org/wiki/R3000) chips were only able to complete about one instruction per cycle. — , Dec 22 '13 at 12:51
My understanding is that because MIPS is a Von Neumann architecture it cannot be a single cycle implementation which would require the separation of memory into instruction and data. — Konrad Lindenbach, Dec 23 '13 at 00:54
Perhaps I should not have used the phrase "single-cycle", I meant implementations like R2000 that perform at the rate of one instruction per cycle. I have edited my answer. Thanks @KonradLindenbach for pointing that out. — , Dec 23 '13 at 16:18
`jal` is semi-absolute (replacing the low 28 bits of PC), while `bgezal` / bltzal` are relative (adding an 18-bit displacement, imm16<<2). They are classic MIPS's only branch-and-link (instead of jump-and-link), so are important for position-independent relocatable code. (you can even use one to get the current PC into a register and find out where you're executing from, unlike with `jal`). [How to Calculate Jump Target Address and Branch Target Address?](https://stackoverflow.com/q/6950230) — Peter Cordes, Oct 06 '18 at 21:15

score 1 · Answer 4 · answered Dec 22 '13 at 05:26

1

What is the rationale for adding them to the basic instruction set and not making them pseudo-instructions?

Just because an instruction can be separated into pieces doesn't mean that it should be. I know this is a RISC, but there are still trade-offs to be made between the size of the instruction set and the performance of the system at large.

Two instructions implies longer execution time than a single instruction so the design team probably made the decision to include this instruction after seeing the impact that it would have on popular benchmarks.

Also, why are only bgezal and bltzal included in the instruction set and not, for example blezal, bgzal etc?

The simplest explanation I can give for this is that these instructions are easiest to implement: for both bgezal and bltzal only the sign bit must be checked.

answered Dec 22 '13 at 05:26

Konrad Lindenbach

4,911
1
26
28

I would expect that the designers of MIPS probably expected that when processing a "normal" branch instruction it would be necessary to use one pipeline stage to fetch the register operands and another to compare them before the system could know whether it was supposed to take the branch, but that logic which merely had to check one bit of one source register could be squeezed into the operand-fetch cycle, thus allowing such branches to run one cycle faster than they otherwise would. – supercat Feb 18 '15 at 01:07
@supercat: Doing compares during operand-fetch sounds like a neat idea until you consider the problem of `slt` / `bne` sequences, and in general forwarding from a previous ALU instruction. What MIPS I (R2000) *actually* did was run branch conditions in the first half-cycle of EX, and IF only starting in the 2nd half-cycle, so forwarding was possible and branch latency was limited to 1 cycle, fully hidden by 1 delay slot. [How does MIPS I handle branching on the previous ALU instruction without stalling?](https://stackoverflow.com/q/56586551) – Peter Cordes May 27 '21 at 06:51

Why are bgezal & bltzal basic instructions and not pseudo-instructions in MIPS?

4 Answers4

Linked