Does the MARS simulator have PC-relative addressing refer to the number of bytes instead of words?

Question

I'm currently reading David Patterson and John Hennessy's book "Computer Organization and Design", 4th Edition. At one point the book says:

Since all MIPS instructions are 4 bytes long, MIPS stretches the distance of the branch by having PC-relative addressing refer to the number of words to the next instruction instead of the number of bytes.

As soon as I read that, I went over to the MARS simulator to see it in practice. To my surprise, I observed that the $pc register doesn't follow that rule and always holds the byte memory address.

Shouldn't the $pc register be something like this before the execution of the below instructions located in the instruction addresses?

Instruction Address |    $pc Content
                    |
    0x00400000      |     0x00100000
    0x00400004      |     0x00100001
    0x00400008      |     0x00100002
      ....          |         ....

Notice it increments by 4 though. `$pc` stores byte address but it's always aligned. To observe the behavior described, you'd have to look at the machine code of a branch and there you'd see word offset not byte. — Jester, May 29 '19 at 17:17
So $pc always stores a byte address, but instructions like j and jal store the address word-wise, and a left shift happens when they execute? — NickDelta, May 29 '19 at 17:30

Peter Cordes · Accepted Answer · 2019-05-30T08:51:06.410

Another way to look at this: the low 2 bits of PC are fixed at 0.

Under your proposal, the high 2 bits would be fixed at 0. And jr $ra or other jump-to-register instructions would also have to left-shift the register instead of simply setting $pc = function-pointer.

(Or else the difference would be architecturally visible, and accessing code as data or vice versa would require a shift to convert between the data address and the code address of the same word.)

As Jester points out, $pc is a normal pointer to an instruction word, the same kind of byte address MIPS deals with everywhere else. MIPS uses byte-addressable memory, but word loads have to be aligned (until MIPS32r6). So $pc increments by 4 instead of being scaled by 4 every time it's used.

The only scaling needed with the actual design is of immediates for branch (I-type) and jump (J-type). See How to Calculate Jump Target Address and Branch Target Address? for how that works. That's just a matter of what position you wire the immediate bits into an adder, leaving the lower bits zero. And it only happens in the decoding of those instructions; everything else just works with normal byte addresses.
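As a rough illustration of that scaling, here is a minimal Python sketch of the usual MIPS32 target calculations. The function names are made up for this example, and the branch delay slot is ignored apart from the +4. Shifting the immediate left by 2 is the software equivalent of wiring its bits two positions higher and leaving the low bits zero.

```python
def sign_extend16(imm16: int) -> int:
    """Interpret a 16-bit field as a two's-complement signed integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def branch_target(branch_addr: int, imm16: int) -> int:
    """I-type branch: the immediate counts words, relative to the
    instruction after the branch (branch_addr + 4)."""
    return (branch_addr + 4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF


def jump_target(jump_addr: int, imm26: int) -> int:
    """J-type jump: the 26-bit field is a word index placed above two
    zero bits; the top 4 bits come from the address after the jump."""
    return ((jump_addr + 4) & 0xF0000000) | ((imm26 & 0x03FFFFFF) << 2)


# A j at 0x00400008 whose 26-bit field holds 0x0100000 lands on 0x00400000.
print(hex(jump_target(0x00400008, 0x0100000)))   # 0x400000
# A branch at 0x00400000 with immediate 3 lands three words past 0x00400004.
print(hex(branch_target(0x00400000, 3)))         # 0x400010
```

Note that the word-style value 0x00100000 from the question's table does show up, but only inside the encoded `j` instruction, not in `$pc`.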

Very nice observation; both you and Jester helped me understand a lot. — NickDelta, May 29 '19 at 18:59

score 0 · Answer 2 · answered May 30 '19 at 06:18

... stretches the distance of the branch by having PC-relative addressing refer to the number of words to the next instruction instead of the number of bytes.

This sentence only means:

The instruction 0x1000nnnn (b nnnn) stored at the address X will jump to the instruction at the address X+4+4*nnnn, not to the instruction at the address X+4+nnnn.

The sentence says nothing about the PC register and its value itself.
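To make the "stretching" concrete, the following Python sketch decodes a `0x1000nnnn` word and compares the real word-scaled target with a hypothetical byte-scaled one. The address `X` and the helper name `decode_b_offset` are assumptions chosen only for this example.

```python
def decode_b_offset(word: int) -> int:
    """Return the signed 16-bit field nnnn of a 0x1000nnnn encoding."""
    if (word >> 16) != 0x1000:
        raise ValueError("expected an encoding of the form 0x1000nnnn")
    imm = word & 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


X = 0x00400020                      # example address of the branch itself
nnnn = decode_b_offset(0x10000005)  # word offset 5

word_scaled = X + 4 + 4 * nnnn      # what MIPS actually does
byte_scaled = X + 4 + nnnn          # hypothetical byte-offset design

print(hex(word_scaled))             # 0x400038
print(hex(byte_scaled))             # 0x400029, not even word-aligned

# Furthest forward reach of the 16-bit field, in bytes past X + 4:
print(4 * 0x7FFF)                   # 131068 with word offsets
print(0x7FFF)                       # 32767 with byte offsets
```

The same 16 bits reach four times as far when they count words, and every reachable target is automatically aligned.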

I observed that the $pc register doesn't follow that rule and always holds the byte memory address.

Shouldn't the $pc register be something like this before the execution of the below instructions located in the instruction addresses?

The question here is: What exactly is the $pc register?

On some CPUs (like ARM), there are instructions that would be written as addu $t0,$t0,$pc in MIPS assembly language. For such CPUs, the question is easy to answer:

The value of the $pc register is the value which will be added to $t0 if the instruction addu $t0,$t0,$pc is executed.

On real MIPS CPUs (not MIPS emulators), the PC register is some kind of memory (a 30-bit latch) which can hold 30 bits of information.

However, when talking about information stored in some memory, we have to define how to interpret the information:

The bits 10000111 may be interpreted as 135 (unsigned), 87 (BCD), -121 (two's complement), -120 (one's complement), -7 (sign and absolute), SOME_ENUM_CONSTANT (enumeration), -0.09, 0.7, 1.35 (various floating- and fixed-point variants) ...
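The integer readings in that list can be checked with a few lines of Python. This is only a sketch of the standard definitions for an 8-bit value; the variable names are chosen for this example.

```python
bits = 0b10000111

unsigned = bits                                    # plain binary
bcd = (bits >> 4) * 10 + (bits & 0xF)              # two decimal digits: 8 and 7
twos_complement = bits - 256 if bits & 0x80 else bits
ones_complement = -(~bits & 0xFF) if bits & 0x80 else bits
sign_magnitude = -(bits & 0x7F) if bits & 0x80 else bits

print(unsigned, bcd, twos_complement, ones_complement, sign_magnitude)
# 135 87 -121 -120 -7
```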

It is well defined that the bits 0000...011100 in the PC register point to the instruction at address 0x70. However, it is not defined if this value has to be written as PC=0x1C or as PC=0x70.

Therefore some MIPS emulator might display $pc=0x400000 and another one might display $pc=0x100000 for the same value in the PC register!

However, I think (nearly) all MIPS emulators will display $pc=0x400000, because users are interested in the address of the instruction that the PC points to.
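A small Python sketch of the two display conventions, assuming the 30-bit latch model described in this answer; the helper names are hypothetical and not taken from any emulator.

```python
PC_BITS = 30  # width of the PC latch in the model assumed above


def latch_from_address(byte_address: int) -> int:
    """Drop the two always-zero low bits of an aligned instruction address."""
    if byte_address % 4 != 0:
        raise ValueError("instruction addresses must be word-aligned")
    return (byte_address >> 2) & ((1 << PC_BITS) - 1)


def show_as_byte_address(latch: int) -> str:
    """Display convention 1: append the two zero bits again."""
    return f"$pc=0x{latch << 2:x}"


def show_as_word_index(latch: int) -> str:
    """Display convention 2: print the stored bits as they are."""
    return f"$pc=0x{latch:x}"


latch = latch_from_address(0x00400000)
print(show_as_byte_address(latch))  # $pc=0x400000
print(show_as_word_index(latch))    # $pc=0x100000
```

Both strings describe the same stored bits; only the interpretation chosen by the display differs.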

I think you have made a small mistake; please correct me if I'm wrong. In J-type and branch instructions where a jump is performed, no calculation of the next instruction (that +4) occurs. That calculation is done only on the PC. The PC always holds the address of the next instruction to be executed (PC+4). — NickDelta, May 30 '19 at 08:42
@NickDelta: [How to Calculate Jump Target Address and Branch Target Address?](//stackoverflow.com/q/6950230) MIPS branch offsets are relative to the address of the branch-delay slot, i.e. to PC while the branch is "executing" — Peter Cordes, May 30 '19 at 08:48
@PeterCordes Very useful question, thanks for posting it, helped me understand even more. — NickDelta, May 30 '19 at 09:30
@NickDelta The text you are citing is explicitly speaking about "PC-relative addressing". This means that instructions like `BEQ`, `BGEZ`, `BNE` etc. are meant. J-type instructions are not PC-relative. — Martin Rosenau, May 30 '19 at 14:26
