Construct a C statement from the following MIPS instructions.
(variable f is in $s0; the base addresses of arrays A and B are in $s6 and $s7)


addi $t0, $s6, 4        //$t0 = &A[1]
add  $t1, $s6, $0       //$t1 = &A[0]
sw   $t1, 0($t0)        //A[1] = &A[0]
lw   $t0, 0($t0)        //$t0 = &A[0]
add  $s0, $t1, $t0      //f = &A[0] + &A[0]

The instructions on the left are as given; the comments on the right are my attempts to understand them.
The final answer I got is f = &A[0] + &A[0], but that doesn't seem right. What am I getting wrong?

Peter Cordes
Sangmin Lee
  • I agree, that makes little sense but seems to be correct. – Jester Sep 14 '20 at 20:19
  • `lw $t0, 0($t0)` is a memory load, not an effective address calculation. – Raymond Chen Sep 14 '20 at 20:20
  • @RaymondChen yes, but the previous instruction stored an address there. – Jester Sep 14 '20 at 20:20
  • @Jester True, I guess it depends on whether the comment is explaining each instruction individually or trying to abstract higher-level meaning. – Raymond Chen Sep 14 '20 at 20:21
  • @RaymondChen: The comment on the `lw` is correct, but yes you could include a reminder of the memory location being reloaded from, like `$t0 = A[1] = &A[0]`, as well as the original meaning of that value. – Peter Cordes Sep 14 '20 at 20:24
  • @RaymondChen thanks for the help, the problem seemed weird from the start 'cause it doesn't even include $s7 which it mentioned. Guess it's just not a good problem – Sangmin Lee Sep 14 '20 at 20:54
  • Yeah, this problem is super weird. It might be a mistake in the book or wherever you found it. Do note that your C expression omits the side-effect of updating memory. Expressing that in C might require some casting (e.g. to `uintptr_t`) because you can't actually add two pointers in C. – Peter Cordes Sep 14 '20 at 22:15
  • What's this from? There are multiple questions based on this code, because the code is so weird people assume they've read it wrong. [What's wrong in my thinking in translating MIPS code to C?](https://stackoverflow.com/q/66790508) / [MIPS to C Translation](https://stackoverflow.com/q/18885288) / [MIPS addi instruction to array base](https://stackoverflow.com/q/10472452). A google search for `site:stackoverflow.com addi $t0, $s6, 4` finds mostly this, it's a unique enough combo of operands. – Peter Cordes Mar 25 '21 at 03:12
  • @PeterCordes Must be from some textbook or something. I got it from a quiz in my lecture. – Sangmin Lee Mar 25 '21 at 13:46

1 Answer

You're not crazy, the code really is that weird!

Adding two pointers basically never makes sense, so this is kind of a trick question.
The equivalent C does look wrong / insane:

intptr_t *A = ...;  // in $s6

 A[1] = (intptr_t)&A[0];
 f = A[1] + (intptr_t)&A[0];

Note that signed overflow is undefined behaviour in C, so it's legal to compile it to a MIPS add which will trap on signed overflow. If we'd used uintptr_t, the required overflow semantics would be wrapping / truncation, which add doesn't implement.

(Real-world C compilers for MIPS always use addu / addiu, not add, even for signed int, because undefined behaviour means anything is allowed, including wrapping. Wrapping is even required if you compile with gcc -fwrapv. Since MIPS is a 2's complement machine, addu performs the same binary operation as add; it differs only in not trapping on signed overflow, which happens when both inputs have the same sign but the result has the opposite sign.)


Here is C that compiles back to something closer to the given asm, or at least represents every asm operation with a C temporary variable:

I used GNU C global register variables instead of function args so the function body would use the actual correct registers (and without cluttering the asm with extra instructions to save/restore and init those registers). This lets me get GCC to make a block of asm that has s registers as inputs and outputs, instead of following the normal calling convention.

#include <stdint.h>

register intptr_t *A  asm("s6");
// register char  *B  asm("s7");    // unused, no idea what type makes sense
register intptr_t f asm("s0");

void foo()
{
  volatile intptr_t *t0_ptr = A+1;  // volatile forces store and reload
  intptr_t t1 = (intptr_t)A;

  *t0_ptr = t1;                  //sw   $t1, 0($t0)       //A[1] = &A[0]
  intptr_t t0_int = *t0_ptr;     //lw   $t0, 0($t0)       //$t0 = &A[0]
  f = t0_int + t1;               //add  $s0, $t1, $t0     //f = &A[0] + &A[0]
  //return f;
}

Note that $t0 gets used for 2 different things here, with different types: one being a pointer into the array, and the other a value from the array. I expressed this with two different C variables, because that's how things normally go. (Compilers will reuse the same register for a different variable when one is "dead" before / as the other one is needed.)

The resulting asm from GCC 5.4 for MIPS, with options to make MARS-compatible asm (-O2 -march=mips3 -fno-delayed-branch), is below. MIPS3 means no load delay slots, matching the code in the question, which uses the lw result in the instruction right after the load. (Godbolt compiler explorer)

foo:
        move    $2,$22         # $v0, $s6   pointless copy into $v0
        sw      $22,4($2)      # A[1] = A
        lw      $3,4($22)      # v1 = A[1]
        addu    $16,$22,$3     # $s0 = f = (intptr_t)A + A[1]
        j       $31
        nop                                  # branch-delay slot

(GCC uses numeric register names, not the ABI names like $s? for call-preserved, $t? for call-clobbered scratch regs, etc. http://www.cs.uwm.edu/classes/cs315/Bacon/Lecture/HTML/ch05s03.html has a table.)

Another way to write it, with less rigour: the important difference is the lack of volatile to force the compiler to reload.

void bar() {
  A[1] = (intptr_t)&A[0];   // cast added: A holds intptr_t, not pointers
  f = A[1] + (intptr_t)&A[0];
}

bar:
        move    $2,$22          # still a useless copy
        sw      $22,4($2)
        sll     $16,$22,1       # 2 * (intptr_t)A;   no reload, just CSE the store value.
        j       $31
        nop

Of course there would be other ways to express this, e.g. using A as an array of pointers instead of an array of intptr_t, int, or int32_t.

I chose integers because C pointer types magically scale by the type width when you do pointer addition.

Peter Cordes