I'm trying to convert a hex field to decimal by shifting while keeping the sign. I can't get my 'simm' variable to hold the signed immediate, so instructions with signed operands are disassembled incorrectly.

void disassembleInstr(uint32_t pc, uint32_t instr) {
    uint32_t opcode;      // opcode field
    uint32_t rs, rt, rd;  // register specifiers
    uint32_t shamt;       // shift amount (R-type)
    uint32_t funct;       // funct field (R-type)
    uint32_t uimm;        // unsigned version of immediate (I-type)
    int32_t simm;         // signed version of immediate (I-type)
    uint32_t addr;        // jump address offset field (J-type)

    opcode = instr >> 26;
    rs = (instr >> 21) & 0x1f;
    rt = (instr >> 16) & 0x1f;
    rd = (instr >> 11) & 0x1f;
    shamt = (instr >> 6) & 0x1f;
    funct = (instr & 0x3f);
    uimm = instr & 0xffff;
    simm = (instr << 16) >> 16; // shift sign bit to left to 
    addr = instr & 0x3ffffff; // keep the low 26 bits

    cout << hex << setw(8) << pc << ": ";
    switch(opcode) {
        case 0x00:
        switch(funct) {
            case 0x00: cout << "sll " << regNames[rd] << ", " << regNames[rs] << ", " << dec << shamt; break;
            case 0x03: cout << "sra " << regNames[rd] << ", " << regNames[rs] << ", " << dec << shamt; break;
            case 0x08: cout << "jr " << regNames[rs]; break;
            case 0x10: cout << "mfhi " << regNames[rd]; break;
            case 0x12: cout << "mflo " << regNames[rd]; break;
            case 0x18: cout << "mult " << regNames[rs] << ", " << regNames[rt]; break;
            case 0x1a: cout << "div " << regNames[rs] << ", " << regNames[rt]; break;
            case 0x21: cout << "addu " << regNames[rd] << ", " << regNames[rs] << ", " << regNames[rt]; break;
            case 0x23: cout << "subu " << regNames[rd] << ", " << regNames[rs] << ", " << regNames[rt]; break;
            case 0x2a: cout << "slt " << regNames[rd] << ", " << regNames[rs] << ", " << regNames[rt]; break;
            default: cout << "unimplemented";
        }
        break;
        case 0x02: cout << "j " << hex << ((pc + 4) & 0xf0000000) + addr * 4; break;
        case 0x03: cout << "jal " << hex << ((pc + 4) & 0xf0000000) + addr * 4; break;
//        case 0x04: cout << "beq " << regNames[rs] << ", " << regNames[rt] << ", " <<  + uimm;   break;
//        case 0x05: cout << "bne " << regNames[rs] << ", " << regNames[rt] << ", " <<  + uimm;   break;
//        case 0x09: cout << "addiu " << regNames[rt] << ", " << regNames[rs] << dec << simm; break;
//        case 0x0c: cout << "andi " <<  regNames[rt] << ", " << regNames[rs] << dec << simm; break;
        case 0x0f: /* lui */ break;
        case 0x1a: cout << "trap " << hex << addr; break;
        case 0x23: /* lw */ break;
        case 0x2b: /* sw */ break;
        default: cout << "unimplemented";
    }
    cout << endl;
}

Here is an example of the wrong output I am getting:

400000: j 400114
400004: sw $ra, fffc($sp)
400008: sw $fp, fff8($sp)
40000c: addiu $fp, $sp, 65528
400010: addiu $sp, $fp, 65124
400014: addiu $k1, $zero, 1

Here is the intended output:

400000: j 400114
400004: sw $ra, -4($sp)
400008: sw $fp, -8($sp)
40000c: addiu $fp, $sp, -8
400010: addiu $sp, $fp, -412
400014: addiu $k1, $zero, 1

Edit: New output after implementing the suggestion:

400000: j 400114
400004: sw $ra, fffffffc($sp)
400008: sw $fp, fffffff8($sp)
40000c: addiu $fp, $sp, -8
400010: addiu $sp, $fp, -412
400014: addiu $k1, $zero, 1
Nathan1324

1 Answer

instr has an unsigned type (uint32_t), so shifting it left and then right simply clears the 16 most significant bits. It does not perform the sign extension you were hoping for.

In fact, left shifting a 1 into the sign bit of a signed integer is undefined behavior according to the C standard. So even if instr were a signed number, the left-shift/right-shift trick would not be allowed (although it would work on any sensible machine).

To accomplish the task without violating any rules, replace this:

uimm = instr & 0xffff;
simm = (instr << 16) >> 16; // shift sign bit to left to 

with this:

uimm = instr & 0xffff;
simm = uimm;   
if ( simm & 0x8000 )
    simm -= 65536;
user3386109
  • Also: `simm = (int16_t)uimm;` which probably generates better assembly. – fuz Feb 02 '17 at 10:20
  • I tried your suggestion and I got nearly all 0's for all the offsets... Any ideas? – Nathan1324 Feb 02 '17 at 16:42
  • Also, I'm curious why you suggested using the integer 65536? – Nathan1324 Feb 02 '17 at 16:47
  • @Nathan1324 Perhaps I wasn't clear. I have updated the answer. Your question has four examples, and you can try each of them by hand, e.g. `65124 - 65536 = -412`. And you can [prove it by exhaustive testing](https://en.wikipedia.org/wiki/Proof_by_exhaustion), since there are only 32768 values between 0x8000 and 0xffff. – user3386109 Feb 02 '17 at 20:39
  • @fuz True on 2's complement machines, but not portable in general. Per the C standard, §6.3.1.3/3 (emphasis added): *"Otherwise, the new type is signed and the value cannot be represented in it; either **the result is implementation-defined** or an implementation-defined signal is raised."* So if every nanosecond were critical, and portability were not a concern, then I would agree with your suggestion. However, given that OP's code is mostly a bunch of `cout` statements, I don't see how saving a few nanoseconds is of any use. – user3386109 Feb 02 '17 at 20:40
  • So I realized that the continuity of 0's I was getting for the offset values was in fact due to my error, not your suggestion. So what I am getting now is still hex. I edited the question to reflect the new output. – Nathan1324 Feb 02 '17 at 23:53
  • @user3386109 All interesting machines have the implementation-defined part you'd expect and personally I know not a single one's complement machine in contemporary use. You can safely ignore the other possibilities; if you can't, you would know anyway. – fuz Feb 03 '17 at 00:43
  • @Nathan1324 Good, glad to hear that you got it working. The hex output must be due to a statement like `cout << hex << simm`, but I can't say for sure because the code doesn't contain the implementation for the `sw` instruction. [This answer](http://stackoverflow.com/a/479377/3386109) and the comments underneath it have useful information about formatting. – user3386109 Feb 03 '17 at 08:32