
I need to write a MIPS assembler in C/C++. Before I start writing code, I want to take some time to plan. There are about 15 MIPS instructions I need to account for, including J but not JR. The program needs to take in a file that has .text and .data sections, .word directives, and labels, then output a file whose first line gives, in decimal, the number of instructions and the number of words of data. The following lines are the machine code encoded in hex. The final set of lines consists of hexadecimal values representing the initial values of the words in the data segment. I know I'll need two passes, so that the labels are collected first and can then be resolved for instructions such as J. Basically, I'm looking for advice on how to set up the data structures. Should I use an array of strings that holds the opcode, RS, RT, RD, etc., and then convert that to hex somehow? Or is there a better way? Any advice from someone with experience would be appreciated. Thanks for your help and suggestions!

  • 5
    Read up on lexing and parsing and even compiler theory. – Thomas Matthews Jan 30 '15 at 01:45
  • 3
    @ThomasMatthews Does he have to? Only if he needs to assemble some specific syntax with tricky expressions. – mangusta Jan 30 '15 at 02:13
  • 1
    Check out my suggestion for writing an assembler using parsing technology: http://stackoverflow.com/a/1317779/120163 – Ira Baxter Jan 30 '15 at 02:54
  • 1
    You *are* aware that there are plenty of perfectly good MIPS assemblers out there already, right? (Or is this a class project?) – duskwuff Jan 30 '15 at 04:02
  • 1
    @duskwuff Yes, it is for a class project, which is why I want to do it myself and not just copy code. I am, however, looking for a little help in setting up the structure(s) needed. – CoderRightInTheProgram Jan 30 '15 at 05:06
  • 2
    A MIPS assembler is rather more complex than an assembler for most ISAs, due to features like pseudo-instructions and branch delay slots. – EOF Jan 30 '15 at 09:44
  • If this is just an assembler, then all you should have to generate is object code, right? You just need to match the asm mnemonics to the correct machine code instructions. Classic theory. You generate a top to bottom left to right b-tree and you are off to the races. – FlyingGuy Feb 10 '15 at 04:00

1 Answer


I actually did this a long time ago for a class project! You're right about needing two passes. However, don't use an array of strings for the registers. In fact, you don't need to store strings at all. You can put the opcodes in one enum and the registers in another. For 15 instructions, you can easily do most of the work by hand-coding switch-case or if-else statements rather than designing a fully generalized solution. It might be tempting to use regular expressions, but for your problem it's not worth the effort (though you should definitely take any opportunity to learn regex if you have the time!).

Then use hashmap-like structures to map the register and opcode names to their numeric (hex) values, and use those. You can do any address calculations directly in code. This is just a suggestion; you should definitely experiment. My main point is that when you read a string, you shouldn't store it in the same form if you can process it first and store something more meaningful (read: an object).
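A minimal sketch of the enum-plus-lookup-table idea, assuming the standard MIPS32 R/I/J field layout. The names `OpInfo`, `Instruction`, `opTable`, `regTable`, and `encode` are made up for illustration, and only a handful of instructions and registers are listed; the real assignment would fill in the rest.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

enum class Format { R, I, J };

// Hypothetical subset; extend with the rest of the assignment's instructions.
enum class Op { ADD, SUB, ADDI, LW, SW, BEQ, J };

struct OpInfo {
    Op op;
    Format format;
    uint32_t opcode;  // bits 31..26
    uint32_t funct;   // bits 5..0, only meaningful for R-format
};

// Mnemonic -> encoding info, looked up once while parsing a line.
const std::unordered_map<std::string, OpInfo>& opTable() {
    static const std::unordered_map<std::string, OpInfo> table = {
        {"add",  {Op::ADD,  Format::R, 0x00, 0x20}},
        {"sub",  {Op::SUB,  Format::R, 0x00, 0x22}},
        {"addi", {Op::ADDI, Format::I, 0x08, 0x00}},
        {"lw",   {Op::LW,   Format::I, 0x23, 0x00}},
        {"sw",   {Op::SW,   Format::I, 0x2B, 0x00}},
        {"beq",  {Op::BEQ,  Format::I, 0x04, 0x00}},
        {"j",    {Op::J,    Format::J, 0x02, 0x00}},
    };
    return table;
}

// Register enum values are the 5-bit register numbers themselves.
enum Reg : uint32_t { ZERO = 0, T0 = 8, T1 = 9, T2 = 10, S0 = 16 };

const std::unordered_map<std::string, Reg>& regTable() {
    static const std::unordered_map<std::string, Reg> table = {
        {"$zero", ZERO}, {"$t0", T0}, {"$t1", T1}, {"$t2", T2}, {"$s0", S0},
    };
    return table;
}

// What a parsed line is stored as: numbers, not strings.
struct Instruction {
    OpInfo info;
    uint32_t rs = 0, rt = 0, rd = 0, shamt = 0;
    int32_t imm = 0;      // I-format immediate or branch offset
    uint32_t target = 0;  // J-format word address (byte address >> 2)
};

// Pack the fields into one 32-bit machine word.
uint32_t encode(const Instruction& ins) {
    const uint32_t op = ins.info.opcode << 26;
    switch (ins.info.format) {
    case Format::R:
        return op | (ins.rs << 21) | (ins.rt << 16) | (ins.rd << 11) |
               (ins.shamt << 6) | ins.info.funct;
    case Format::I:
        // Keep only the low 16 bits of the (possibly negative) immediate.
        return op | (ins.rs << 21) | (ins.rt << 16) |
               (static_cast<uint32_t>(ins.imm) & 0xFFFFu);
    case Format::J:
        return op | (ins.target & 0x03FFFFFFu);
    }
    return 0;  // unreachable for valid formats
}
```

With this layout, `add $t0, $t1, $t2` becomes an `Instruction` with `rd = T0`, `rs = T1`, `rt = T2`, and `encode` packs it into a single word that can be printed in hex. The switch on `Format` is the kind of hand-coded dispatch that stays manageable at 15 instructions.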

Basically, you only need the first pass to collect the labels and their addresses. You can do everything else in the second pass. If you look at the typical compiler/assembler flow chart in any operating systems textbook, you can emulate each step; that's what I did.
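The two passes could be sketched roughly as follows. This is only an illustration under several assumptions: just the text segment is handled, each non-empty line holds at most one label and one 4-byte instruction, `#` starts a comment, and branch offsets are counted in words relative to the instruction after the branch (the usual MIPS convention; a course's simplified ISA may define it differently). The helper names `trim`, `collectLabels`, `jumpField`, and `branchOffset` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Strip leading and trailing whitespace.
std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Pass 1: walk the source once and record the address of every label.
std::unordered_map<std::string, uint32_t> collectLabels(
        const std::vector<std::string>& lines, uint32_t base) {
    std::unordered_map<std::string, uint32_t> symbols;
    uint32_t address = base;
    for (const std::string& raw : lines) {
        std::string line = raw.substr(0, raw.find('#'));  // drop comments
        const std::size_t colon = line.find(':');
        if (colon != std::string::npos) {
            symbols[trim(line.substr(0, colon))] = address;
            line = line.substr(colon + 1);
        }
        const std::string body = trim(line);
        // Directives such as .text do not occupy an instruction slot.
        if (!body.empty() && body[0] != '.') address += 4;
    }
    return symbols;
}

// Pass 2 helpers: turn a resolved label address into an instruction field.

// J-format target: the word address, truncated to 26 bits.
uint32_t jumpField(uint32_t targetAddress) {
    return (targetAddress >> 2) & 0x03FFFFFFu;
}

// Branch offset in words, relative to the instruction after the branch.
int32_t branchOffset(uint32_t targetAddress, uint32_t pc) {
    return (static_cast<int32_t>(targetAddress) -
            static_cast<int32_t>(pc + 4)) / 4;
}
```

In the second pass, each instruction is parsed and encoded in order; whenever an operand is a label, it is looked up in the map from pass 1 and converted with `jumpField` or `branchOffset`. Forward references work because the whole symbol table already exists before any encoding starts.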

Hope this helps!

Aniruddha