3

In MIPS, while using a jump instruction, we use a label.

again: nop
    j again

When execution reaches the jump instruction, the label again says where to go, and the actual address at that label is used. I want to know where the label again is stored. Say nop is stored at 0x00400000 and the jump instruction is at 0x00400004. Where, then, is again kept, and how does MIPS know that again refers to 0x00400000? Is it stored in the Dynamic Data area of the memory map? This is the memory map I've been provided for MIPS

I've also included the question which caused this confusion below, for reference.

Give the object code in hexadecimal for the following branch (beq, bne) and jump (j) instructions.

... # some other instructions
again:  add ... # there is an instruction here and meaning is insignificant
    add ... # likewise for the other similar cases
    beq    $t0, $t1, next
    bne  $t0, $t1, again
    add ...
    add ...
    add ...
next:   j   again

Assume that the label again is located at memory location 0x10 01 00 20. If you think that you do not have enough information to generate the code explain.

ali.salemwala

3 Answers

2

The label itself is not stored anywhere. It is just a symbolic address for the assembler/linker. The encoded j again instruction stores the actual resulting address, as a number.

The linker glues all object files together, merging symbols across object files and filling in the correct relative addresses, and creates a relocation table for the OS loader, producing the executable file.

When the OS loads the executable, it also loads the relocation table, patches the instructions that work with absolute addresses according to the address where the binary was actually loaded, then throws the relocation table away and executes the code.

So labels are just a "source thing" for the programmer: an alias for a particular fixed memory address, which saves the programmer from counting instruction sizes and calculating jump offsets or variable addresses in their head.

You may want to check the "list file" from your assembler (often the /l switch) when assembling some source, to see the actual machine code bytes produced (none for labels).
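To make the "labels are not stored anywhere" point concrete, here is a minimal sketch of a toy two-pass assembler in Python. It is not how MARS or any real assembler is implemented; the function names, the two-instruction source, and the base address 0x00400000 are assumptions for illustration. The label only ever lives in the `symbols` dictionary while assembling; the output contains nothing but instruction words.

```python
# Toy two-pass "assembler" (hypothetical, handles only nop and j).
BASE = 0x00400000  # assumed start address of the text segment

source = [
    "again: nop",
    "       j again",
]

def first_pass(lines, base):
    """Record label -> address. A label occupies no bytes itself."""
    symbols = {}
    instructions = []
    address = base
    for line in lines:
        text = line.strip()
        if ":" in text:
            label, text = text.split(":", 1)
            symbols[label.strip()] = address
            text = text.strip()
        if text:
            instructions.append((address, text))
            address += 4  # every MIPS instruction is 4 bytes long
    return symbols, instructions

def second_pass(instructions, symbols):
    """Replace each label reference by a number inside the instruction word."""
    words = []
    for address, text in instructions:
        parts = text.split()
        if parts[0] == "nop":
            word = 0x00000000
        elif parts[0] == "j":
            target = symbols[parts[1]]
            # opcode 0b000010 in the top 6 bits, target / 4 in the low 26 bits
            word = (0x02 << 26) | ((target >> 2) & 0x03FFFFFF)
        else:
            raise ValueError("unsupported instruction: " + text)
        words.append((address, word))
    return words

symbols, instructions = first_pass(source, BASE)
words = second_pass(instructions, symbols)
for address, word in words:
    print(f"0x{address:08x}  0x{word:08x}")
```

After the second pass the `symbols` dictionary can be discarded; only the two 32-bit words end up in memory.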


Your "task" code, when assembled at 0x00400000, looks like this (I made each add do t1=t1+t1, just to have something there):

 Address    Code        Basic                     Source

0x00400000  0x01294820  add $9,$9,$9          4     add  $t1,$t1,$t1
0x00400004  0x01294820  add $9,$9,$9          5     add  $t1,$t1,$t1
0x00400008  0x11090004  beq $8,$9,0x00000004  6     beq  $t0, $t1, next
0x0040000c  0x1509fffc  bne $8,$9,0xfffffffc  7         bne  $t0, $t1, again
0x00400010  0x01294820  add $9,$9,$9          8     add  $t1,$t1,$t1
0x00400014  0x01294820  add $9,$9,$9          9     add  $t1,$t1,$t1
0x00400018  0x01294820  add $9,$9,$9          10    add  $t1,$t1,$t1
0x0040001c  0x08100000  j 0x00400000          11   next:   j   again

As you can see, each real instruction produces one 32-bit value (the machine code, sometimes loosely called the "opcode"), shown in the "Code" column. The "Address" column says where this value is stored in memory once the executable is loaded and ready to execute. The "Basic" column shows the instructions disassembled back from the machine code, and the last column is the "Source".

Now see how the conditional branches encode the relative jump distance in 16 bits (the upper 16 bits 0x1109 encode beq $8, $9, and the lower 16 bits 0x0004 are a sign-extended 16-bit value saying "how far to jump"). That value is a number of instructions away from the "current position", where current means the address of the following instruction, i.e.

0x0040000c + 0x0004 * 4 = 0x0040001c = target address

*4, because on MIPS every instruction is exactly 4 bytes long, and memory addressing works per byte, not per instruction.

The same goes for the following bne: the upper 16 bits are 0x1509 and the offset is 0xfffc, which is -4. =>

0x00400010 + (-4) * 4 = 0x00400000
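The two calculations above can be checked with a short Python sketch. The helper names `branch_offset` and `branch_target` are made up for this illustration; the addresses and machine code words are the ones from the listing.

```python
def branch_offset(branch_addr, target_addr):
    """16-bit offset field for a beq/bne at branch_addr jumping to target_addr."""
    delta = target_addr - (branch_addr + 4)  # relative to the following instruction
    if delta % 4 != 0:
        raise ValueError("target is not word aligned")
    words = delta // 4
    if not -0x8000 <= words <= 0x7FFF:
        raise ValueError("target too far for a 16-bit offset")
    return words & 0xFFFF  # two's complement, 16 bits

def branch_target(branch_addr, word):
    """Recover the target address from an encoded branch instruction."""
    imm = word & 0xFFFF
    if imm & 0x8000:       # sign-extend the 16-bit field
        imm -= 0x10000
    return branch_addr + 4 + imm * 4

print(hex(branch_offset(0x00400008, 0x0040001C)))  # beq ..., next
print(hex(branch_offset(0x0040000C, 0x00400000)))  # bne ..., again
print(hex(branch_target(0x00400008, 0x11090004)))
print(hex(branch_target(0x0040000C, 0x1509FFFC)))
```

Note that only the distance between the branch and its target matters here, not the absolute addresses.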

The absolute jump uses a different encoding: a 6-bit opcode 0b000010xx (xx are the top two bits of the address field, which share the first byte with the j opcode; in this example they are zero) followed by the rest of the 26-bit field, which holds the address divided by four, 0x0100000. Every instruction must start at a 4-byte-aligned address, so it would be a waste to encode the two least significant bits; they are always 00. 0x100000 * 4 = 0x00400000. On MIPS the j instruction supplies bits 2-27 of the target, bits 0-1 are zero, and bits 28-31 are copied from the address of the instruction after the jump. That lets code run anywhere in the 4 GiB address range, but a single j cannot leave the current 256 MiB "bank" (upper 4 bits of pc); to jump between banks you load the full address into a register and use jr.
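As a sketch of the j encoding, assuming the standard MIPS32 rule that the upper 4 bits of the target come from the address of the instruction after the jump, the following Python helpers (hypothetical names `encode_j` and `decode_j`) reproduce the last line of the listing:

```python
J_OPCODE = 0x02  # 0b000010

def encode_j(jump_addr, target_addr):
    """Machine code word for a j at jump_addr going to target_addr."""
    if target_addr % 4 != 0:
        raise ValueError("target is not word aligned")
    # The top 4 bits are not encoded; they must match those of jump_addr + 4.
    if ((jump_addr + 4) & 0xF0000000) != (target_addr & 0xF0000000):
        raise ValueError("target is outside the current 256 MiB region")
    return (J_OPCODE << 26) | ((target_addr >> 2) & 0x03FFFFFF)

def decode_j(jump_addr, word):
    """Target address of an encoded j located at jump_addr."""
    upper = (jump_addr + 4) & 0xF0000000   # bits 28-31 from the next instruction
    return upper | ((word & 0x03FFFFFF) << 2)

print(hex(encode_j(0x0040001C, 0x00400000)))
print(hex(decode_j(0x0040001C, 0x08100000)))
```

Unlike the branch offset, this field depends on the absolute target address, so it changes whenever the code is placed somewhere else.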

Anyway, if you say that again: is at 0x10010020, all of these can be recalculated to produce functional code ready to be executed at 0x10010020 (only the j needs care: you have to know for sure how the full target address is composed, i.e. where the upper 4 bits come from).
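A rough recalculation for again = 0x10010020 might look like the Python sketch below. It assumes the instruction layout from the exercise (two add instructions, beq, bne, three more add instructions, then next: j), the register fields 0x1109 and 0x1509 taken from the listing, and that the upper 4 bits of the j target are taken from the address after the jump. The helper names are invented for this example.

```python
AGAIN = 0x10010020          # address of the label, as given in the exercise
BEQ_ADDR = AGAIN + 2 * 4    # third instruction
BNE_ADDR = AGAIN + 3 * 4    # fourth instruction
NEXT = AGAIN + 7 * 4        # eighth instruction, the j

def branch_word(upper16, branch_addr, target_addr):
    """Combine opcode/register bits with the 16-bit word offset."""
    offset = (target_addr - (branch_addr + 4)) // 4
    return (upper16 << 16) | (offset & 0xFFFF)

def jump_word(target_addr):
    """j opcode plus bits 2-27 of the target address."""
    return (0x02 << 26) | ((target_addr >> 2) & 0x03FFFFFF)

beq = branch_word(0x1109, BEQ_ADDR, NEXT)    # beq $t0, $t1, next
bne = branch_word(0x1509, BNE_ADDR, AGAIN)   # bne $t0, $t1, again
j = jump_word(AGAIN)                         # j again

# The j is only valid if the target shares its upper 4 bits with NEXT + 4.
same_region = ((NEXT + 4) >> 28) == (AGAIN >> 28)

print(f"beq 0x{beq:08x}  bne 0x{bne:08x}  j 0x{j:08x}  same region: {same_region}")
```

The two branch words come out identical to the ones in the listing at 0x00400000, because they only encode distances; only the j word differs.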

BTW, a real MIPS CPU does delayed branching (i.e. the instruction right after a branch or jump is always executed while the condition is being evaluated, and the jump takes effect only after it). The branch offset is measured from that following instruction, which is why the calculations above start from the address after the branch. Code written for real MIPS would therefore put the beq ahead of the second add, so that the add fills the delay slot, and the offset would be recomputed from the new position (0x0005 instead of 0x0004). :) Simple, eh? If that doesn't make sense to you, check the MARS settings (emulation of delayed branching is switched OFF by default, to not confuse students) and search for a better explanation. Nice little funny CPU it is, that MIPS. :)

CL.
Ped7g
  • So then when in the question it says "Assume that the label again is located at memory location 0x10 01 00 20.", what does it mean? Thanks for your explanation, it cleared up a lot. – ali.salemwala Mar 05 '17 at 22:51
  • It means that the symbol `again` is alias for value `0x10010020`, and that 32 bit value is address, where the next byte defined in source after label `again:` will be loaded. I will add to answer short example of listing... MIPS, hmm, not sure if I will manage MIPS, will try with MARS, if not, I will add x86, as the principle is same everywhere. – Ped7g Mar 05 '17 at 23:00
  • actually I myself am using MARS as well. So if the question says 0x10010020 is the address where `again` points but text instructions cannot be held there according to the memory map, the question has a typo? – ali.salemwala Mar 05 '17 at 23:10
  • I'm not sure what you mean by "cannot be held there", maybe that question doesn't conform to the machine definition you are used to, and it's meant about machine which has a memory mapped there? `0x10010020` is valid 32bit value, somewhere after first 256MiB of memory, so for machine with 512MiB of RAM, and flat memory mapping, it may be valid address. Or for machine with CPU which allows for virtual memory mapping, then the memory map is defined by the OS, not some hard HW limit. – Ped7g Mar 05 '17 at 23:16
  • I did check the MARS a bit more.. so `0x10010000` is common address for .data segment, and the memory map can not be changed freely (there are 3 different presets). Also if I try to assemble instructions in `.data` segment, no machine code is produced (that's quirk of MARS simulator, ordinary assemblers don't work like that, they would happily produce machine code from those instructions inside .data segment) ... So a typo is possible. Then again, if you would have completely different MIPS OS, which would use different memory map, then the question is valid and solvable. – Ped7g Mar 05 '17 at 23:27
  • @Ped7g I don't understand the difference between the offset in the "Code" column (the last 16 bits) and the address field in the "Basic" column. In your example they are the same, but in my homework they gave us this code segment: https://i.imgur.com/vnTxN8Z.jpg and asked us for the address of "loop". How can I determine it? – Avishay28 Nov 23 '17 at 14:22
  • @Avishay28 and what is different in your code? I see op-code `14210009` for `bne $1,$1,9` ... i.e +9 vs +9. So that's 9 words ahead from current next instruction. Next instruction is at `0x400014`, +9*4 = `0x400038` = address of `loop`. I don't understand what is "different" in my example. Can you give some detailed view, what is it? – Ped7g Nov 23 '17 at 14:44
  • @Ped7g look at line 34, the J operation. I need to determine loop's address only by this line, by using the method you described, I get wrong result:0040000c + 000e * 4 = 0x00400044 – Avishay28 Nov 23 '17 at 14:46
  • @Avishay28 that's because `j` calculates target address differently than branch jumps, not using address of next instruction, but only 4 top bits of `PC`, and taking 26 bits from instruction op-code (not 16). This answer looks to be nicely done (first google hit on "MIPS j instruction address calculation", try it sometimes, it's great search engine): https://stackoverflow.com/a/9795721/4271923 – Ped7g Nov 23 '17 at 15:34
  • @Avishay28 now I see I have it even in my own answer above explained *"The absolute jump uses different encoding,"* ... ??? Was it "too long to read" for you, or is the description wrong/difficult to understand? – Ped7g Nov 23 '17 at 15:36
  • @Ped7g I'm not sure why you're so angry. Thanks anyway, I'll check over what you sent. – Avishay28 Nov 23 '17 at 15:42
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/159694/discussion-between-ped7g-and-avishay28). – Ped7g Nov 23 '17 at 15:53
2

Each label corresponds to a unique address in memory. So in your example, and in agreement with what you stated, if the nop instruction exists at 0x00400000 then again will correspond (not point--more on that in a second) to that same address.

Labels can exist in both the text and data segments. However, in your example the label is in the .text segment. So, it represents the address of an instruction as opposed to a variable.

Here's the important distinction:

Labels are a feature of most assembly languages that makes writing assembly easier for humans. However, it's important to remember that assembly is not the final form of code. In other words, in the binary representation your label won't be much of a label anymore.

So, this is what will happen:

The assembler will work out the memory address of the instruction each label is attached to. Let's keep our running example of 0x00400000. Then, in each jump instruction it will take this address and encode it (or, for branches, an offset derived from it) into the instruction word in place of the label. Poof, no more labels and definitely no pointers (which would imply we would have another place in memory that is storing a memory address).

Of course, the memory address itself corresponds to a spot in the text segment in your example because it matches to an instruction.

Simply stated, labels exist to make our lives easier. However, once they're assembled they're converted to the actual memory address of the instruction/variable that they've labeled.

Faris Sbahi
  • So then when in the question it says "Assume that the label again is located at memory location 0x10 01 00 20.", what does it mean? Thanks for your explanation, it cleared up a lot. – ali.salemwala Mar 05 '17 at 22:52
  • It means that the label is labeling the instruction that is stored at that memory location. And you're very welcome! – Faris Sbahi Mar 05 '17 at 22:53
  • But then according to the memory map I've been provided, that address (0x10010020) is in the Dynamic Data region (for Stack and Heap). Text is in a different range. Does that mean there's a problem with the question, or can instructions be stored in the stack/heap? – ali.salemwala Mar 05 '17 at 22:56
  • Instructions cannot be stored in the dynamic data segment. This must be a typo. – Faris Sbahi Mar 05 '17 at 23:05
0

The conversion of a label to its corresponding address is done by the assembler or MIPS simulator you are using. For example, MARS is a MIPS simulator with a built-in assembler, so MARS does that conversion and finds the address of the label for you.