TL;DR

Does [memloc] refer to the value or to the address? If it refers to just one of them, why does it seem to work as both a value and an address? (see code below, lines 4 and 5)

Full question...

Sorry for the long question. I'm confused by label dereferencing in NASM. Take this example:

01| section .text
02| ; exiting the program with exit code "15"
03|
04| mov     [memloc], 15 ; move 15 into memloc
05| push    [memloc]     ; push memloc on stack
06| mov     eax, 1       ; prepare exit syscall
07| call    kernel       ; invoke syscall
08|
09| section .data
10| memloc: dd 0    ; let's say this is at address 0x1234

When I run it, it exits with code 15. It works!
...but why? Shouldn't memloc be without brackets on line 4, where mov presumably expects a destination?

For example:
The mov instruction at line 04 moves the value 15 to the ADDRESS of memloc:

mov     [memloc], 15 ; move 15 into mem @memloc

But line 05 pushes the VALUE stored at memloc onto the stack:

push    [memloc]     ; push value @memloc on stack

So, is [memloc] the value (15) or the address (0x1234)? What happens in theory if you mov memloc, 15 instead?

Thank you in advance.

James M. Lay
  • If the operation treats its operands differently based on whether you put braces around it, then I guess my question is 'how does the machine code differentiate between addresses and values?' – James M. Lay Sep 05 '15 at 10:50
  • It doesn't, which is the reason you have to put braces around it. – Siguza Sep 05 '15 at 11:04

2 Answers

There is more than one version of the mov instruction. If the assembler (NASM) sees square brackets around memloc, it generates one form of mov; if it doesn't see them, it generates another form.

Consider the following instructions:

mov edx, memloc
mov edx, [memloc]
mov [memloc], edx

All three move to or from the same register, EDX, but the assembler (NASM) generates completely different opcodes for them.

The 1st mov is encoded with 5 bytes 0xBA, ?, ?, ?, ?
The 2nd mov is encoded with 6 bytes 0x8B, 0x15, ?, ?, ?, ?
The 3rd mov is encoded with 6 bytes 0x89, 0x15, ?, ?, ?, ?

The four ?'s stand for the 32-bit address assigned to memloc, stored least-significant byte first.
Using the example address (0x1234) from your question, this becomes:

The 1st mov is encoded with 5 bytes 0xBA, 0x34, 0x12, 0x00, 0x00
The 2nd mov is encoded with 6 bytes 0x8B, 0x15, 0x34, 0x12, 0x00, 0x00
The 3rd mov is encoded with 6 bytes 0x89, 0x15, 0x34, 0x12, 0x00, 0x00
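
As a rough cross-check of these byte sequences, a small Python sketch can rebuild them from the address. The function names are made up for illustration, and it assumes 32-bit code with a plain 32-bit absolute address, as in the encodings listed here:

```python
import struct


def encode_mov_edx_imm32(value: int) -> bytes:
    """mov edx, memloc: opcode 0xBA, then the number itself as an immediate."""
    return bytes([0xBA]) + struct.pack("<I", value)


def encode_mov_edx_from_mem(addr: int) -> bytes:
    """mov edx, [memloc]: opcode 0x8B, ModRM 0x15, then the address to read."""
    return bytes([0x8B, 0x15]) + struct.pack("<I", addr)


def encode_mov_mem_from_edx(addr: int) -> bytes:
    """mov [memloc], edx: opcode 0x89, ModRM 0x15, then the address to write."""
    return bytes([0x89, 0x15]) + struct.pack("<I", addr)


MEMLOC = 0x1234  # example address taken from the question

if __name__ == "__main__":
    listing = [
        ("mov edx, memloc", encode_mov_edx_imm32(MEMLOC)),
        ("mov edx, [memloc]", encode_mov_edx_from_mem(MEMLOC)),
        ("mov [memloc], edx", encode_mov_mem_from_edx(MEMLOC)),
    ]
    for text, code in listing:
        # Print each instruction next to its bytes in hex.
        print(f"{text:<20}", " ".join(f"{b:02X}" for b in code))
```

The same four address bytes appear in all three encodings; only the leading opcode bytes tell the CPU whether to treat them as a number to load or as a location to read from or write to.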

Sep Roland
  • This is definitely what was causing my confusion. I was thinking that the brackets denote a change in the reference, when they really change the operation. Thank you for your answer! – James M. Lay Sep 24 '15 at 21:01

> What happens in theory if you mov memloc, 15 instead?

NASM would not accept this because you can't move an immediate value (15) into another immediate value (memloc).

> The mov instruction at line 04 moves the value 15 to the ADDRESS of memloc:

The instruction at line 4 does not change the address of memloc.
It writes 15 into the memory at that address, just as line 5 reads the value stored there. Both lines work on the contents of memory at memloc.

> So, is [memloc] the value (15) or the address (0x1234)?

[memloc] is the value stored at that address (15 once line 4 has run)
memloc is the address 0x1234 (which you can't change after it has been set by line 10 of your code)
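
One way to picture the difference is a toy model in Python, where memory is a byte array and the label is nothing more than an index into it. This is only a sketch: the memory size, the list used as a stack, and the function name are invented for illustration, and it assumes line 4 stores 4 bytes because memloc is declared with dd:

```python
import struct

MEMLOC = 0x1234  # the label: just a number, fixed once the program is laid out


def run_lines_4_and_5():
    memory = bytearray(0x2000)  # toy flat memory, initially all zero
    stack = []                  # toy stack (a real push also adjusts ESP)

    # line 4: mov [memloc], 15 -> store into the 4 bytes AT address MEMLOC
    struct.pack_into("<I", memory, MEMLOC, 15)

    # line 5: push [memloc] -> load the 4 bytes AT address MEMLOC, then push
    (value,) = struct.unpack_from("<I", memory, MEMLOC)
    stack.append(value)

    return memory, stack


if __name__ == "__main__":
    memory, stack = run_lines_4_and_5()
    print("address used by both lines:", hex(MEMLOC))
    print("bytes at that address:", memory[MEMLOC:MEMLOC + 4].hex())
    print("top of stack:", stack[-1])
```

Both lines hand the same number, MEMLOC, to the memory access. Line 4 uses it as the place to write and line 5 as the place to read, and MEMLOC itself never changes.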

Fifoernik
  • Okay, but the `mov` instruction must know the _address_ of where it's putting the information. Just like if I told you to put mail in a mailbox, you'd need to know the mailing address. I couldn't give you the mail that was in the box and expect you to magically find it, right? (assume none of the mail has the address on it, lol) – James M. Lay Sep 05 '15 at 20:38
  • @JamesM.Lay There is more than one version of the `mov` instruction. If the assembler (NASM) sees square brackets around *memloc*, it generates one form of `mov`; if it doesn't see them, it generates another form. – Sep Roland Sep 06 '15 at 20:09
  • @user3144770 So, when you `mov [mem], eax`, the address is still passed, but to an opcode that expects an offset? So instead of viewing `mem` as a "find and replace" target, I should think of it more like a C pointer, where `mov [mem],eax` is like `*mem = eax` and `push [mem]` is like `push(*mem)`? – James M. Lay Sep 07 '15 at 02:02
  • (...if there were an analogous function to push in c that gave you programmatic access to the stack.) – James M. Lay Sep 07 '15 at 02:03
  • @user3144770 if you post that as an answer, I'll accept it. – James M. Lay Sep 09 '15 at 20:01