Assembly: difference between [var] and var

Question

I'm learning assembly and have reached the point where I have no clue about the difference between [variable] and variable. The tutorials say both are pointers, so what is the point of this? And why do I have to use a type identifier before []? My assembler: NASM, x86_64, running on Linux (Ubuntu).

Have you used C before? Or other systems programming languages? (if we get an idea of your preexisting knowledge, it'll help us give an answer you can understand) — Cauterite, Sep 13 '16 at 16:10
What assembler are you using? What type of assembly are you learning? There are many types. — Jose Manuel Abarca Rodríguez, Sep 13 '16 at 16:11
I can program in Pascal and C++, and I know how to use pointers. — Maximilian Wittmer, Sep 13 '16 at 16:12
I am using the NASM assembler in the 64-bit version. Sorry, I didn't say :) — Maximilian Wittmer, Sep 13 '16 at 16:12
Possible duplicate of [Brackets on registers in Intel x86 assembly syntax](http://stackoverflow.com/questions/16156933/brackets-on-registers-in-intel-x86-assembly-syntax) — Michael Petch, Sep 13 '16 at 18:12
@MichaelPetch: This one seems to be more about symbol names instead of registers. And the answers there don't mention `mov eax, sym` as mov-immediate. They're highly related, but I think they both need to link to each other for a complete answer to both issues. I think this question might be the better canonical question because of simplicity, but IDK. I mentioned some of this NASM vs. MASM stuff in my [x86 addressing modes answer](http://stackoverflow.com/a/34058400/224132), which does cover register-direct (non-memory) operands as well. — Peter Cordes, Sep 13 '16 at 18:43
The issue with your answer (in the link above) is that it barely touches on how MASM handles brackets (Ross Ridge's answer, linked in the comment below one of the answers, is much more detailed). Your answer was actually on a question tagged MASM, but you barely talked about MASM. So I don't think the way that other question is worded makes your answer there as on-topic as it should have been. A separate new question with an answer borrowing from your answer and Ross's would be more ideal. That might make it too broad, but I think a reasonable answer is possible for most assemblers. — Michael Petch, Sep 13 '16 at 19:12

score 15 · Accepted Answer · edited Oct 29 '19 at 01:29

In x86 Intel syntax, [expression] means the content of memory at address expression.
(The exception is MASM: when expression is a numeric literal or an equ constant with no registers, it is still an immediate.)

What expression means without brackets depends on the assembler you are using.

NASM-style (NASM, YASM):

mov eax,variable      ; moves address of variable into eax
lea eax,[variable]    ; equivalent to the previous one (LEA is the exception: it computes the address without accessing memory)
mov eax,[variable]    ; loads content of variable into eax

MASM-style (also TASM and even GCC/GAS .intel_syntax noprefix):

mov eax,variable      ; load content of variable (for lazy programmers)
mov eax,OFFSET variable   ; address of variable
lea eax,[variable]    ; address of variable
mov eax,[variable]    ; content of variable

GAS (AT&T syntax): It's not Intel syntax, see the AT&T tag wiki. GAS also uses different directives (like .byte instead of db), even in .intel_syntax mode.

In all cases, variable is an alias for a symbol marking the particular place in memory where the label appeared. So:

variable1  db  41
variable2  dw  41
label1:

puts three symbols into the symbol table: variable1, variable2 and label1.

When you use any of them in code, like mov eax,<symbol>, the assembler has no information about whether the symbol was defined by db, by dw, or as a label, so it will not warn you when you do mov [variable1],ebx (which overwrites 3 bytes beyond the single byte that was defined).

A symbol is simply an address in memory.

(Except in MASM, where the db or dd after a label in a data section does associate a size with that "variable name".)
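To make the silent overwrite concrete, here is a minimal C sketch of the same effect. The names `data` and `store_dword` and the 8-byte layout are assumptions made up for illustration: byte 0 models `variable1 db 41`, and a 4-byte `memcpy` stands in for `mov [variable1],ebx`.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a data section: byte 0 plays the role of
   "variable1 db 41"; bytes 1..7 are whatever was assembled after it. */
uint8_t data[8] = {41, 0xAA, 0xBB, 0xCC, 0xDD, 0, 0, 0};

/* Models "mov [variable1], ebx": a 32-bit store always writes 4 bytes
   starting at the address, no matter how the symbol was declared. */
void store_dword(uint8_t *address, uint32_t value)
{
    memcpy(address, &value, sizeof value);
}
```

After `store_dword(&data[0], 0)`, the three neighbouring bytes `0xAA`, `0xBB` and `0xCC` are gone as well, while `data[4]` still holds `0xDD`.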

In most assemblers, a type identifier (size specifier) is required only when the operand size can't be deduced from the instruction operands themselves.

mov [ebx],eax ; obviously 32 bits are stored, because eax is 32b wide
mov [ebx],1   ; ERROR: how "wide" is that immediate value 1?
mov [ebx],WORD 1 ; NASM syntax (16 bit value, storing two bytes)
mov WORD [ebx],1 ; NASM syntax (16 bit value, storing two bytes)
mov WORD PTR [ebx],1 ; MASM/TASM syntax
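Why the size matters can be sketched in C, where the width of a store is fixed by the type. This is only an analogy under assumed names (`store_word`, `store_dword`, `bytes_changed` are hypothetical helpers): storing the same value 1 touches a different number of bytes depending on the chosen width, which is exactly what the assembler cannot guess from `mov [ebx],1`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Models "mov word [ebx], 1": writes exactly 2 bytes. */
void store_word(uint8_t *address, uint16_t value)
{
    memcpy(address, &value, sizeof value);
}

/* Models "mov dword [ebx], 1": writes exactly 4 bytes. */
void store_dword(uint8_t *address, uint32_t value)
{
    memcpy(address, &value, sizeof value);
}

/* Counts how many bytes no longer hold the 0xFF fill pattern. */
size_t bytes_changed(const uint8_t *buffer, size_t length)
{
    size_t changed = 0;
    for (size_t i = 0; i < length; i++)
        if (buffer[i] != 0xFF)
            changed++;
    return changed;
}
```

With two buffers pre-filled with `0xFF`, `store_word(a, 1)` changes 2 bytes and `store_dword(b, 1)` changes 4.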

GAS `.intel_syntax noprefix` is pretty much like MASM for this, where you use OFFSET to get the address as an immediate, and a bare symbol name is a load. (Even if it was a `.set sym, 1234`, because there's no magic based on what kind of symbol it is) — Peter Cordes, Sep 13 '16 at 16:46
Also, did you search for duplicates before answering this? There's no way this is the first time this question has been asked and answered :( If there are older versions with worse answers, we can close them as dups of this. — Peter Cordes, Sep 13 '16 at 16:48
@PeterCordes: No. I tried now, and failed big time, searching for [] characters with SO search is sort of futile (although I have no idea about any extended features of the SO search engine, maybe it's possible somehow). — Ped7g, Sep 13 '16 at 16:51
Apparently you [can search for `code:[]`](http://meta.stackoverflow.com/questions/331867/why-arent-we-told-we-can-use-special-characters-in-search) to look for it inside code blocks. Maybe I'll come back to this and look for dups myself, probably searching for "brackets" and stuff. — Peter Cordes, Sep 13 '16 at 16:53
In MASM the brackets in your examples are actually meaningless. Brackets are ignored by MASM unless they surround register names. — Ross Ridge, Sep 13 '16 at 17:27
A more thorough discussion of the _MASM_ syntax and its peculiarities can be found in @RossRidge answer to a similar (but not duplicate) question: http://stackoverflow.com/a/25130189/3857942 — Michael Petch, Sep 13 '16 at 18:06

Ryan B. · Answer 2 · 2016-09-13T16:23:51.363

10

A little example using registers and pointers:

mov eax, 10 means: move the value 10 into the register EAX. Here EAX is used only to store something. Whatever EAX contained before doesn't matter to the programmer, since it is overwritten anyway.

mov [eax], 10 means: move the value 10 to the memory address stored in the EAX register. (A real assembler also needs an operand size here, e.g. mov dword [eax], 10 in NASM, because neither operand implies one.) Here the value stored in EAX matters a lot, since it's a pointer: we have to look in EAX to see what it contains, then use that value as the address to access.

Two steps are therefore needed when you use a pointer:

1. Go to EAX and see what value it contains (for example, EAX = 0xBABA).
2. Go to the address pointed to by EAX (in our case 0xBABA) and write 10 there.

Of course, pointers don't have to be held in registers; this little example is just to explain how it works.
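For readers coming from C or C++, the two cases map onto a plain assignment and a store through a pointer. A minimal sketch, with hypothetical function names chosen for illustration:

```c
#include <stdint.h>

/* Models "mov eax, 10": the register itself receives the value. */
uint32_t load_immediate(void)
{
    return 10;
}

/* Models "mov dword [eax], 10": the register's content is used as an
   address (step 1) and 10 is written to the memory there (step 2). */
void store_through(uint32_t *eax_as_pointer)
{
    *eax_as_pointer = 10;
}
```

Given `uint32_t cell = 0;`, calling `store_through(&cell)` leaves `cell` equal to 10, while the pointer itself is unchanged.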

edited Sep 13 '16 at 16:23

answered Sep 13 '16 at 16:17

Ryan B.


OK, that was helpful. Thanks a lot. But for the integer-to-ASCII problem (described at the top): have you got a clue? – Maximilian Wittmer Sep 13 '16 at 16:20

This is a QA site, not a forum. This means that usually there is one question per thread. I highly suggest that you edit your question and remove the second question, and then open a new thread with the new question. – Ryan B. Sep 13 '16 at 16:21

score 2 · Answer 3 · answered Sep 13 '16 at 16:29

Since you already know C++, I'm going to answer by showing you what the C equivalents of these expressions are.

When you write

[variable]

in assembly, it's equivalent to

*variable

in C. That is, treat variable as a pointer and dereference that pointer: get the value the pointer is pointing to.

Similarly, the 'type identifiers' are like casting the pointer to a different type (the examples use MASM's ptr keyword; NASM omits it and writes dword [variable]):

ASM:
    dword ptr [variable]
C:
    *((uint32_t*) variable)

ASM:
    word ptr [variable]
C:
    *((uint16_t*) variable)

I hope this helps you understand the meaning of these expressions.
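The effect of the size keyword on a load can be tried out directly in C. This is a sketch under assumptions: `load_dword` and `load_word` are hypothetical helpers, and they use `memcpy` instead of the pointer casts shown above only to stay clear of C's aliasing rules; the meaning is the same.

```c
#include <stdint.h>
#include <string.h>

/* Models "mov eax, dword [variable]": reads 4 bytes at the address. */
uint32_t load_dword(const void *address)
{
    uint32_t value;
    memcpy(&value, address, sizeof value);
    return value;
}

/* Models "mov ax, word [variable]": reads only 2 bytes at the same address. */
uint16_t load_word(const void *address)
{
    uint16_t value;
    memcpy(&value, address, sizeof value);
    return value;
}
```

With `uint32_t variable = 0x11223344;`, `load_dword(&variable)` returns the full value, while on little-endian x86 `load_word(&variable)` returns `0x3344`, the two bytes stored at the lowest addresses.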

(this section refers to an addendum that has since been deleted from the original question)

I'm not entirely sure what problem you're experiencing with 'conversion to ASCII', but I suspect you're just confused by how the value is rendered in the output.

For example if you have code like this:

myInteger db 41
mov AL, byte ptr [myInteger]   ; MASM syntax; in NASM: mov al, [myInteger]

The mov will copy the value 41 from memory into the AL register. The number 41 happens to be the ASCII code of the ) character, but this doesn't change anything. Whether the value is interpreted as an ASCII character or as an integer is up to you, because it is the same value.
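The same point can be shown in C, where one byte is formatted two ways without being changed. A small sketch; `describe_byte` is a hypothetical helper, not something from the question:

```c
#include <stdio.h>

/* Formats one byte twice: as a decimal integer (%d) and as an ASCII
   character (%c). Returns what snprintf returns, i.e. the length of
   the formatted text. */
int describe_byte(unsigned char value, char *out, size_t capacity)
{
    return snprintf(out, capacity, "%d '%c'", value, value);
}
```

For the value 41 this writes `41 ')'` into the buffer: the byte is identical in both cases, only the formatting differs.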

answered Sep 13 '16 at 16:29

Cauterite


"Whether the value is interpreted as an ASCII character or as an integer is up to you, because they are the same value." Yeah, but I want the user to see the '41'. How can I do this? BTW: good answer – Maximilian Wittmer Sep 13 '16 at 16:31
@TheFrenchPlaysHdMicraftn: You must produce two (or more) bytes, which will hold ASCII values for characters '4' and '1' (which happens to be 0x34 and 0x31). So one of the possibilities is to take `41` and keep dividing it by 10 until zero, building the string from end with the remainders (`OR remainder,0x30` to get ASCII digit). – Ped7g Sep 13 '16 at 16:35
@TheFrenchPlaysHdMicraftn you'll need to use a number-to-string conversion function, such as `sprintf` from the C standard library. I'm not sure what the easiest way to do this in assembly is I'm afraid. Whether C library functions are available depends what libraries your assembly program is linked with. Maybe google for "sprintf assembly" (or you can use that algorithm Ped7g mentioned above me) – Cauterite Sep 13 '16 at 16:35
OK. And to calculate with them I don't need to convert them, am I right? – Maximilian Wittmer Sep 13 '16 at 16:37
@TheFrenchPlaysHdMicraftn by "calculate them" do you mean "perform arithmetic with them" ? – Cauterite Sep 13 '16 at 16:41
yes. I want to perform calculation with the values. – Maximilian Wittmer Sep 13 '16 at 16:42
right, yes. to use an integer in calculations you don't need to convert it to a string. – Cauterite Sep 13 '16 at 16:42
@TheFrenchPlaysHdMicraftn you start with value 41. Allocate some memory for output, let's say 8 ASCII characters = 8 bytes of space reserved. You want to end with that buffer set to 0x20 (space) in the first 6 bytes, `0x34` ('4') in the seventh byte and `0x31` ('1') in the eighth. So if at some stage of the conversion you have the value `4` (the remainder after a division by 10), you must convert it to ASCII `0x34` (`52` decimal). When you output something as an ASCII string to the terminal, the values are interpreted through the ASCII table, so `52` is shown as the character '4'. When you do calculations with them, `52` is 52. – Ped7g Sep 13 '16 at 16:43
Is there a way to mark the thread as finished? If yes, how can I do it? I think I got all the responses I need. – Maximilian Wittmer Sep 13 '16 at 16:46
@TheFrenchPlaysHdMicraftn just stop asking. ;) BTW, while doing arithmetic, just make sure the types you use fit your needs. Example: `mov al,100` `mov bl,200` `add al,bl` will end with `44` in `al` (`300` truncated to 8 bits). Going to the 16-bit register variants is enough for 300: `mov ax,100` `mov bx,200` `add ax,bx` => `ax` is now `300` (`al` is `44` (! yes, 8 bits), `ah` is `1` (the missing part from the 8-bit version)). – Ped7g Sep 13 '16 at 17:01
OK, thank you :). Now I can continue my NASM tutorial without worrying that I could be missing something. COMPILER-BUILDING, HERE I COME! – Maximilian Wittmer Sep 13 '16 at 17:07
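The divide-by-10 conversion described in the comments above can be sketched in C before attempting it in assembly. This is one possible shape, not the only one: the function name `u32_to_decimal` and its buffer interface are assumptions made for illustration, and it writes no leading spaces, unlike the padded 8-byte buffer in the comment.

```c
#include <stddef.h>
#include <stdint.h>

/* Writes value as decimal ASCII digits into buf, NUL-terminated.
   Returns the number of digits, or 0 if buf is too small. */
size_t u32_to_decimal(uint32_t value, char *buf, size_t capacity)
{
    char reversed[10];   /* a 32-bit value has at most 10 decimal digits */
    size_t count = 0;

    do {
        /* remainder 0..9 plus '0' (0x30) gives the ASCII digit */
        reversed[count++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);

    if (capacity < count + 1)
        return 0;

    /* the remainders come out last digit first, so copy them back reversed */
    for (size_t i = 0; i < count; i++)
        buf[i] = reversed[count - 1 - i];
    buf[count] = '\0';
    return count;
}
```

For the value 41 the buffer receives the bytes `0x34`, `0x31` and a terminating zero. In assembly the same loop is typically built around `div` with a divisor of 10, taking the remainder as the next digit.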
