
I'm currently learning Assembly. I want to write a program which ends up with three variables that point to each other:

0x804a000: 0x804a001  ; 0 -> 1
0x804a001: 0x804a002  ; 1 -> 2
0x804a002: 0x804a000  ; 2 -> 0

According to some other posts, I can retrieve (e.g. for mov):

  • the contents of a variable x using [x]
  • the address of a variable x using x
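
A minimal NASM sketch of that distinction, assuming a 32-bit Linux target and a hypothetical dword variable named x holding 42 (the name and the value are illustrative only):

```nasm
section .data
    x dd 42               ; hypothetical variable: one dword holding 42

section .text
  global _start

_start:
  mov eax, x              ; eax = address of x (a constant fixed at link time)
  mov ebx, [x]            ; ebx = contents of x (42), loaded from memory
  lea ecx, [x]            ; ecx = address of x again, same value as eax
  mov edx, [eax]          ; edx = contents of x, read through the pointer in eax

  mov eax, 1              ; system call number (sys_exit)
  int 0x80                ; call kernel; ebx (42) is passed as the exit status
```

Without brackets the label is just a number (the address); with brackets it becomes a memory operand.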

Here is what I came up with so far:

section .bss
    head resd 3         ; reserve three dwords

section .text
  global _start
    
_start:

  xor eax, eax
  xor ebx, ebx
  xor ecx, ecx          ; counter = 0

  mov eax, head         ; move the address of head into eax: eax -> 0
  mov ebx, eax          ; move the address of head from eax into ebx: ebx -> 0
  add ebx, 2            ; ebx -> 2
  mov [ebx], eax        ; move the value of eax ( = the address of 0 ) into the address in ebx ( = the address of 2)
  
  loop:                 ; first run     second run
    inc ecx             ; eax -> 0      eax -> 1
    mov ebx, eax        ; ebx -> 0      ebx -> 1
    add eax, 1          ; eax -> 1      eax -> 2
    mov [ebx], eax      ; 0 -> 1            1 -> 2
    cmp ecx, 2          ; ecx = 1 < 2   ecx = 2 == 2
  jl loop

  mov eax, head         ; eax points to the first element
    
  mov   eax,1           ; system call number (sys_exit)
  int   0x80            ; call kernel

This should basically:

  0. reserve three dwords, the first of which is at the address in head
  1. load the address of 0 into eax, the address of 2 into ebx
  2. mov [ebx], eax: write the address of 0 into 2 (2 -> 0)
  3. repeat the same for fields 0 and 1: 0 -> 1, 1 -> 2
  4. store the address of head into eax

Now I assemble and run the whole thing using

nasm -f elf -g -F dwarf test.asm
ld -m elf_i386 -o test.out test.o

However, both the values in 0 and 2 are wrong, as I can check using gdb:

gdb test.out
(gdb) b 27 // break after mov eax, head
(gdb) r
(gdb) i r eax
eax     0x804a000    134520832 // eax points to head
(gdb) print *134520832
$1 = 77595137                  // cell 0 does not point to cell 1
(gdb) print *134520833
$2 = 134520834                 // cell 1 does point to cell 2
(gdb) print *134520834
$3 = 134743200                 // cell 2 does not point to cell 0

Where do these wrong values come from?

Maybe this happens because I try to write the entire 32-bit eax into the 16-bit dword? I tried changing the lines to mov [ebx], ax but ended up with the same result.

Another reason I could come up with was that the memory addresses are bigger than a dword, so I tried to use qwords instead but ended up with another wrong result.

I also tried using the lea instruction as suggested in "lea assembly instruction", which led to the same result.

Can someone help me with this? Thanks in advance

Peter Cordes
writzlpfrimpft

2 Answers


A dword is 32 bits in x86, so 4 bytes: it stands for "double word", where a "word" is 16 bits (because x86 evolved out of the 16-bit 8086). And yes, as you discovered, x86 is byte-addressable, like all modern mainstream ISAs.

Also, the answer to your title question would be mov dword [head], head+4 for example. head+4 is evaluated at assemble+link time and turns into a 32-bit immediate operand holding that address, while head turns into a 32-bit displacement holding the other address.
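
As a sketch of that approach for all three cells (assuming the same 32-bit Linux setup and the nasm / ld commands from the question), three immediate stores are enough and no registers are needed for the pointers:

```nasm
section .bss
    head resd 3                   ; three dwords = 12 bytes

section .text
  global _start

_start:
  mov dword [head],   head+4      ; cell 0 -> cell 1
  mov dword [head+4], head+8      ; cell 1 -> cell 2
  mov dword [head+8], head        ; cell 2 -> cell 0

  xor ebx, ebx                    ; exit status 0
  mov eax, 1                      ; system call number (sys_exit)
  int 0x80                        ; call kernel
```

The cells are 4 bytes apart because each one holds a 4-byte address.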

Or you could use a loop like you're doing, but simplify with mov [eax-4], eax to store the address of the current element into the previous element, with add eax,4 to advance the pointer. There is no need to copy to EBX; just use an addressing mode with a constant offset for the memory operand.
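
A sketch of such a loop for three cells, under the same assumptions as the question's build. The label .link is an arbitrary name (chosen to avoid loop, which is also an instruction mnemonic), and closing the ring with a separate store after the loop is one possible choice, not the only one:

```nasm
section .bss
    head resd 3                   ; three dwords = 12 bytes

section .text
  global _start

_start:
  mov eax, head+4                 ; eax -> cell 1
.link:
  mov [eax-4], eax                ; previous cell = address of current cell
  add eax, 4                      ; advance by one dword
  cmp eax, head+12                ; reached one past the last cell?
  jb .link                        ; no: link the next pair

  mov dword [eax-4], head         ; cell 2 -> cell 0 closes the ring

  xor ebx, ebx                    ; exit status 0
  mov eax, 1                      ; system call number (sys_exit)
  int 0x80                        ; call kernel
```

The loop body runs twice (0 -> 1, then 1 -> 2) and leaves eax at head+12, so [eax-4] addresses the last cell for the final store.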

Write it in C and look at compiler output if you want full examples of whole loops / functions. See "How to remove "noise" from GCC/clang assembly output?".

Peter Cordes

Well, after reading some more about this, the solution is pretty obvious. Each address corresponds to a single byte. Since each cell holds a 32-bit (4-byte) address, I need to increase the address in each loop run by 4, not 1:

section .bss
    head resq 4         ; four qwords (32 bytes); only the first 16 bytes are used

section .text
  global _start

_start:

    xor eax, eax
    xor ebx, ebx
    xor ecx, ecx        ; counter = 0

    mov eax, head       ; move the address of head into eax: eax -> 0
    mov ebx, eax        ; ebx -> 0
    add ebx, 12         ; ebx -> 3 (cells are 4 bytes apart)
    mov [ebx], eax      ; 3 -> 0

    loop:               ; first run     second run      third run
    inc ecx             ; counter = 1   counter = 2     counter = 3
    mov ebx, eax        ; ebx -> 0      ebx -> 1        ebx -> 2
    add eax, 4          ; eax -> 1      eax -> 2        eax -> 3
    mov [ebx], eax      ; 0 -> 1        1 -> 2          2 -> 3
    cmp ecx, 3          ; 1 < 3         2 < 3           3 == 3
    jl loop

    mov eax, head       ; eax points to the first element

    mov eax, 1          ; system call number (sys_exit)
    int 0x80            ; call kernel

Addresses are (at least on my setup) 32 bits = 4 bytes wide. My first attempt didn't work because the cells were only 1 byte apart, so each 4-byte store partially overwrote the address written before it. That's why only the second value was correct: it was the last one that was written.
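
The overlap can be traced byte by byte. The sketch below replays the three stores of the first attempt as immediate stores (a simplification of its loop, not the original code) and assumes head lands at 0x804a000, as in the gdb session from the question. x86 is little-endian, so the low byte of each dword is stored first:

```nasm
section .bss
    head resd 3                   ; assumed to land at 0x804a000

section .text
  global _start

_start:
  ; Each store writes 4 bytes, but the targets are only 1 byte apart.
  mov dword [head+2], head        ; bytes +2..+5 = 00 a0 04 08
  mov dword [head],   head+1      ; bytes +0..+3 = 01 a0 04 08 (clobbers +2, +3)
  mov dword [head+1], head+2      ; bytes +1..+4 = 02 a0 04 08 (clobbers +1..+4)

  ; Final bytes +0..+5: 01 02 a0 04 08 08
  ;   dword at +0 = 0x04a00201 = 77595137
  ;   dword at +1 = 0x0804a002 = 134520834  (the only intact pointer)
  ;   dword at +2 = 0x080804a0 = 134743200

  xor ebx, ebx                    ; exit status 0
  mov eax, 1                      ; system call number (sys_exit)
  int 0x80                        ; call kernel
```

These are exactly the three values gdb printed in the question.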

writzlpfrimpft
  • What ever happened to "I want to write a program which ends up with **three** variables that point to each other" This answer now deals with 4 variables! And why do you reserve 4 **Q**words instead of 3 **D**words? – Sep Roland Dec 27 '20 at 16:39