
I've been following this tutorial but I got stuck. He starts explaining at about 6:30.
There is a for loop in that episode that looks like the code below. I got lost during the explanation, just as he said. I rewatched it a couple of times and there is still one thing I don't understand. This is the code, along with the notes I took while watching the registers.
mov rax, [rbp+arg]
After this line rax = 0x7ffe63c2d498 and arg = 0x7ffe63c2d380. I tried decoding both values but nothing came up, so I'm assuming they are pointers.
add rax, 8
mov rdx, [rax]
After this line rdx = 0x7ffe63c2e09d. I'm assuming this is a pointer again.
mov eax, [rbp+i]
This moves the value of i into eax, for example 0x01.
cdqe
add rax, rdx
Here you add the value of i to the pointer of the string.
movzx eax, byte ptr [rax]
Here you move the character that rax was pointing to into eax.
movsx eax, al
add [rbp+sum], eax
add [rbp+i], 1
Here you add the character to sum and do i++.

My question is: isn't [] supposed to load the value stored at the address into the register? So the first mov loads the value arg points to into rax, and the second loads the value rax points to into rdx. But both of these values are pointers. How come? Is arg a pointer to a pointer to a pointer?

Gregor Trplan
  • Yes, as you can clearly see and should also know, `argv` has type `char*[]` and the compiler decided to copy it to the stack which is accessed through a pointer again. – Jester Feb 20 '18 at 12:58
  • assembler doesn't have data types in the sense that `C` does. There are no pointers in assembler. There are only registers and immediate operands. The instruction dictates how the value of a register or immediate is interpreted by that instruction. – bolov Feb 20 '18 at 13:02
  • Ah so arg is the pointer to the argv array? Then the char* argv[1] pointer is copied into rdx? And then it searches to the index of the char* and gets the character? – Gregor Trplan Feb 20 '18 at 13:04
  • @bolov but you can still think of them as pointers? They point to a value that is accessed with []? – Gregor Trplan Feb 20 '18 at 13:06
  • `arg` is an offset relative to `rbp` which accesses the copy of the `argv` on the stack. – Jester Feb 20 '18 at 13:07
  • @GregorTrplan sort of. Only in the context of an instruction. The instruction gives meaning to the value. For instance `mov rax, rdi`. Here the value of `rdi` is just a number. It doesn't have any meaning. But in `mov eax, [rdi]` here `rdi` is interpreted as an address in memory. – bolov Feb 20 '18 at 13:09
  • Ah so the actual argv is on the stack and you can access it via rbp? But the things after that I wrote in the previous comment are correct? – Gregor Trplan Feb 20 '18 at 13:10
  • @bolov aha thanks for clearing up that. I'm quite new to asm so I don't understand a lot of things. – Gregor Trplan Feb 20 '18 at 13:12

1 Answer


Assembly doesn't have data types in the sense that C does. There are no pointers in assembly. There are only registers, memory locations, and immediate operands. The instruction dictates how the value of an operand is interpreted by that instruction.

but you can still think of them as pointers? They point to a value that is accessed with []?

Sort of. Only in the context of an instruction and only for you.

For instance, take `mov rax, rdi`. Here the value of `rdi` is treated as just a number. For you, who are trying to understand the algorithm, it could mean a counter, or a sum, or an offset, or a pointer. For the instruction, however, it's just a number.

But in `mov eax, [rdi]`, the value of `rdi` is interpreted as an address in memory.

In `lea eax, [rsi + rdi]`, the value of `rsi + rdi` is written in the form of a memory address. But for you this instruction just computes `rsi + rdi`, so it really could mean anything: the sum of a pointer and an offset, or the sum of two integers. That is just the meaning you assign to the values in order to understand the algorithm.

To answer your question: `[OP]` means "the value found in memory at address OP".

`lea eax, [rsi + rdi]` means "load into eax the effective address of the value found in memory at address rsi + rdi", which is just `rsi + rdi` (truncated to 32 bits here, because the destination is `eax`). No memory is accessed.

bolov
  • Ok I kind of understand but I still have some questions. So in `mov eax [rdi]` the `rdi` is interpreted as a memory address and the value it points to is moved into `eax`? Also in the last paragraph something is unclear. Is `lea eax, [rsi + rdi]` the same as `add rsi, rdi` `lea eax rsi` or `lea eax [rsi]`? Because you said it just computes `rsi + rdi`. Doesn't it also treat it as an address and copies the value it points to to `eax`? – Gregor Trplan Feb 20 '18 at 13:29
  • @GregorTrplan `lea` is the one exception. `lea` computes the address and then writes the address to the other operand. No memory is accessed. – fuz Feb 20 '18 at 13:34
  • https://stackoverflow.com/questions/1658294/whats-the-purpose-of-the-lea-instruction – bolov Feb 20 '18 at 13:36
  • I tried to read the question but it kinda makes my head hurt. As I said why not use the add and mov instruction instead of lea? – Gregor Trplan Feb 20 '18 at 13:43
  • @GregorTrplan that is answered very well in the question I linked. Read the 2nd most voted answer. – bolov Feb 20 '18 at 13:44
  • if you want to learn assembly use https://godbolt.org/ a lot. Like a lot a lot. :) – bolov Feb 20 '18 at 13:54
  • @Gregor - In addition to the linked answer: At the time when LEA was introduced it was performed by a separate "address generation unit", freeing up the main part of the CPU to start working on the next instruction. Another bonus. – Bo Persson Feb 20 '18 at 15:02