This program takes a string as a command-line argument.
I'm confused by the assembly code shown below, specifically line 6 (add rax, rdx). This is what I understood from my research:
- rbp-48 is a pointer to the stack address where argv is stored. (argv itself is the address of the start of the argv array.)
- The rax register now stores the address of the argv array.
- We then add 8 bytes to rax, so rax now holds the address of argv[1]. (I understand that argv[1] itself stores another address, which points to a string.)
- We then read the value stored in argv[1] and store it in the rdx register, so rdx now points to the address where the string begins.
- We then move the counter variable i, stored at [rbp-24], into the eax register.
- We then have a cdqe instruction, which I believe is not relevant.
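The steps above can be sketched in C roughly as follows. This is only a model under assumptions, not compiler output: struct regs and first_five_steps are made-up names, the two fields stand in for the rax and rdx registers, and a pointer is assumed to be 8 bytes as on x86-64. The sketch stops right before the add rax, rdx instruction.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two registers involved (illustrative only). */
struct regs {
    uintptr_t rax;
    uintptr_t rdx;
};

/* Mirrors the first five instructions of the listing, one statement per step. */
struct regs first_five_steps(char **argv, int i)
{
    struct regs r;

    /* mov rax, QWORD PTR [rbp-48]  : rax = argv */
    r.rax = (uintptr_t)argv;

    /* add rax, 8                   : rax = address of argv[1]
       (sizeof(char *) is 8 on x86-64, which is the assumption here) */
    r.rax += sizeof(char *);

    /* mov rdx, QWORD PTR [rax]     : rdx = argv[1], the start of the string */
    r.rdx = (uintptr_t)*(char **)r.rax;

    /* mov eax, DWORD PTR [rbp-24]  : eax = i (eax is the low half of rax)
       cdqe                         : sign-extends eax into the full rax */
    r.rax = (uintptr_t)(intptr_t)i;

    return r;
}
```

Calling first_five_steps(argv, i) and comparing the rdx field with (uintptr_t)argv[1] is one way to check the third step on a given machine.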
This is where I get confused. If I wanted to access the first character of argv[1] and store it in the eax register, I would expect the assembly to do something like:
mov eax, BYTE PTR [rdx]
And if I needed to access the second character of argv[1] and store it in the eax register, I would expect the assembly to do something like:
mov eax, BYTE PTR [rdx+1]
But instead, I see the compiler does the following:
add rax, rdx
- As I read it, this adds the address in memory where the string begins (rdx) to the address in memory where the pointer to the start of the string is stored (rax), and saves the result in rax.
I cannot understand how this instruction makes rax point to any character in argv[1].
Below are the C code and the assembly code corresponding to the loop body:
#include <string.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int sum = 0;
    for (int i = 0; i < strlen(argv[1]); i++) {
        sum += (int)argv[1][i];
    }
    return 0;
}
Assembly
mov rax, QWORD PTR [rbp-48]
add rax, 8
mov rdx, QWORD PTR [rax]
mov eax, DWORD PTR [rbp-24]
cdqe
add rax, rdx
movzx eax, BYTE PTR [rax]
movsx eax, al
add DWORD PTR [rbp-20], eax
add DWORD PTR [rbp-24], 1