I am reading Chapter 3 of the textbook "Computer Systems: A Programmer's Perspective", which introduces the x86-64 instruction set architecture (ISA). The chapter contains several examples of assembly code compiled from C programs. Two of these examples raised questions for me about how addresses are computed.
First, suppose the starting address of integer array E and integer index i are stored in registers %rdx and %rcx, respectively. The following table shows an assembly-code implementation of each expression:
Expression | Type | Value | Assembly code |
---|---|---|---|
E | int * | XE | movq %rdx, %rax |
E[0] | int | M[XE] | movl (%rdx), %eax |
&E[2] | int * | XE+8 | leaq 8(%rdx), %rax |
&E[i]-E | long | i | movq %rcx, %rax |
Why is the type of the expression &E[i]-E long instead of int *? And why is its value i instead of 4i?
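To make the first question concrete, here is a minimal C sketch of the expression. The helper names `elem_diff` and `byte_diff` are my own, not from the book. As far as I understand, subtracting two pointers in C yields a `ptrdiff_t` counted in elements, while casting both pointers to `char *` first gives the distance in bytes.

```c
#include <stddef.h>

/* Hypothetical helper (not from the book): distance between &E[i] and E
 * as C computes it, i.e. in units of int elements. */
long elem_diff(int *E, long i) {
    ptrdiff_t d = &E[i] - E;  /* pointer - pointer has type ptrdiff_t */
    return (long)d;
}

/* Hypothetical helper: the same distance measured in bytes, obtained by
 * subtracting char pointers instead of int pointers. */
long byte_diff(int *E, long i) {
    return (long)((char *)&E[i] - (char *)E);
}
```

Assuming a 4-byte int, calling these with i = 3 should give 3 from `elem_diff` and 12 from `byte_diff`.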
Second, consider the following union:
typedef union {
    struct {
        long u;
        short v;
        char w;
    } t1;
    struct {
        int a[2];
        char *p;
    } t2;
} u_type;
You write a series of functions of the form
void get(u_type *up, type *dest) {
    *dest = expr;
}
with different access expressions expr, and with the destination data type type set according to the type associated with expr. Suppose that in these functions up and dest are loaded into registers %rdi and %rsi, respectively. The following table shows the assembly code for two such expressions:
Expression | Type | Assembly code |
---|---|---|
up->t1.u | long | movq (%rdi), %rax \n movq %rax, (%rsi) |
&up->t1.w | char * | addq $10, %rdi \n movq %rdi, (%rsi) |
Considering the expression &up->t1.w, I know that we want to assign the address of the char w to dest, and that w is at an offset of 10 bytes from the beginning of the structure. But why does the answer store 10 to dest instead of storing 10(%rdi) to dest? If the answer is correct, why is the type char * instead of long?
To be clear, why is the assembly code for the expression &up->t1.w not like this:
leaq 10(%rdi), %rax
movq %rax, (%rsi)
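For reference, this is a minimal C sketch of the function I have in mind for this expression, with type set to `char *`. The names `get_w_addr` and `w_offset` are my own, and the 10-byte offset assumes the usual x86-64 layout (8-byte long, 2-byte short). The sketch only checks what value ends up in `*dest`; it does not show which instructions a compiler would pick.

```c
#include <stddef.h>

typedef union {
    struct {
        long u;
        short v;
        char w;
    } t1;
    struct {
        int a[2];
        char *p;
    } t2;
} u_type;

/* The get() template from the exercise with expr = &up->t1.w and
 * type = char *. The function name is hypothetical. */
void get_w_addr(u_type *up, char **dest) {
    *dest = &up->t1.w;  /* stores an address, not the byte value of w */
}

/* Hypothetical helper: byte offset of t1.w from the start of the union. */
size_t w_offset(void) {
    return offsetof(u_type, t1.w);
}
```

Under that layout assumption, `w_offset()` should return 10, and the pointer written by `get_w_addr` should equal `(char *)up + 10`.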