You observed the behaviour well, and you are (mostly) correct about it.
movl $array1, %eax vs movl array1, %eax: yes, the first one loads eax with the memory address of array1, and the second one loads eax with the 32-bit value stored in memory at that address.
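For comparison, here is a rough C analogue of the two instructions. It is only a sketch: the array contents and the function names load_address and load_value are made up for illustration, and uintptr_t stands in for a register wide enough to hold an address (32 bits on 32-bit x86).

```c
#include <stdint.h>

/* Hypothetical data, standing in for the array1 label in the assembly. */
static int32_t array1[] = {10, 20, 30};

/* Roughly "movl $array1, %eax": the result is the address of the array. */
uintptr_t load_address(void) {
    return (uintptr_t)array1;
}

/* Roughly "movl array1, %eax": the result is the 32-bit value stored there. */
int32_t load_value(void) {
    return array1[0];
}
```

The first function never touches the array contents, while the second one reads memory at that address.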
"I'm having some trouble understanding how actually registers store stuff."
The general-purpose registers like eax are 32-bit registers (on a modern x86 CPU with 64-bit support, eax is the low 32 bits of rax, which is a 64-bit register). That means the register contains 32 bits, each either 0 or 1. Nothing else. A debugger, unless you switch it to a different interpretation, will usually display the value as a 32-bit unsigned hexadecimal integer, because from output like hexadecimal 1234ABCD you can read the particular bit pattern in your head (each hexadecimal digit is exactly 4 bits, e.g. B = 11 = 8+2+1 = 1011 binary). But that doesn't mean the register contains a hexadecimal value: the register is only 32 bits, and you can interpret them any way you (or the code) wish.
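The "same bits, different interpretation" point can be sketched in C. The helper names as_signed and to_binary are invented for this illustration, and the signed result assumes two's complement, which is what x86 uses.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the same 32 bits as a signed integer (two's complement on x86). */
int32_t as_signed(uint32_t bits) {
    int32_t value;
    memcpy(&value, &bits, sizeof value);  /* copy the bit pattern unchanged */
    return value;
}

/* Write the 32 bits as '0'/'1' characters, most significant bit first. */
void to_binary(uint32_t bits, char out[33]) {
    for (int i = 0; i < 32; ++i) {
        out[i] = ((bits >> (31 - i)) & 1u) ? '1' : '0';
    }
    out[32] = '\0';
}
```

With these, the pattern FFFFFFFF reads as 4294967295 when treated as unsigned and as -1 when treated as signed, and 1234ABCD expands digit by digit into 0001 0010 0011 0100 1010 1011 1100 1101.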
To access array elements with index i you can pick from several techniques. For your task of summing arrays I would probably stay with your original approach of keeping memory addresses that point directly at the elements, but then you need one more register to load the actual value, i.e.:
# loop initialization
movl $array1, %eax # eax = array1 pointer
movl $array2, %ebx # ebx = array2 pointer
# TODO: set up also some counter or end address
loop_body:
# array1[i] += array2[i];
movl (%ebx), %edx # load value array2[i] from memory into edx
addl %edx, (%eax) # add edx to the array1[i] (value in memory at address eax)
# advance array1 and array2 pointers (like ++i;)
addl $4, %eax
addl $4, %ebx
# TODO: do some loop termination condition and loop
This keeps the loop body simple, and the same summing code can be reused for different arrays.
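As a rough C picture of the same pointer-walking loop, with one possible way of filling in the termination TODO: the sketch below compares against an end address. The function name sum_arrays and the count parameter are assumptions made for this example; the assembly above leaves the choice between a counter and an end address open.

```c
#include <stddef.h>
#include <stdint.h>

/* dst plays the role of eax, src the role of ebx.
   "count" is an assumed element count, not something given in the question. */
void sum_arrays(int32_t *dst, const int32_t *src, size_t count) {
    int32_t *end = dst + count;   /* end address, computed once before the loop */
    while (dst != end) {
        *dst += *src;             /* array1[i] += array2[i] */
        ++dst;                    /* like addl $4, %eax: one element is 4 bytes */
        ++src;                    /* like addl $4, %ebx */
    }
}
```

Because the arrays arrive as pointers, the same function works for any pair of arrays, which is the reuse benefit mentioned above.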
Other options
You can avoid dedicating registers to the memory addresses by encoding the addresses directly into the memory-accessing instructions, like:
# loop initialization
xorl %ecx, %ecx # ecx = 0 (index + counter)
loop_body:
# array1[i] += array2[i];
movl array2(,%ecx,4), %eax # load value array2[i] from memory into eax
addl %eax, array1(,%ecx,4) # add eax to the array1[i]
incl %ecx # ++i
# TODO: do some loop termination condition and loop
But this code can't be redirected to different arrays.
Or you can keep the array addresses in registers but avoid modifying them, by using indexed addressing:
# loop initialization
movl $array1, %eax # eax = array1 pointer
movl $array2, %ebx # ebx = array2 pointer
xorl %ecx, %ecx # ecx = 0 (index + counter)
loop_body:
# array1[i] += array2[i];
movl (%ebx,%ecx,4), %edx # load value array2[i] from memory into edx
addl %edx, (%eax,%ecx,4) # add edx to the array1[i]
incl %ecx # ++i
# TODO: do some loop termination condition and loop
This may make sense if you planned to use the index value anyway, so you need plain i, or if you plan to use the array addresses later too, so not modifying them is handy, etc.
There are other ways to access values in memory, but the ones above are the most straightforward for somebody learning x86 assembly.
Keep in mind that in assembly there are no variables, arrays, etc. The computer memory is like one huge unnamed array with indices from 0 to N-1 (N = size of physical memory), and each index holds a single byte (8 bits of information).
Registers are 8/16/32/64 bits of information available directly on the CPU chip, so the CPU doesn't need an address to find them (the name "eax" acts like the address) and doesn't need to contact the memory chip for the value (so registers are faster than memory).
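To make the "memory is an array of bytes" picture concrete, here is a small C sketch that copies out the four bytes occupied by one 32-bit value. The function name bytes_of is invented for this example. It also shows why the pointer-walking loop advances by 4: each 32-bit element takes four consecutive byte addresses.

```c
#include <stdint.h>
#include <string.h>

/* Copy the four consecutive memory bytes that hold a 32-bit value. */
void bytes_of(uint32_t value, uint8_t out[4]) {
    memcpy(out, &value, sizeof value);
}
```

On x86, which is little-endian, the value 1234ABCD is stored with the byte CD at the lowest address and 12 at the highest.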
To access memory in AT&T syntax you write an operand of the form displacement(base_reg, index_reg, scale); for details see this question: A couple of questions about [base + index*scale + disp]
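The address the CPU computes for such an operand is displacement + base + index*scale, where the scale can only be 1, 2, 4 or 8 and an omitted part counts as zero. A minimal C sketch of that arithmetic follows. The function name effective_address is hypothetical, and the addresses in the usage note are made-up numbers.

```c
#include <stdint.h>

/* Address computed for displacement(base_reg, index_reg, scale).
   Pass 0 for an omitted displacement, base or index. */
uint32_t effective_address(uint32_t displacement, uint32_t base,
                           uint32_t index, uint32_t scale) {
    return displacement + base + index * scale;
}
```

For example, if array2 happened to sit at address 0x1000, then array2(,%ecx,4) with ecx = 3 would access 0x1000 + 0 + 3*4 = 0x100C, and (%ebx,%ecx,4) with ebx = 0x2000 and ecx = 2 would access 0x2008.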