string+8 isn't a based-index addressing mode. It assembles to a disp32 absolute address with no base register. The +8 is resolved at assemble/link time. (See Referencing the contents of a memory location. (x86 addressing modes))
The instruction movzbl string+8, %eax assembles to machine code with the same addressing mode (same ModR/M byte) as movzbl string, %eax; only the disp32 displacement differs. See How does C++ linking work in practice? for some details about how assembling + linking take care of the +8 so there's no extra work at run time.
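To make the encoding concrete, here is a small Python sketch that builds the bytes of a 32-bit movzbl from an absolute address into %eax. It is only an illustration, not part of any real toolchain: the helper name encode_movzbl_abs32 is made up, and it assumes string lands at 0x80480a9 (the address that shows up in the linked example later in this answer).

```python
import struct

MOVZX_R32_RM8 = bytes([0x0F, 0xB6])  # opcode bytes for movzbl r/m8 -> r32
MODRM_EAX_DISP32 = 0x05              # mod=00, reg=000 (%eax), r/m=101: disp32, no base register


def encode_movzbl_abs32(address):
    """Encode `movzbl address, %eax` for 32-bit mode (hypothetical helper)."""
    return MOVZX_R32_RM8 + bytes([MODRM_EAX_DISP32]) + struct.pack("<I", address)


STRING = 0x80480A9  # assumed link-time address of `string`

plain = encode_movzbl_abs32(STRING)      # movzbl string, %eax
plus8 = encode_movzbl_abs32(STRING + 8)  # movzbl string+8, %eax

print(plain.hex(" "))
print(plus8.hex(" "))
```

The first three bytes (opcode + ModR/M) are identical in both encodings; only the four displacement bytes change, which is why the +8 costs nothing at run time.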
You can also do the following, because string+8 on its own isn't an addressing mode; it's a link-time constant that you can use as an immediate operand:
mov $string+8, %edx
movzbl (%edx), %eax
Using mov instead of lea makes this point clear, IMO. The only reason to use lea for putting a static address into a register is in x86-64, where it lets you use RIP-relative addressing for position-independent code (or for code outside the low 2 GiB, like on OS X), e.g. lea string+8(%rip), %rdx.
The most over-complicated way, doing useless work at run time instead of at assemble time, would be:
mov $string, %edx
add $8, %edx
movzbl (%edx), %eax
I guess using lea would be even more over-complicated, or you could inc 8 times, or write a loop to inc 8 times, but that's over-complicated in a different way.
For example, given this source:
.globl _start
_start:
mov $string, %eax
mov $string+8, %eax
movzbl string+8, %eax
.section .rodata
string:
I assembled with gcc -m32 foo.S -c and disassembled with objdump -drwC foo.o (the option -r shows relocations):
foo.o: file format elf32-i386
Disassembly of section .text:
00000000 <_start>:
0: b8 00 00 00 00 mov $0x0,%eax 1: R_386_32 .rodata
5: b8 08 00 00 00 mov $0x8,%eax 6: R_386_32 .rodata
a: 0f b6 05 08 00 00 00 movzbl 0x8,%eax d: R_386_32 .rodata
Instead of real addresses, the 0 and 0x8 placeholders are the offsets from the symbol value for that relocation (the addends, stored in the instruction bytes themselves). The relocations are against the .rodata section of the object file rather than string because I didn't use .globl string to make that symbol global.
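As a rough model of what the linker does with each R_386_32 entry, the Python sketch below patches the object-file bytes from the disassembly above. R_386_32 is computed as symbol value + addend, with the addend read from the 4-byte field being patched. The function name apply_r_386_32 is hypothetical, and the .rodata address 0x80480a9 is an assumption taken from the linked output shown next.

```python
import struct


def apply_r_386_32(code, offset, symbol_value):
    """Patch a 4-byte field in place with symbol value + stored addend (S + A)."""
    (addend,) = struct.unpack_from("<I", code, offset)
    struct.pack_into("<I", code, offset, (symbol_value + addend) & 0xFFFFFFFF)


# .text bytes from foo.o, with the addends still sitting in the operand fields
text = bytearray.fromhex(
    "b8 00 00 00 00 "        # mov $string, %eax
    "b8 08 00 00 00 "        # mov $string+8, %eax
    "0f b6 05 08 00 00 00 "  # movzbl string+8, %eax
)

RODATA = 0x80480A9  # assumed final address of .rodata (string is at its start)

# Relocation offsets reported by objdump -r: 0x1, 0x6 and 0xd
for reloc_offset in (0x1, 0x6, 0xD):
    apply_r_386_32(text, reloc_offset, RODATA)

print(text.hex(" "))
```

After patching, the bytes match the linked disassembly: the addend 8 has simply been added to the section address, once, at link time.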
If I assemble+link with gcc -static -m32 -nostdlib foo.S and disassemble, I get:
8048098: b8 a9 80 04 08 mov $0x80480a9,%eax
804809d: b8 b1 80 04 08 mov $0x80480b1,%eax
80480a2: 0f b6 05 b1 80 04 08 movzbl 0x80480b1,%eax
Notice how the absolute address to load from is right there in the last 4 bytes of the movzbl (in little-endian), the same 4-byte value that's an immediate for the b8 opcode (mov-imm32-to-eax).
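If you want to check the little-endian reading yourself, this short Python snippet (an illustration only, using the byte strings copied from the disassembly above) unpacks the trailing 4 bytes of both instructions:

```python
import struct

mov_imm = bytes.fromhex("b8 b1 80 04 08")        # mov $0x80480b1,%eax
movzbl = bytes.fromhex("0f b6 05 b1 80 04 08")   # movzbl 0x80480b1,%eax

# The last 4 bytes of each instruction hold a little-endian 32-bit value
(imm32,) = struct.unpack("<I", mov_imm[-4:])
(disp32,) = struct.unpack("<I", movzbl[-4:])

print(hex(imm32), hex(disp32))
```

Both values come out the same: one is used as an immediate, the other as the address to load from.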
Also notice how string and string+8 just result in different address bytes but the same opcode.