The two cases are the same. Think of labels / symbols in assembly as C `extern char symbol[]`: using the bare name gives you the address.
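To make the C analogy concrete, here is a minimal sketch. The array `symbol` and the helper functions are hypothetical names made up for illustration. It only shows that the bare array name behaves like a label (an address), while reaching the stored bytes takes an explicit dereference.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 4 bytes of storage; `symbol` plays the role of an asm label. */
char symbol[4] = {0x10, 0x20, 0x30, 0x40};

/* The bare name decays to the address of the first byte, as a label would. */
uintptr_t symbol_address(void)
{
    return (uintptr_t)symbol;
}

/* Getting at the contents takes an explicit dereference, i.e. a load. */
char symbol_first_byte(void)
{
    return symbol[0];
}

/* Small self-check tying the two together. */
void check_symbol(void)
{
    assert(symbol_address() == (uintptr_t)&symbol[0]);
    assert(symbol_first_byte() == 0x10);
}
```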
As Nate pointed out in comments, this is true in other contexts as well:
.word 0xdeadbeef @ constant 4-byte value
.word var @ address as 4-byte value
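A rough C counterpart of those two directives, as a sketch only: the names `var`, `table`, and `check_table` are assumptions for illustration, and the exact directives a compiler emits depend on the target. The point is that a static initializer can hold either a plain constant or the address of a symbol, and both are just data as far as the assembler is concerned.

```c
#include <assert.h>
#include <stdint.h>

int var = 42;  /* hypothetical variable; its name stands in for the label var */

/* Both initializers are constant expressions, so a compiler can emit this
   table as plain data directives. Entries are pointer-sized, so on a 64-bit
   target the directive would be wider than .word. */
const void *const table[2] = {
    (const void *)(uintptr_t)0xdeadbeef,  /* constant value */
    &var,                                 /* address of var, typically fixed up at link time */
};

/* Small self-check: first entry is the constant, second is the address. */
void check_table(void)
{
    assert((uintptr_t)table[0] == 0xdeadbeef);
    assert(table[1] == (const void *)&var);
}
```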
The `=` in `LDR R1,=0x40021010` is there to tell the assembler it's a pseudo-instruction that should materialize that value in a register, instead of an addressing mode. ARM doesn't have a `[12-bit absolute]` addressing mode AFAIK, but it does have a PC-relative addressing mode which could conceivably make sense for symbol names.
So there is ambiguity, and it's just easier to parse and for humans to read if there's a special character that indicates it's not just an `ldr` machine instruction. (It might be an `ldr` plus extra data assembled into a literal pool, or it might be `movw` / `movt`, depending on the assembler and target options.)
In some assemblers for other ISAs, there are contexts where a bare symbol name is not treated as its address, notably MASM for x86. But other x86 assemblers like NASM are more consistent: `mov edi, symbol` uses the address as an immediate, while `mov eax, [symbol]` is required to load from it. See: Why in NASM do we have to use square brackets ([ ]) to MOV to memory location?
In AT&T syntax for x86, symbols and literal numbers with no decoration are treated the same, but as memory operands. `mov 123, %eax` is a load from absolute address `123`, just as `mov foo, %eax` is a load from the address of the symbol `foo`, using a `[disp32]` addressing mode. (To mov-immediate a literal number or a symbol address, use `mov $123, %eax` or `mov $foo, %eax`.) In other contexts, like `.long foo`, symbol names are addresses, because there's nothing else you could do with them that the assembler needs to disambiguate.