.word as used here
ldr r0,hello
nop
nop
nop
hello: .word 0x12345678
is similar to
unsigned int hello = 0x12345678;
hello: is a label; .word has nothing to do with it. The label just means I want a name for the address at this point; you can put code there or data or whatever. Like unsigned int in C, .word allocates some space in the program.
.equ, though, is like a #define: you are not allocating space, you are simply defining a replacement for that name.
arm-none-eabi-as so.s -o so.o
arm-none-eabi-objdump -D so.o
...
00000000 <hello-0x10>:
0: e59f0008 ldr r0, [pc, #8] ; 10 <hello>
4: e1a00000 nop ; (mov r0, r0)
8: e1a00000 nop ; (mov r0, r0)
c: e1a00000 nop ; (mov r0, r0)
00000010 <hello>:
10: 12345678 eorsne r5, r4, #120, 12 ; 0x7800000
If I add this
.equ JELLO, 0x22
ldr r0,hello
nop
nop
nop
hello: .word 0x12345678
No change. If I do this
.equ JELLO, 0x12345678
ldr r0,JELLO
.equ is a define-like substitution, so that is the same as writing
ldr r0,0x12345678
so.s: Assembler messages:
so.s:2: Error: internal_relocation (type: OFFSET_IMM) not fixed up
Now, if we want the VALUE in r0, then it is a syntax problem.
mov r0,#0x12345678
Which we can't do in ARM: the instructions are either 16 or 32 bits and that's it, so you can't fit a 32-bit immediate AND the opcode AND the registers, etc. in 32 bits. Something has to give. Depending on the instruction set and variant, the immediates run from a few bits up to maybe 11 or 13 bits; a full-sized ARM mov instruction gets 8 significant bits, rotated right by an even amount.
so.s: Assembler messages:
so.s:3: Error: invalid constant (12345678) after fixup
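To make that limit concrete, here is a rough Python sketch of the check. It is my own illustration, not anything the assembler exposes, and it assumes the classic 32-bit ARM data-processing immediate: an 8-bit value rotated right by an even amount. The helper names rol32 and fits_arm_immediate are made up.

```python
def rol32(value, amount):
    """Rotate a 32-bit value left by amount bits."""
    amount %= 32
    value &= 0xFFFFFFFF
    return ((value << amount) | (value >> (32 - amount))) & 0xFFFFFFFF


def fits_arm_immediate(value):
    """True if value is an 8-bit constant rotated right by an even amount."""
    value &= 0xFFFFFFFF
    # Rotating left undoes a rotate right; try every even rotation and
    # see whether what is left fits in 8 bits.
    return any(rol32(value, rot) <= 0xFF for rot in range(0, 32, 2))


if __name__ == "__main__":
    for constant in (0x12, 0xFF000000, 0x12345678):
        print(hex(constant), fits_arm_immediate(constant))
```

Under that assumption 0x12 fits and 0x12345678 does not, which matches the mov that assembles and the mov that produces the error above.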
Interestingly, the trick is to go back to the load and tell the assembler: I would like the address of the label rather than the thing at the label.
ldr r0,=hello
nop
nop
nop
hello: .word 0x12345678
.word 0,1,2,3
The assembler has allocated a word for us where it could find space:
00000000 <hello-0x10>:
0: e59f001c ldr r0, [pc, #28] ; 24 <hello+0x14>
4: e1a00000 nop ; (mov r0, r0)
8: e1a00000 nop ; (mov r0, r0)
c: e1a00000 nop ; (mov r0, r0)
00000010 <hello>:
10: 12345678 eorsne r5, r4, #120, 12 ; 0x7800000
14: 00000000 andeq r0, r0, r0
18: 00000001 andeq r0, r0, r1
1c: 00000002 andeq r0, r0, r2
20: 00000003 andeq r0, r0, r3
24: 00000010 andeq r0, r0, r0, lsl r0
Notice it is not loading the thing at address 0x10 into the register. It is loading the thing at address 0x24, the location it added for us, and in that location it has placed the ADDRESS of hello, which is 0x10. So by adding that equals sign r0 gets 0x10 instead of 0x12345678. Kind of like removing the asterisk on a one-dimensional pointer in C: you get the address the pointer holds, not the thing it points to.
So, knowing that ldr rx,=something means I want the ADDRESS of that label, what if we were to just put an address there instead of a label that represents an address?
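If you want to check the objdump comments by hand, the arithmetic is small. The Python sketch below is only an illustration (literal_address is a made-up name). It assumes ARM mode, where the pc reads as the instruction's address plus 8, and an ldr with a 12-bit immediate offset whose U bit (bit 23) selects add or subtract.

```python
def literal_address(insn_addr, insn_word):
    """Address read by an ARM-mode 'ldr rd, [pc, #imm]' instruction."""
    offset = insn_word & 0xFFF          # low 12 bits are the offset
    if not (insn_word >> 23) & 1:       # U bit clear means subtract
        offset = -offset
    # In ARM mode the pc reads as the instruction's address plus 8.
    return insn_addr + 8 + offset


if __name__ == "__main__":
    # Instruction words copied from the disassembly listings.
    print(hex(literal_address(0x0, 0xE59F0008)))
    print(hex(literal_address(0x0, 0xE59F001C)))
```

So e59f0008 at address 0 reads from 0 + 8 + 8 = 0x10, and e59f001c reads from 0 + 8 + 28 = 0x24, the same addresses objdump prints after the semicolon.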
ldr r0,=0x87654321
nop
nop
nop
hello: .word 0x12345678
.word 0,1,2,3
It just happens to work:
00000000 <hello-0x10>:
0: e59f001c ldr r0, [pc, #28] ; 24 <hello+0x14>
4: e1a00000 nop ; (mov r0, r0)
8: e1a00000 nop ; (mov r0, r0)
c: e1a00000 nop ; (mov r0, r0)
00000010 <hello>:
10: 12345678 eorsne r5, r4, #120, 12 ; 0x7800000
14: 00000000 andeq r0, r0, r0
18: 00000001 andeq r0, r0, r1
1c: 00000002 andeq r0, r0, r2
20: 00000003 andeq r0, r0, r3
24: 87654321 strbhi r4, [r5, -r1, lsr #6]!
This does not mean you can blindly replace .word with .equ in any GNU assembler based instruction set you want, but due to dumb luck you can almost get there on ARM by fixing the syntax:
.equ JELLO,0x12345678
ldr r0,=JELLO
nop
nop
nop
and there you go
00000000 <.text>:
0: e59f0008 ldr r0, [pc, #8] ; 10 <JELLO-0x12345668>
4: e1a00000 nop ; (mov r0, r0)
8: e1a00000 nop ; (mov r0, r0)
c: e1a00000 nop ; (mov r0, r0)
10: 12345678 eorsne r5, r4, #120, 12 ; 0x7800000
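One more wrinkle worth knowing: the GNU assembler documents that on ARM, ldr rd,= with a numeric constant is emitted as a mov or mvn when one of those can produce the value, and only otherwise goes to the literal pool as it did here. The Python sketch below models that choice for ARM mode only. It is an approximation for illustration, with made-up names (_fits, expand_ldr_equals), and it ignores Thumb and newer encodings.

```python
def _fits(value):
    """True if value is an 8-bit constant rotated right by an even amount."""
    value &= 0xFFFFFFFF
    for rot in range(0, 32, 2):
        # Rotate left by rot to undo a rotate right by rot.
        undone = ((value << rot) | (value >> (32 - rot))) & 0xFFFFFFFF
        if undone <= 0xFF:
            return True
    return False


def expand_ldr_equals(value):
    """Guess how 'ldr rd,=value' is emitted in ARM mode."""
    value &= 0xFFFFFFFF
    if _fits(value):
        return "mov"            # value itself is encodable
    if _fits(~value):
        return "mvn"            # bitwise inverse is encodable
    return "literal pool"       # needs a pc-relative load of a stored word


if __name__ == "__main__":
    for constant in (0x12, 0xFFFFFFFF, 0x12345678):
        print(hex(constant), expand_ldr_equals(constant))
```

By this model 0x12345678 needs the pool word seen at address 0x10, while a small value like 0x12 would not get a pool entry at all.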
In general, for other instruction sets, going from .word to .equ means changing the load to a mov (move immediate):
.equ JELLO,0x12
ldr r0,hello
mov r0,#JELLO
nop
nop
nop
hello:
.word 0x12
00000000 <hello-0x14>:
0: e59f000c ldr r0, [pc, #12] ; 14 <hello>
4: e3a00012 mov r0, #18
8: e1a00000 nop ; (mov r0, r0)
c: e1a00000 nop ; (mov r0, r0)
10: e1a00000 nop ; (mov r0, r0)
00000014 <hello>:
14: 00000012 andeq r0, r0, r2, lsl r0
Use whatever syntax applies (mov, move, etc.). In x86, mov is used for both load and move immediate, so you would just need to know what syntax modification, if any, is needed to switch from a label to an immediate (perhaps removing some form of word ptr hello syntax).
Note I was using the legacy GNU assembler syntax above. Using that,
.equ JELLO,0x12
mov r0,JELLO
gives
so.s: Assembler messages:
so.s:4: Error: immediate expression requires a # prefix -- `mov r0,JELLO'
But if we use the unified syntax, I guess it doesn't care about the pound sign anymore.
.syntax unified
.equ JELLO,0x12
mov r0,JELLO
mov r0,#JELLO
00000000 <.text>:
0: e3a00012 mov r0, #18
4: e3a00012 mov r0, #18
Not a fan. It is supposed to make things easier (well, lazier) but just makes them worse, which is why I rarely use it unless I have to. YMMV.