BL instruction ARM - How does it work

Question

I am learning ARM Assembly, and I am stuck on something right now.

I know about the Link Register, which if I'm not wrong holds the address to return to when a function call completes.

So if we have something like this (taken from the ARM documentation):

0 | here
1 |   B there
2 |   
3 |   CMP R1, #0
4 |   BEQ anotherfunc
5 |
6 |   BL sub+rom ;  Call subroutine at computed address.

So, if we think of the column on the left as the address of each instruction, then after the B there at address 1, the Link Register holds the value 1, right?

Then the program goes to the method there and then it uses the value of the Link Register to know where to return.

If we skip to address 6 now, where I am stuck, we know that BL copies the address of the next instruction into lr (r14, the link register).

So now it would copy the address of sub which is a subroutine (what is a subroutine??) + rom (which is a number?) or the address of sub+rom (I don't know what this could be).

But in general, when would we need BL? Why do we want it in the example above? Can someone give me an example where we would really need it?

Thanks!

_"when would we need BL?"_ If you write a function that can be called from more than one place it will need to know to where it's supposed to return, which is what the link register is used for. And to set the link register to the correct address to return to you use the `BL` instruction. — Michael, Dec 04 '15 at 15:40

ElderBug · Accepted Answer · 2015-12-04T17:52:47.180

42

There seems to be a bit of confusion. Here is an explanation:

The B instruction will branch. It jumps to another instruction, and there is no return expected. The Link Register (LR) is not touched.

The BL instruction will branch, but also link. LR is loaded with the address of the instruction that follows BL in memory (the return address), not the address of the instruction executed after BL (the branch target). It is then possible to return from the branch using LR.

Example:

start:
01:  MOV r0, r2     ; some instruction
02:  B there        ; go there and never return !

there:
11:  MOV r1, r0         ; some instruction
12:  BL some_function   ; go to some_function, but hope to return !
                        ; this BL will load 13 into LR
13:  MOV r5, r0
14:  BL some_function   ; this BL will load 15 into LR
15:  MOV r6, r0


some_function:
     MOV r0, #3
     BX LR              ; here, we go back to where we were before
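
To make the flow concrete, here is a minimal sketch in Python that traces the example above. It is a toy model, not real ARM: each instruction occupies one address unit to match the listing's numbering, the `halt` at address 16 and the addresses 20 and 21 for some_function are assumptions added so the program stops cleanly, and r2 is given an arbitrary starting value of 7. The return is modelled as `bx lr`, since on real ARM a return through the link register is written `BX LR` (or `MOV PC, LR`).

```python
def run(program, labels, start):
    """Run a toy program; return (registers, values written to LR by each BL)."""
    regs = {"r0": 0, "r1": 0, "r2": 7, "r5": 0, "r6": 0, "lr": None}
    lr_log = []
    pc = start
    while True:
        op, *args = program[pc]
        if op == "halt":
            return regs, lr_log
        if op == "mov":
            dst, src = args
            # src is either a register name or an immediate integer
            regs[dst] = regs[src] if isinstance(src, str) else src
            pc += 1
        elif op == "b":
            # plain branch: LR is not touched
            pc = labels[args[0]]
        elif op == "bl":
            # branch with link: LR gets the address right after the BL
            regs["lr"] = pc + 1
            lr_log.append(regs["lr"])
            pc = labels[args[0]]
        elif op == "bx":
            # return: jump to the address held in the named register
            pc = regs[args[0]]
        else:
            raise ValueError(f"unknown op {op!r}")


labels = {"start": 1, "there": 11, "some_function": 20}
program = {
    1: ("mov", "r0", "r2"),
    2: ("b", "there"),
    11: ("mov", "r1", "r0"),
    12: ("bl", "some_function"),
    13: ("mov", "r5", "r0"),
    14: ("bl", "some_function"),
    15: ("mov", "r6", "r0"),
    16: ("halt",),          # assumption: stop instead of falling through
    20: ("mov", "r0", 3),   # some_function
    21: ("bx", "lr"),
}

regs, lr_log = run(program, labels, labels["start"])
print(lr_log)  # the two return addresses recorded by the two BLs
```

The same function is reached from two call sites, and each BL leaves a different return address in LR, which is why the function can get back to the right place both times.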

If a function calls another function, the inner BL overwrites LR, so the outer function can no longer return to its caller. The common solution is to save LR on the stack with PUSH {LR} on entry, and restore it before returning with POP {LR}. You can even restore and return in a single POP {PC}: this pops the saved LR value directly into the program counter, effectively returning from the function.
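
The effect of saving LR can be sketched with a small Python model. This is only an illustration under assumed values: the addresses 100 and 201 and the function name `call_chain` are hypothetical, and the "stack" is a plain Python list standing in for PUSH and POP.

```python
def call_chain(save_lr):
    """Model 'outer' calling 'inner'; return the address outer finally returns to."""
    stack = []
    lr = 100                 # the caller's BL outer left return address 100 in LR

    # --- entering outer ---
    if save_lr:
        stack.append(lr)     # PUSH {LR}

    lr = 201                 # BL inner (assumed at address 200) overwrites LR

    # --- inner returns with BX LR ---
    pc = lr                  # execution resumes inside outer at 201

    # --- outer returns ---
    if save_lr:
        pc = stack.pop()     # POP {PC}: saved return address goes straight to PC
    else:
        pc = lr              # BX LR, but LR still holds inner's return address
    return pc


print(call_chain(save_lr=True))   # back in the original caller
print(call_chain(save_lr=False))  # stuck inside outer
```

Without the save, the final return jumps to the address just after the inner BL, so the outer function never gets back to its own caller.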

edited Dec 04 '15 at 17:52

answered Dec 04 '15 at 16:32

ElderBug


The called subroutine may store the link register on the stack, in which case, it can return by moving from the stack to PC (R15). Unlike subroutines, some exceptions result in LR pointing to the instruction that caused the exception, as opposed to what would be a return address. – rcgldr Dec 04 '15 at 16:39
@rcgldr About the exceptions, they work just like interrupts, and they actually store the return address to LR, and thus you can return from exceptions. The instruction that caused the exception is the instruction just before. – ElderBug Dec 04 '15 at 16:41
@rcgldr Or maybe I'm wrong ? I checked that recently but now I doubt, do you have a source ? – ElderBug Dec 04 '15 at 16:44
@rcgldr Okay I found it, and you are right : _"The actual location pointed to by the program counter when an exception is taken depends on the exception type. The return address may not necessarily be the next instruction pointed to by the program counter."_ But it is still a return address that can be used to return. – ElderBug Dec 04 '15 at 16:49
Could you add something to your answer about how non-leaf functions work? In x86, `call/ret` push/pop return addresses on the stack. I guess ISAs that use a link register, like ARM, make non-leaf functions manually save it? Actually, I just checked for myself on godbolt: http://goo.gl/jgnt0T. `push {r3, lr} ... pop {r3, pc}`. So I guess `r3` is the stack pointer. Nifty that you can just pop into PC. I still have no idea what most of the PowerPC asm for that trivial non-tailcall case does, since PPC mnemonics seem much harder to figure out from context / without reading docs. – Peter Cordes Dec 04 '15 at 17:33
@PeterCordes Popping into PC is weird (but convenient), but I learned that you can also ADD, SUB or even AND (and plenty of others) into PC, like `AND pc, pc, r2` (edit: actually deprecated). About your check on godbolt ... Actually, r3 isn't even the stack pointer (that is sp, alias r13). I have no idea why GCC pushes r3. By convention r3 is volatile. If you use `foo(int i)` and `return i;`, it uses r4 (not volatile), which makes sense. – ElderBug Dec 04 '15 at 18:04
Thanks for the comments on gcc's output. I had to look up ARM's `pop` instruction. I didn't realize that was a *list* of registers to push/pop. I thought you got to choose which register to use as the stack pointer, so you could use it for your own stack data structures or something. (I don't write arm asm, I just sometimes try to read it, esp. for c++ std:atomic code to see what happens on a weakly ordered ISA). x86 has single-byte instructions for `push reg`, so I hadn't even thought of the possibility of a variable-sized push or pop. – Peter Cordes Dec 04 '15 at 18:21
@ElderBug: I think I know why gcc pushes a scratch reg along with `lr`, for my test code: It's keeping the stack aligned to an 8B boundary. Also interesting: http://stackoverflow.com/questions/27693979/arm-assembly-retne-instruction – Peter Cordes Dec 04 '15 at 19:03
@ElderBug - for the exceptions that point LR to the instruction that failed, the idea is to fix the failing condition and re-issue the instruction. The alternative is to emulate the failing instruction, in which case, the failing instruction is skipped and the return is made to next instruction, LR+2 in thumb mode, LR+4 in arm mode. – rcgldr Dec 04 '15 at 19:11
@PeterCordes Indeed it seems to be the case, that makes sense now. – ElderBug Dec 04 '15 at 19:14
What about the instruction BL exit, where the program doesn't have a subroutine exit? The same with other subroutines. I just saw some programs like that. – pavlos163 Dec 04 '15 at 20:13
@pk163 `exit` is a standard C function from stdlib. Maybe that was referencing this ? I can't really say without more context. – ElderBug Dec 04 '15 at 20:25
