There are likely countless ways to do this, but hopefully this gives you some ammo to start seeing what is going on.
boot.s
.thumb
.globl _start
_start:
reset:
mov r0,pc
ldr r1,=0xFFFF0000
and r0,r1
ldr r1,gotbase
add r0,r1
bl centry
b .
.align
gotbase:
.word _GLOBAL_OFFSET_TABLE_-(_start)
.word _start
.word _GLOBAL_OFFSET_TABLE_
.word _GLOBAL_OFFSET_TABLE_
so.c
extern unsigned int fun ( unsigned int );
unsigned int x;
unsigned int y;
unsigned int z;
void centry ( void )
{
x=5;
y=6;
z=fun(77);
}
fun.c
unsigned int fun ( unsigned int x )
{
return(x+3);
}
flash.ld
MEMORY
{
rom : ORIGIN = 0x08020000, LENGTH = 0x1000
ram : ORIGIN = 0x20000000, LENGTH = 0x1000
}
SECTIONS
{
.text : { *(.text*) } > rom
.rodata : { *(.rodata*) } > rom
.bss : { *(.bss*) } > ram
}
build
arm-none-eabi-as --warn boot.s -o boot.o
arm-none-eabi-gcc -Wall -O2 -mthumb -fpic -c so.c -o so.o
arm-none-eabi-gcc -Wall -O2 -mthumb -fpic -c fun.c -o fun.o
arm-none-eabi-ld -o so.elf -T flash.ld boot.o so.o fun.o
arm-none-eabi-objdump -D so.elf > so.list
arm-none-eabi-objcopy --srec-forceS3 so.elf -O srec so.srec
arm-none-eabi-objcopy so.elf so.bin -O binary
disassemble
Disassembly of section .text:
08020000 <_start>:
8020000: 4678 mov r0, pc
8020002: 4907 ldr r1, [pc, #28] ; (8020020 <gotbase+0x10>)
8020004: 4008 ands r0, r1
8020006: 4902 ldr r1, [pc, #8] ; (8020010 <gotbase>)
8020008: 1840 adds r0, r0, r1
802000a: f000 f80b bl 8020024 <centry>
802000e: e7fe b.n 802000e <_start+0xe>
08020010 <gotbase>:
8020010: 00000060
8020014: 08020000
8020018: 00000048
802001c: 00000044
8020020: ffff0000
08020024 <centry>:
8020024: 2205 movs r2, #5
8020026: b510 push {r4, lr}
8020028: 4c08 ldr r4, [pc, #32] ; (802004c <centry+0x28>)
802002a: 4b09 ldr r3, [pc, #36] ; (8020050 <centry+0x2c>)
802002c: 447c add r4, pc
802002e: 58e3 ldr r3, [r4, r3]
8020030: 601a str r2, [r3, #0]
8020032: 4b08 ldr r3, [pc, #32] ; (8020054 <centry+0x30>)
8020034: 58e3 ldr r3, [r4, r3]
8020036: 3201 adds r2, #1
8020038: 204d movs r0, #77 ; 0x4d
802003a: 601a str r2, [r3, #0]
802003c: f000 f80e bl 802005c <fun>
8020040: 4b05 ldr r3, [pc, #20] ; (8020058 <centry+0x34>)
8020042: 58e3 ldr r3, [r4, r3]
8020044: 6018 str r0, [r3, #0]
8020046: bc10 pop {r4}
8020048: bc01 pop {r0}
802004a: 4700 bx r0
802004c: 00000030
8020050: 00000000
8020054: 00000008
8020058: 00000004
0802005c <fun>:
802005c: 3003 adds r0, #3
802005e: 4770 bx lr
Disassembly of section .got:
08020060 <.got>:
8020060: 20000000
8020064: 20000004
8020068: 20000008
Disassembly of section .got.plt:
0802006c <_GLOBAL_OFFSET_TABLE_>:
...
Disassembly of section .bss:
20000000 <x>:
20000000: 00000000
20000004 <z>:
20000004: 00000000
20000008 <y>:
20000008: 00000000
This is intentionally minimal. The first and most interesting item with respect to position independence is this:
Disassembly of section .got:
08020060 <.got>:
8020060: 20000000
8020064: 20000004
8020068: 20000008
These are clearly the addresses of the three global data items in the program. Add more items and you will see this table grow.
If you change the addresses in the linker script
rom : ORIGIN = 0x08010000, LENGTH = 0x1000
ram : ORIGIN = 0x30000000, LENGTH = 0x1000
then, at least for this simple program with the tools I used, the machine code doesn't change (it technically could with optimizations, but the code is assumed to be position independent, so within reason you should not care about its location). The GOT does change, to reflect the 0x30000000 address.
Being all thumb (probably doesn't matter), all built position independent, and with flashes being relatively small compared to the range of the branch and branch-link instructions, the linker shouldn't need any magic to make the branches relative, so I would hope there is no program counter math for calls. If you really tried I bet you could make it happen, and you will find out if you push it. If you do push it, you may end up with .text-based or other offsets here:
Disassembly of section .got.plt:
0802006c <_GLOBAL_OFFSET_TABLE_>:
...
So if the alternate locations for your programs also include alternate locations for the data, then you need to patch up the global offset table.
My bootstrap is more than minimal; I was messing around with one brute force way to get at the address of the GOT. There are no doubt linker script and/or gee-whiz code ways to get at this information. Likewise you can likely use the linker script to force/place the GOT.
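The brute force arithmetic in boot.s can be modeled in C to make the assumption behind it visible. This is only a sketch: `got_runtime_address` is a hypothetical name, and the 0xFFFF0000 mask only works if the image is loaded on a 64 KiB boundary and the `mov r0,pc` executes within the first 64 KiB of it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of the original program. It models:
 *   mov r0,pc ; and r0,0xFFFF0000 ; add r0,(GOT offset from _start)
 * Assumption: the image starts on a 64 KiB boundary, so masking the
 * PC recovers the address the image was actually loaded at. */
uint32_t got_runtime_address(uint32_t pc_value, uint32_t got_offset)
{
    uint32_t image_base = pc_value & 0xFFFF0000u; /* strip the low 16 bits */

    /* Guard against wrapping past the end of the 32-bit address space. */
    assert(got_offset <= UINT32_MAX - image_base);
    return image_base + got_offset;
}
```

With the listing above, the PC read at 0x08020000 in thumb mode is 0x08020004 and the stored offset is 0x60, so `got_runtime_address(0x08020004, 0x60)` gives 0x08020060, which is where .got lands. The same call with a PC of 0x08010004 gives 0x08010060.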
8020024: 2205 movs r2, #5
8020028: 4c08 ldr r4, [pc, #32] ; (802004c <centry+0x28>)
802002c: 447c add r4, pc
8020032: 4b08 ldr r3, [pc, #32] ; (8020054 <centry+0x30>)
8020034: 58e3 ldr r3, [r4, r3]
8020036: 3201 adds r2, #1
802003a: 601a str r2, [r3, #0]
These instructions are the position independent version of y = 6. They compute the address of the GOT, load an offset into it, fetch the entry at that offset, and use that entry as the address of the memory location. Remove the pic command line options and see how this changes.
So the same machine code depends on the GOT for the actual address of the item.
If the table entry is 0x20000008 then that is where y lives; if the table entry is 0x30000008 then that is where y lives.
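The same idea can be written out in plain C as a rough model of what the compiler generated. The names here (`got_store`, the slot index) are hypothetical and only for illustration; the real code does this with pc-relative loads rather than a function call.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the -fpic access pattern: the code only knows
 * which GOT slot to use; the slot holds the variable's real address. */
void got_store(uint32_t *const got[], unsigned slot, uint32_t value)
{
    assert(got != 0 && got[slot] != 0);
    *got[slot] = value; /* ldr r3,[r4,r3] then str r2,[r3,#0] */
}
```

In the listing, y uses byte offset 8 into the table, which is slot 2 here. Calling `got_store(got, 2, 6)` writes 6 to whatever address slot 2 holds, so changing the slot contents moves y without touching the code.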
As coded and built above
08020060 <.got>:
8020060: 20000000
8020064: 20000004
8020068: 20000008
the table lives in flash, so the code that places this program in flash will need to patch the table as it writes to the flash. If you play linker games to put the table in RAM, it still has to pass through flash on the way there, like .data: bytes that belong in RAM but are stored in flash and then copied by the bootstrap. That takes a combination of bootstrap and linker script code.
In either case I don't see how any stock bootstrap could know where you want the data located. .text code is assumed to be relocatable with some alignment assumptions, but .data and the other sections are assumed not to be in the same memory space and do not move linearly with .text. You can't just take the linked .text address and the discovered .text address and adjust the data offsets by that amount, as the two are assumed to be separate, and in this case (Cortex-M microcontroller) they are.
I have never had a need to mess with this other than for answers here, so I would start with the ammo above, be it simply knowing how to disassemble and read the code and/or some of the code above, and examine what the tools are building for you and where they are putting things. I would assume that you are building the program itself (.text if you will) as one big blob, so the tools should build it all with relative addressing for accesses within that section. If you didn't write the bootstrap then it is not a toolchain thing, it is a C library or other (RTOS, HAL, etc) thing, and I would not expect it to have position independent global offset table patch-up code; how would the bootstrap know where you want the .data/.bss to move? Ponder that for a bit. Even worse, in this case the GOT is in flash, so it has to be patched BEFORE execution, not during, which means some other program has to do it. That is probably why the ELF file format records the location and size of the GOT, so that the loader, operating system, or other tool that loads the program before running it can find the table and patch it up.
If you want your programs to be loadable into one of two different flash spaces then you need to solve this in whatever is doing the loading. Again, a very crude brute force way to start:
.thumb
.globl _start
_start:
reset:
b skip
.align
.word _GLOBAL_OFFSET_TABLE_-(_start)
skip:
08020000 <_start>:
8020000: e002 b.n 8020008 <skip>
8020002: 46c0 nop ; (mov r8, r8)
8020004: 00000068 andeq r0, r0, r8, rrx
08020008 <skip>:
Your loader can find the offset in the place where you left it (offset 0x4 into the binary), but of course you also need to figure out (linker magic) the size of the global offset table and place it in a known location. That, or support full-blown ELF or other format files and parse through those.
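A minimal sketch of what such a loader-side patch could look like, run on the image in a buffer before it is written to flash. Everything named here (`patch_got`, `read_le32`, `write_le32`, and the parameters) is hypothetical. It assumes a little-endian image with the GOT offset stored at byte 4, that the caller already knows the number of GOT entries, and that any entry falling inside the linked RAM window should be rebased while everything else is left alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read/write a 32-bit little-endian word without alignment assumptions. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void write_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Rebase GOT entries that point into [linked_ram, linked_ram + ram_len)
 * so they point into the window starting at new_ram instead.
 * Returns the number of entries that were patched. */
size_t patch_got(uint8_t *image, size_t image_len, size_t got_entries,
                 uint32_t linked_ram, uint32_t ram_len, uint32_t new_ram)
{
    assert(image != NULL && image_len >= 8);

    /* Assumption: the bootstrap left the GOT offset at byte 4. */
    uint32_t got_off = read_le32(image + 4);
    assert(got_off <= image_len);
    assert(got_entries <= (image_len - got_off) / 4);

    size_t patched = 0;
    for (size_t i = 0; i < got_entries; i++) {
        uint8_t *slot = image + got_off + 4 * i;
        uint32_t addr = read_le32(slot);

        /* Unsigned subtraction wraps, so this also rejects addr < linked_ram. */
        if (addr - linked_ram < ram_len) {
            write_le32(slot, addr - linked_ram + new_ram);
            patched++;
        }
    }
    return patched;
}
```

For the image above this would be called as `patch_got(image, len, 3, 0x20000000, 0x1000, 0x30000000)` before programming the flash. The RAM-window test is a heuristic: a table entry that happens to hold a constant inside that range would be rebased too, which is one reason a real loader might prefer to work from ELF relocation information.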
EDIT:
08020068 <.got>:
8020068: 20000000
802006c: 20000004
8020070: 20000008
08020068 <.got>:
8020068: 30000000
802006c: 30000004
8020070: 30000008
The GOT doesn't/cannot move; it has to be pc-relative so that the code as compiled in .text can find it. What it points to, which is the whole idea, has to be modified if/when you move those items. The first table above is how it is linked, with the three variables at 0x20000000, 0x20000004, 0x20000008. If you wish to run with the data at 0x30000000 (relocate it, basically), then you need to modify the GOT itself to point at the new addresses, as in the second table above. Because the GOT sits in flash alongside .text so that it is pc-relative to the code (as shown above with the command line options I used), and this is an MCU with .text in flash, the GOT has to be modified before it is placed in flash, which is not at runtime for this code. So your bootloader, or whatever places this program in flash, would need to do that patch, IF you want .data/.bss to be somewhere other than where it was linked.
Using your compiler flags
Disassembly of section .text:
08020000 <_start>:
8020000: e002 b.n 8020008 <skip>
8020002: 46c0 nop ; (mov r8, r8)
8020004: 00000064 andeq r0, r0, r4, rrx
08020008 <skip>:
8020008: 4678 mov r0, pc
802000a: 4907 ldr r1, [pc, #28] ; (8020028 <gotbase+0x10>)
802000c: 4008 ands r0, r1
802000e: 4902 ldr r1, [pc, #8] ; (8020018 <gotbase>)
8020010: 1840 adds r0, r0, r1
8020012: f000 f80b bl 802002c <centry>
8020016: e7fe b.n 8020016 <skip+0xe>
08020018 <gotbase>:
8020018: 00000064 andeq r0, r0, r4, rrx
802001c: 08020000 stmdaeq r2, {} ; <UNPREDICTABLE>
8020020: 00000044 andeq r0, r0, r4, asr #32
8020024: 00000040 andeq r0, r0, r0, asr #32
8020028: ffff0000 ; <UNDEFINED> instruction: 0xffff0000
0802002c <centry>:
802002c: b510 push {r4, lr}
802002e: 4654 mov r4, r10
8020030: 2205 movs r2, #5
8020032: 4b08 ldr r3, [pc, #32] ; (8020054 <centry+0x28>)
8020034: 58e3 ldr r3, [r4, r3]
8020036: 601a str r2, [r3, #0]
8020038: 4b07 ldr r3, [pc, #28] ; (8020058 <centry+0x2c>)
802003a: 58e3 ldr r3, [r4, r3]
802003c: 3201 adds r2, #1
802003e: 204d movs r0, #77 ; 0x4d
8020040: 601a str r2, [r3, #0]
8020042: f000 f80d bl 8020060 <fun>
8020046: 4b05 ldr r3, [pc, #20] ; (802005c <centry+0x30>)
8020048: 58e3 ldr r3, [r4, r3]
802004a: 6018 str r0, [r3, #0]
802004c: bc10 pop {r4}
802004e: bc01 pop {r0}
8020050: 4700 bx r0
8020052: 46c0 nop ; (mov r8, r8)
8020054: 00000000 andeq r0, r0, r0
8020058: 00000008 andeq r0, r0, r8
802005c: 00000004 andeq r0, r0, r4
08020060 <fun>:
8020060: 3003 adds r0, #3
8020062: 4770 bx lr
Disassembly of section .got:
08020064 <.got>:
8020064: 20000000 andcs r0, r0, r0
8020068: 20000004 andcs r0, r0, r4
802006c: 20000008 andcs r0, r0, r8
which changes it to use this
802002e: 4654 mov r4, r10
But notice that the tools do not set r10; you have to add code to point r10 at the GOT for this to work at all, even at the linked address.
So again, the GOT itself does not move, that's the whole point; its contents change to point at the relocated location of .data/.bss. See the 0x30000000 table above: the GOT is in the same place but the addresses for x, y, z have changed to reflect their new location. The disassembly shows addresses based on the linked address, but if you link for a different address and compare just the machine code you will see it doesn't change; the instructions used are pc-relative.