I have a small C source file (function.c):

int function()
{
    return 0x1234abce;
}

I am using a 64-bit machine, but I want to write a small 32-bit OS, so I want to compile the code into a 'pure' assembly/binary file.

I compile my code with:

gcc function.c -c -m32 -o function.o -ffreestanding # This gives you the object file

I link it with:

ld -o function.bin -m elf_i386 -Ttext 0x0 --oformat binary function.o

I am getting the following error:

function.o: In function `function':
function.c:(.text+0x9): undefined reference to `_GLOBAL_OFFSET_TABLE_'
Peter Cordes
Suraaj K S
  • There's a section about Global Offset Table in https://wiki.osdev.org/Dynamic_Linker, which might help you understanding the underlying what and why. – PypeBros Nov 27 '19 at 12:52

2 Answers

You need -fno-pie; the default (in most modern distros) is -fpie: generate code for a position-independent executable. This is a code-gen option separate from the -pie linker option (which gcc also passes by default), and is independent of -ffreestanding. -fpie -ffreestanding implies you want a freestanding PIE that uses a GOT, so that's what GCC targets.
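
For reference, here is a minimal sketch of the question's file with the fix applied. The build commands in the comment are the question's own with -fno-pie added, and the object file is named function.o to match the ld command. I'm assuming a GCC configured with default PIE, as on most modern distros.

```c
/* function.c: the freestanding function from the question.
 *
 * Build sketch (assumes a GCC whose default is -fpie):
 *   gcc -m32 -ffreestanding -fno-pie -c function.c -o function.o
 *   ld -m elf_i386 -Ttext 0x0 --oformat binary -o function.bin function.o
 *
 * With -fno-pie, GCC is not asked for position-independent code, so the
 * function should no longer reference _GLOBAL_OFFSET_TABLE_.
 */
int function(void)
{
    return 0x1234abce;
}
```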

-fpie only costs a bit of speed in 64-bit code (where RIP-relative addressing is possible) but is quite bad for 32-bit code: compilers get a pointer to the GOT in one of the integer registers (tying up another one of the 8) and access static data relative to that address with [reg + disp32] addressing modes like [eax + foo@GOTOFF].
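
As a rough illustration of code where the GOT pointer is actually needed, here is a sketch with a static variable. The name foo matches the symbol in the addressing-mode example above; next_foo is a made-up helper, not something from the question.

```c
/* Static data: accessing foo needs its address at run time. */
static int foo;

/* Hypothetical helper. Under -m32 -fpie the compiler is expected to
 * materialise a GOT pointer in a register and address foo relative to it
 * (something like foo@GOTOFF); under -m32 -fno-pie it can use foo's
 * absolute address directly. */
int next_foo(void)
{
    foo += 1;
    return foo;
}
```

Compiling this with gcc -S -m32 -fpie and again with -fno-pie, then comparing the two outputs, should show the difference. The exact instructions depend on the GCC version and optimization level.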


With optimization disabled, gcc -fpie -m32 generates the address of the GOT in a register even though the function doesn't access any static data. You can see this if you look at your compiler output (with gcc -S instead of -c on the machine you're compiling on).

On Godbolt we can use -m32 -fpie to give the same effect as a GCC configured with --enable-default-pie:

# gcc9.2 -O0 -m32 -fpie
function():
        push    ebp
        mov     ebp, esp                        # frame pointer
        call    __x86.get_pc_thunk.ax
        add     eax, OFFSET FLAT:_GLOBAL_OFFSET_TABLE_  # EAX points to the GOT
        mov     eax, 305441742                  # overwrite with the return value (0x1234abce)
        pop     ebp
        ret

__x86.get_pc_thunk.ax:          # this is the helper function gcc calls
        mov     eax, DWORD PTR [esp]
        ret

The "thunk" returns its return address, i.e. the address of the instruction after the call. The .ax name means to return in EAX. Modern GCC can choose any register; traditionally the 32-bit PIC base register was always EBX, but modern GCC chooses a call-clobbered register when that avoids an extra save/restore of EBX.
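
In C terms, the thunk behaves roughly like the sketch below, which uses the GCC/Clang builtin __builtin_return_address. The name get_pc_sketch is made up for illustration; real GCC output uses the assembly helper shown above, not this C function.

```c
/* Hypothetical C analogue of __x86.get_pc_thunk.ax. */
__attribute__((noinline))
void *get_pc_sketch(void)
{
    /* Level 0 is the return address of the current function, i.e. the
     * address of the instruction following the call to get_pc_sketch. */
    return __builtin_return_address(0);
}
```

The noinline attribute matters here: if the call were inlined, there would be no separate call instruction whose return address could be read back.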

Fun fact: call +0; pop eax would be more efficient, and only 1 byte larger at each call site. You might think that would unbalance the return-address predictor stack, but in fact call +0 is special-cased on most CPUs to not do that. http://blog.stuffedcow.net/2018/04/ras-microbenchmarks/#call0. (call +0 means the rel32 = 0, so it calls the next instruction. That's not how NASM would interpret that syntax, though.)

clang doesn't generate a GOT pointer unless it needs one, even at -O0. But it does so with call +0;pop %eax: https://godbolt.org/z/GFY9Ht

Peter Cordes
  • @SuraajKS: you can let everyone know that by clicking the accept checkbox under the vote arrows. – Peter Cordes Nov 27 '19 at 12:49
  • PeterCordes, I was looking back at this answer. Consider a simple 16-bit assembly snippet like this: `bits 16; call +0; pop ax; jmp $`. I emulate it using qemu and the program crashes. Maybe instead of `call +0` you meant `call nextLabel; nextLabel:`? – Suraaj K S Jan 19 '20 at 15:58
  • @SuraajKS: `call +0` is intended to indicate a `rel32`=0 in the actual machine-code encoding. But if you assemble it with NASM `+0` is absolute address 0. `call $+5` would be one way to write it for NASM in 32/64-bit mode, or with labels like you showed. But that doesn't highlight the fact that the actual adjustment to EIP/RIP is 0. So once you understand what `call +0` means, it's very nice notation. Maybe I need to add some explanation to the answer if that wasn't obvious to everyone else, though. – Peter Cordes Jan 19 '20 at 20:40
  • Aha... Understood @PeterCordes – Suraaj K S Jan 21 '20 at 17:55

By default, your compiler creates a position-independent executable.

You can force your compiler to build a non-pie executable by passing the option -fno-pie.

Maxime B.