20

As an exercise to learn more precisely how C programs work, and what minimum content a program needs in order to use libc, I've taken it upon myself to program primarily in x86 assembly using gas and ld.

As a fun little challenge, I've successfully assembled and linked several programs against different self-made dynamic libraries, but I have not managed to write a program from scratch that calls libc functions without using gcc directly.

I understand the calling conventions of individual C library functions, and I have thoroughly inspected programs compiled by gcc using objdump and readelf, but I haven't worked out what to include in a gas assembly file or which options to pass to ld in order to link against libc successfully. Does anyone have any insight into this?

I'm running Linux, on an x86 machine.

Ciro Santilli OurBigBook.com
Cyro

4 Answers

25

There are at least three things that you need to do to successfully use libc with dynamic linking:

  1. Link /usr/lib/crt1.o, which contains _start, which will be the entry point for the ELF binary;
  2. Link /usr/lib/crti.o (before libc) and /usr/lib/crtn.o (after), which provide some initialisation and finalisation code;
  3. Tell the linker that the binary will use the dynamic linker, /lib/ld-linux.so.2.

For example:

$ cat hello.s
 .text
 .globl main
main:
 push %ebp
 mov %esp, %ebp
 pushl $hw_str
 call puts
 add $4, %esp
 xor %eax, %eax
 leave
 ret

 .data
hw_str:
 .asciz "Hello world!"

$ as -o hello.o hello.s
$ ld -o hello -dynamic-linker /lib/ld-linux.so.2 /usr/lib/crt1.o /usr/lib/crti.o -lc hello.o /usr/lib/crtn.o
$ ./hello
Hello world!
$
Matthew Slattery
  • That's extremely helpful; it clarifies a lot. After applying it to my code, I get two errors: "undefined reference to '__libc_csu_fini'" and "undefined reference to '__libc_csu_init'". I did a symbol dump of all the object files and could not find a definition of those symbols, although crt1.o seems to call them. Is there some other object file that could contain them? – Cyro Aug 27 '10 at 00:17
  • Those come from an unshared portion of the C library; linking with `-lc` should pull in `/usr/lib/libc.so`, which is actually a linker script fragment which references the right file (`/usr/lib/libc_nonshared.a`). Maybe a problem with link order? I'm pretty sure that you want `crt1.o` followed by `crti.o` first, then your objects and libraries, then `crtn.o` right at the end - but maybe `-lc` should come after your objects (just before `crtn.o`), not before. – Matthew Slattery Aug 27 '10 at 00:47
  • I went ahead and simply linked with /usr/lib/libc_nonshared.a right after -lc, and the whole thing worked! Thanks a million! – Cyro Aug 27 '10 at 00:56
  • I came here looking for instructions to do the same thing for elf64 and found that the above instructions work, provided the reference to ld-linux.so.2 is changed to ld-linux-x86-64.so.2. Thanks! – Adrian G Jul 23 '13 at 15:30
  • Are the `crt` files required to call glibc functions if you define `_start` in the assembly program? – Ciro Santilli OurBigBook.com Jun 08 '15 at 12:51
  • @Ciro: no, they aren't. See [this answer for the full details on building static/dynamic executables that use libc from `_start` or `main`](http://stackoverflow.com/questions/36861903/assembling-32-bit-binaries-on-a-64-bit-system-gnu-toolchain/36901649#36901649). You just have to call the correct glibc init functions in the correct order, like the CRT startup code does. Actually, on Linux, that happens automatically with dynamic linking, so you only need it if you statically link libc. Or you can use a libc implementation like MUSL that doesn't need startup functions to be called. – Peter Cordes Aug 23 '16 at 10:11
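
To illustrate the `libc.so` point from the comments above: on glibc systems that file is a short text linker script rather than an ELF library. Here is a minimal sketch that lists the files named on its `GROUP` line. `group_members` is a hypothetical helper name, and the path in the usage comment is the one quoted in the comments, so it may differ on your distribution (multiarch systems keep it in a per-architecture directory).

```shell
# group_members: hypothetical helper. Reads a GNU ld linker script on stdin
# and prints every absolute path that appears on its GROUP line.
group_members() {
    grep '^GROUP' | tr ' ()' '\n\n\n' | grep '^/'
}

# Usage (the path is an assumption; adjust it for your system):
#   group_members < /usr/lib/libc.so
```

If `libc_nonshared.a` shows up in the output, `-lc` should already pull it in, and an undefined `__libc_csu_init` more likely points to a link-order problem.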
7

If you define main in assembly

Matthew's answer does a great job of telling you the minimum requirements.

Let me show you how to find those paths on your system. Run:

gcc -v hello_world.c |& grep 'collect2' | tr ' ' '\n'

and then pick up the files Matthew mentioned.

gcc -v gives you the exact linker command GCC uses.

collect2 is the internal executable GCC uses as a linker front-end, which has a similar interface to ld.

In Ubuntu 14.04 64-bit (GCC 4.8), I ended up with:

ld -dynamic-linker /lib64/ld-linux-x86-64.so.2 \
  /usr/lib/x86_64-linux-gnu/crt1.o \
  /usr/lib/x86_64-linux-gnu/crti.o \
  -lc hello_world.o \
  /usr/lib/x86_64-linux-gnu/crtn.o

You might also need -lgcc and -lgcc_s. See also: Do I really need libgcc?
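
The "pick up the files" step can be sketched as a small filter. `crt_args` is a hypothetical helper name. It assumes the startup objects are named crt1.o (or Scrt1.o, which GCC uses for position-independent executables), crti.o and crtn.o, and that the dynamic linker's file name starts with ld-linux.

```shell
# crt_args: hypothetical helper. Reads a linker command line on stdin and
# prints only the C startup objects and the dynamic linker path, one per line.
crt_args() {
    tr ' ' '\n' | grep -E '(/S?crt[1in]\.o|/ld-linux[^/]*\.so\.[0-9]+)$'
}

# Usage (bash, because |& also pipes stderr, which is where gcc -v writes):
#   gcc -v hello_world.c |& grep collect2 | crt_args
```

GCC can also report a single file directly, for example `gcc -print-file-name=crt1.o`.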

If you define _start in assembly

If I define _start myself, a hello world that calls glibc works with just:

ld -dynamic-linker /lib64/ld-linux-x86-64.so.2 -lc hello_world.o

I'm not sure if this is robust, i.e. if the crt initializations can be safely skipped to invoke glibc functions. See also: Why does an assembly program only work when linked with crt1.o crti.o and crtn.o?

Ciro Santilli OurBigBook.com
1

I think something like this should work:

  1. make a simple C program
  2. gcc -S file.c
  3. edit file.s
  4. as -o file.o file.s
  5. ld -dynamic-linker /lib/ld-linux.so.2 /usr/lib/crt1.o /usr/lib/crti.o file.o -lc /usr/lib/crtn.o -o myprog
Igor Skochinsky
1

If you use _start instead of main (as mentioned in some of the comments above), you'll also need to change the way the program exits. _start is entered without a return address on the stack, so a plain ret will segfault:

            .text
            .globl    _start
_start:     
            mov       $hw_str, %rdi
            call      puts
            movl      $0,%ebx   # first argument: exit code.
            movl      $1,%eax   # system call number: sys_exit.
            int       $0x80     # call kernel.

            .data
hw_str:     .asciz "Hello world!"

On Kubuntu 18.04.2 (gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0):

$ as -o hello.o hello.s
$ ld -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o hello hello.o -lc

Also, one easy way to find out what the dynamic linker is on your system is to compile a small C program and then run ldd on the binary:

test.c:

int main() { return 0; }

Compile and run ldd against executable:

$ gcc -o test test.c
$ ldd test
    linux-vdso.so.1 (0x00007ffd0a182000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ff24d8e6000)
    /lib64/ld-linux-x86-64.so.2 (0x00007ff24ded9000)
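
Another option, sketched below, is to read the interpreter path straight out of the ELF program headers with `readelf -l`, which prints a "Requesting program interpreter" line. `parse_interp` and `interp_of` are hypothetical helper names, and readelf from binutils is assumed to be installed.

```shell
# parse_interp: hypothetical helper. Extracts the interpreter path from
# `readelf -l` output supplied on stdin.
parse_interp() {
    sed -n 's/.*\[Requesting program interpreter: \(.*\)].*/\1/p'
}

# interp_of: hypothetical wrapper around readelf (part of binutils).
interp_of() {
    readelf -l "$1" | parse_interp
}

# Usage:
#   interp_of ./test
```

Unlike ldd, this only inspects the file and prints the one path you need to pass to `-dynamic-linker`.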
Andy Turfer
  • If you're using libc stdio functions, you should usually call `exit` by returning from main or `call exit` from `_start`. But if you do make the system call directly, use the 64-bit ABI. `mov $231, %eax` ; `xor %edi,%edi` / `syscall` = sys_exit_group(edi=0). Some people may be using a kernel built without `CONFIG_IA32_EMULATION` where `int $0x80` won't work. (e.g. Windows Subsystem for Linux, or some Gentoo kernels.) – Peter Cordes Feb 27 '19 at 04:16
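
The suggestion in the comment above can be sketched as follows: `_start` hands control to libc's `exit` instead of making the system call itself, so buffered stdio output is flushed even when stdout is a pipe or a file. This is a sketch that assumes x86-64 Linux with glibc and the dynamic linker path used in the answer. `write_hello` is a hypothetical helper name, and the build step is skipped when the toolchain or loader is not found.

```shell
# write_hello: hypothetical helper. Writes an x86-64 _start that prints a
# string with puts and then leaves through libc's exit.
write_hello() {
    cat > "$1" <<'EOF'
        .text
        .globl  _start
_start:
        lea     hw_str(%rip), %rdi   # first argument of puts
        call    puts
        xor     %edi, %edi           # exit status 0
        call    exit                 # libc exit flushes stdio; does not return

        .data
hw_str: .asciz  "Hello world!"
EOF
}

write_hello hello.s

# Assemble, link and run only where an x86-64 GNU toolchain and the assumed
# loader path exist; otherwise just leave hello.s in place.
if [ "$(uname -m)" = x86_64 ] && [ -e /lib64/ld-linux-x86-64.so.2 ] \
   && command -v as >/dev/null 2>&1 && command -v ld >/dev/null 2>&1; then
    as -o hello.o hello.s &&
    ld -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o hello hello.o -lc &&
    ./hello
fi
```

The stack is 16-byte aligned on entry to `_start`, so the two calls need no extra adjustment. The raw `syscall` sequence from the comment also terminates the process, but it bypasses the stdio flush.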