
I'm trying to allocate memory on the heap without libc, using raw Linux system calls. I've tried mmap and brk, but brk doesn't return the end of the heap like I've read it does on most systems, sbrk won't work because it doesn't exist as a syscall, and mmap just causes a segfault.

_start.c

#define PROT_READ 0x1
#define PROT_WRITE 0x2
#define MAP_PRIVATE 0x2
#define MAP_ANONYMOUS 0x20

extern void *mmap(void *addr, unsigned long sz, int prot, int mode, int fd, unsigned long offset);
extern void  exit(int exit_code);

int _start()
{
    void *mem = mmap(0, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    *(int*)mem = 4;

    exit(*(int*)mem);
}

The reason I am trying to do this is because I am working on a replacement libc (obviously not a competent one if I don't know how to do this, it's mainly a learning exercise/fun project) and I need to figure out how to actually allocate on the heap. I've looked for a while but I still have no clue how it works.

syscalls.s

    .text
    .global mmap
mmap:
    mov $9, %rax
    syscall
    ret

    .global exit
exit:
    mov $60, %rax
    syscall
    ret

The compile command I'm using is gcc -nostdlib _start.c syscalls.s.

Like I said, I am running Linux. Specifically: Ubuntu 20.04 LTS with kernel 5.11.0-43-generic.

Peter Cordes
  • That's a cool project. Have you looked inside glibc itself? –  Jan 05 '22 at 04:51
  • BTW, `exit()` and `_exit()` in modern Linux libc use the `exit_group` system call, `mov $__NR_exit_group, %eax`. [Syscall implementation of exit()](https://stackoverflow.com/q/46903180). `_exit()` used to use `__NR_exit`, but these days only `pthread_exit()` does that. – Peter Cordes Jan 05 '22 at 10:12
  • Also, even if you don't use libc headers, you can still `#include` headers from the kernel, like `asm/unistd.h` to define `__NR_exit_group` and so on. (Yes, you can do this in `.S` files; GCC will run them through the CPP for you, and those asm/ headers don't contain any C declarations, *just* CPP macros.) – Peter Cordes Jan 05 '22 at 10:31

1 Answer


Well, this is a great opportunity to use strace and a debugger. From man 2 syscall:

   Arch/ABI      arg1  arg2  arg3  arg4  arg5  arg6  arg7  Notes
   ──────────────────────────────────────────────────────────────
   x86-64        rdi   rsi   rdx   r10   r8    r9    -

gdb a.out:

(gdb) b syscalls.s:5
Breakpoint 1 at 0x1050: file syscalls.s, line 5.
(gdb) r
Starting program: /dev/shm/.1000.home.tmp.dir/a.out 

Breakpoint 1, mmap () at syscalls.s:5
5           syscall
(gdb) info registers
rax            0x9                 9
rbx            0x0                 0
rcx            0x22                34                # here it is
rdx            0x3                 3
rsi            0x1000              4096
rdi            0x0                 0
rbp            0x7fffffffd968      0x7fffffffd968
rsp            0x7fffffffd950      0x7fffffffd950
r8             0xffffffffffffffff  -1
r9             0x0                 0
r10            0x555555554000      93824992231424     # WRONG!
r11            0x206               518
r12            0x555555555000      93824992235520
r13            0x7fffffffd970      140737488345456
r14            0x0                 0
r15            0x0                 0
rip            0x555555555050      0x555555555050 <mmap+7>
eflags         0x202               [ IF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0

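We see that r10 holds garbage, while 0x22 (MAP_PRIVATE | MAP_ANONYMOUS) is in rcx. The C calling convention passes the fourth integer argument in rcx, but the kernel expects the fourth syscall argument in r10, because the syscall instruction itself overwrites rcx and r11. Consult https://uclibc.org/docs/psABI-x86_64.pdf .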

You have to do some rotating of registers, as in https://github.com/numactl/numactl/blob/master/syscall.c#L160 . Here, just adding mov %rcx, %r10 before the syscall instruction in mmap is enough.
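If you would rather keep the wrapper in C, the same fix can be sketched with GCC/Clang extended asm. This is only an illustration, assuming x86-64 Linux; `my_mmap` is a hypothetical name (not from the question), and 9 is the mmap syscall number already used in syscalls.s. The point is that the fourth argument is placed in r10 explicitly, and rcx and r11 are listed as clobbered.

```c
/* Sketch only: assumes x86-64 Linux and GCC/Clang extended asm.
 * my_mmap is a hypothetical name chosen to avoid clashing with libc. */
static void *my_mmap(void *addr, unsigned long len, long prot,
                     long flags, long fd, long off)
{
    long ret;
    /* Syscall arguments 4..6 go in r10, r8, r9 (not rcx, r8, r9). */
    register long r10 __asm__("r10") = flags;
    register long r8  __asm__("r8")  = fd;
    register long r9  __asm__("r9")  = off;

    __asm__ volatile("syscall"
                     : "=a"(ret)                     /* result comes back in rax */
                     : "a"(9L),                      /* rax = mmap syscall number */
                       "D"(addr), "S"(len), "d"(prot), /* rdi, rsi, rdx */
                       "r"(r10), "r"(r8), "r"(r9)
                     : "rcx", "r11", "memory");      /* syscall overwrites rcx, r11 */
    return (void *)ret;
}
```

Calling it with the question's arguments, `my_mmap(0, 4096, 0x1 | 0x2, 0x2 | 0x20, -1, 0)`, should return a writable page rather than an error value.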

Overall, model your wrapper on https://github.com/lattera/glibc/blob/master/sysdeps/unix/sysv/linux/x86_64/syscall.S#L29 .
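A generic wrapper in the same spirit could look roughly like the sketch below. It assumes x86-64 Linux and GCC/Clang extended asm; `raw_syscall6` and `raw_is_error` are hypothetical names, not taken from glibc. It relies on the Linux convention that a raw syscall reports failure by returning a value between -4095 and -1 (the negated errno), which is also why a failed mmap must be detected before dereferencing the result.

```c
/* Sketch only: assumes x86-64 Linux and GCC/Clang extended asm.
 * raw_syscall6 and raw_is_error are hypothetical helper names. */
static long raw_syscall6(long nr, long a1, long a2, long a3,
                         long a4, long a5, long a6)
{
    long ret;
    register long r10 __asm__("r10") = a4;  /* arg4 lives in r10, not rcx */
    register long r8  __asm__("r8")  = a5;
    register long r9  __asm__("r9")  = a6;

    __asm__ volatile("syscall"
                     : "=a"(ret)
                     : "a"(nr), "D"(a1), "S"(a2), "d"(a3),
                       "r"(r10), "r"(r8), "r"(r9)
                     : "rcx", "r11", "memory");
    return ret;
}

/* A raw return value in [-4095, -1] is a negated errno code. */
static int raw_is_error(long ret)
{
    return (unsigned long)ret > (unsigned long)-4096L;
}
```

With this, mmap becomes `raw_syscall6(9, 0, 4096, 0x1 | 0x2, 0x2 | 0x20, -1, 0)`, and the caller can check `raw_is_error()` on the result before writing through the pointer.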

KamilCuk