I'm trying to build a toy language compiler that generates assembly for NASM. So far so good, but I got really stuck on dynamic memory allocation. It's the only part of the assembly side that's stopping me from starting my implementation. The goal is just to learn how things work at the low level.

Is there a good and comprehensive guide/tutorial/book about how to dynamically allocate, use and free memory using Assembly (preferably x64/Linux)? I have found some tips here and there mentioning brk, sbrk and mmap, but I don't know how to use them and I feel that there is more to it than just checking the arguments and the return value of these syscalls. How do they work exactly?

For example, in this post, it is mentioned that sbrk moves the border of the data segment. Can I know where the border is initially/after calling sbrk? Can I just use my initial data segment for the first dynamic allocations (and how)?

This other post explains how free works in C, but it does not explain how C actually gives the memory back to the OS. I have also started to read some books on assembly, but somehow they seem to ignore this topic (perhaps because it's OS specific).

Are there some working assembly code examples? I really couldn't find enough information.

I know one way is to use glibc's malloc, but I wanted to know how it is done from assembly. How do compiled languages or even LLVM do it? Do they just use C's malloc?

Peter Cordes
Xito Dev
  • You can use all the C runtime, including `malloc`. The syscalls can be a bit more complex but not too much. These are *functions* at their base. The way they are implemented should not concern you :) If you know how to work with pointers, you are ready to go. – Margaret Bloom May 19 '20 at 09:31
  • Using `malloc` is generally a good choice. The other options (`mmap`, `sbrk`) come with restrictions and complications you generally want to avoid. – fuz May 19 '20 at 10:48
  • Look at C compiler output to see how they compile code that calls `malloc` and `free`. [How to remove "noise" from GCC/clang assembly output?](https://stackoverflow.com/q/38552116) shows how to usefully look at asm output. If your language does its own memory management, that's harder to see simple examples of; even C++ `std::vector` constructor / destructors can compile to a bunch of asm that's hard to wade through, and a language like Java or Go with garbage collection of objects is much more complex than just calling `free` at certain points. – Peter Cordes May 19 '20 at 18:09

1 Answer

malloc is an interface provided to userspace programs. It has different implementations, such as ptmalloc, tcmalloc and jemalloc. Depending on your environment, you can choose a different allocator or even implement your own. As far as I know, jemalloc manages memory for userspace programs by using mmap to obtain blocks of memory on demand, and jemalloc itself decides when a block is released back to the kernel. (I know jemalloc is used in Android.) jemalloc can also use sbrk, depending on the state of system memory. For more detail, I think you have to read the source code of the allocators you want to learn about.
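To make the "mmap a block, hand out pieces, give it back later" idea concrete, here is a minimal sketch in C of a bump allocator on top of one anonymous mapping. It is not how jemalloc or any real allocator is written. The names `Arena`, `arena_init`, `arena_alloc` and `arena_release` are hypothetical, and the 16-byte alignment is an assumption chosen for illustration. The `mmap` and `munmap` calls are glibc's thin wrappers around the Linux syscalls; the comments note how the same calls would look from x86-64 assembly.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical arena: one mmap'd block plus a bump offset. */
typedef struct {
    uint8_t *base;  /* start of the mapped block */
    size_t   size;  /* total bytes requested from the kernel */
    size_t   used;  /* bytes handed out so far */
} Arena;

/* Ask the kernel for `size` bytes of zero-filled, private, anonymous memory.
 * From x86-64 Linux assembly the same request is:
 *   rax = 9 (mmap), rdi = addr (0), rsi = length, rdx = prot,
 *   r10 = flags, r8 = fd (-1), r9 = offset (0), then `syscall`. */
int arena_init(Arena *a, size_t size) {
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    a->base = p;
    a->size = size;
    a->used = 0;
    return 0;
}

/* Hand out the next `n` bytes, rounded up to 16 (an assumed alignment).
 * There is no per-object free: memory goes back only when the arena does. */
void *arena_alloc(Arena *a, size_t n) {
    assert(a->used <= a->size);
    size_t aligned = (n + 15) & ~(size_t)15;
    if (aligned < n || aligned > a->size - a->used)
        return NULL;  /* overflow, or not enough room left */
    void *p = a->base + a->used;
    a->used += aligned;
    return p;
}

/* Return the whole block to the kernel.
 * From assembly: rax = 11 (munmap), rdi = addr, rsi = length, then `syscall`. */
int arena_release(Arena *a) {
    int rc = munmap(a->base, a->size);
    a->base = NULL;
    a->size = 0;
    a->used = 0;
    return rc;
}
```

A toy compiler's runtime could call something like `arena_init` once at startup, serve every allocation from `arena_alloc`, and call `arena_release` at exit. Freeing individual objects needs extra bookkeeping (free lists, size headers), which is the part real allocators spend most of their code on.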

persuez