Where does my allocated memory actually start from when i use brk system call

Question

I am trying to allocate some memory using sys_brk in NASM/x86 assembly. sys_break returns the new address of break, which is the end of the data segment right? So where does my newly allocated memory reside? I assumed that it is in between the old break value and the new break value. So if I allocate 64bytes of memory with sys_brk i can use the next 64 bytes starting from the old break value that i stored before calling sys_brk. Am I right?

My Assembly code that will allocate memory will look somewhat like this.https://gist.github.com/nikAizuddin/f4132721126257ec4345

And another side question is;

I am supposed to write a function in Assembly that returns the pointer to the dynamically allocated memory and that function will be called from a C program. How can i free this block of memory from C side of my program? Would just calling free() be enough?

possible duplicate: http://stackoverflow.com/questions/6988487/what-does-brk-system-call-do. The Linux man page for sbrk / brk is somewhat sparse, hopefully that answer has more info. — Peter Cordes, Oct 01 '15 at 11:38

Peter Cordes · Answer 1 · 2022-10-06T16:01:00.050

The brk(2) man page (section: C library/kernel ABI differences) describes how the glibc wrapper is implemented on top of Linux's system call, which returns the new brk on success, or the old brk on failure.

As I understand it, memory beyond the current break is unmapped. Addresses below the current break are part of the data segment (in the sense of data+bss+heap). The docs aren't clear on whether the break has to be page-aligned. (i.e. can you sbrk(64), or only sbrk(4096)?) If ASLR is enabled, the initial break will be some random distance past the end of the BSS.

See: What does the brk() system call do? An answer on that question has an example of using sbrk to replace malloc for code-golf. So yes, the old break is the address to return. And apparently you can sbrk any increment you want, not just pages.

You're the one writing the memory allocator. sbrk just lets you get more from the OS, like mmap(MAP_ANONYMOUS) but less flexible. It doesn't help you keep track of free blocks so you can use them for future allocations instead of always getting more from the OS.

The way to give back memory you got with sbrk is by calling sbrk with a negative argument. Obviously this requires a last-in-first-out usage pattern, which is why glibc's malloc only uses sbrk for small allocations (that can be put on the free-list when freed, to be handed out for future mallocs). Big allocations are best returned to the OS right away, instead of being kept mapped, so glibc's malloc uses mmap for those.

Never call free(3) on memory you didn't get from malloc(3) (or an associated function, like strdup(3), that says in the docs you can and should free(3) the memeory.) IDK what would happen if you called munmap on a page of memory below the program break. Probably it would just work, but then you'd have a hole in your data segment that could cause problems if the break ever decreased to there.

In assembly, the Linux brk system call takes an address where you want to set the break. As the man page notes, it either returns that for success, or returns the old break on failure, never a -errno code like -ENOMEM.

See Assembly x86 brk() call use for an x86-64 example.

The POSIX API where you can use positive or negative integer offsets is something you can implement by always calling twice, or like glibc keeping track of the current break in a global variable. To init that variable, use brk once with a requested address of 0, which will fail, as shown in the strace output below.

This is similar to what you'd do with the POSIX API, calling sbrk with increment = 0.

This is what glibc's malloc(3) does internally:

$ strace -e brk ls 2>&1 | m
brk(0)                                  = 0x650000
brk(0)                                  = 0x650000
brk(0x671000)                           = 0x671000

The brk man page mentions end(3). Apparently there are globals which are located at the end of the text, data, and bss segments. However, &end is only "somewhere near" the program break, which is why malloc still has to make a system call to get the initial break. IDK why there's a redundant brk(0). These are raw system calls, not library function calls, so an sbrk(0) probably doesn't explain it.

Where does my allocated memory actually start from when i use brk system call

1 Answers1

Linked