Shared Libraries and Executable linking with static C run time on Linux. Does each of them have separate heap like Windows?

Question

I understand Windows heap allocation, the stack of heaps, and so on. Being new to Linux, though, I am not clear on how it works there.

On Windows:

At the beginning of a process, the OS creates a default heap called Process heap. The Process heap is used for allocating blocks if no other heap is used.
Language run times also can create separate heaps within a process. (For example, C run time creates a heap of its own.)
Besides these dedicated heaps, the application program or one of the many loaded dynamic-link libraries (DLLs) may create and use separate heaps, called private heaps.
These heaps sit on top of the operating system's Virtual Memory Manager, as in all virtual memory systems.
a) C/C++ Run-time (CRT) allocator: provides malloc() and free() as well as the new and delete operators.
b) The CRT creates an extra heap for all its allocations as part of its initialization (the handle of this CRT heap is stored internally in the CRT library in a global variable called _crtheap).
c) The CRT's private heap resides on top of the Windows heap.
d) The Windows heap is a thin layer surrounding the Windows run-time allocator (NTDLL).
e) The Windows run-time allocator interacts with the Virtual Memory Allocator, which reserves and commits pages used by the OS.

Our DLLs and exe link to the multithreaded static CRT libraries. Each DLL and exe we create has its own heap, i.e. its own _crtheap. Allocations and de-allocations have to happen on the respective heap: memory dynamically allocated in a DLL cannot be de-allocated from the executable, and vice versa.

Compiling the code in our DLLs and exes with /MD or /MDd, to use the multithread-specific and DLL-specific version of the run-time library, links both DLL and exe to the same C run-time library and hence to one _crtheap. Allocations are always paired with de-allocations within a single module.

Is this the same behavior on Linux? Which heaps are there? What about CRT heaps?

score 1 · Answer 1 · edited May 23 '17 at 12:05

I don't really know how to answer this, but I'm going to try to give enough general information that I'll hit on what you need to know.

As a unix person, what I learned about Windows memory management from your question is that you guys use the word "heap" a lot. We don't, except as an informal synonym for "memory area managed by malloc".

There are 2 major dynamic memory allocation primitives to be aware of: brk and mmap. All the other allocation functions including malloc are built on top of those.

brk is the old one. It works by simply adding more memory to the process's virtual memory map after the end of the bss segment. You pass brk a value, and that becomes the process's "break" address - the end of the allocated virtual memory.

malloc can be built on top of brk by calling it with a new, higher value every time more memory is needed, and maintaining some internal data structure that tracks what's been freed and what's still in use. (Giving memory back to the system on free was not done in the classic implementation.) The brk segment managed this way picked up the nickname "the heap". As far as I know the name has nothing to do with the heap data structure; it just means a pool of memory to allocate from.

See also: What does brk( ) system call do? (it has pictures!)

It's really unusual for a program to call brk directly (or even the thin wrapper sbrk). Every use of brk in a normal program is via malloc. Remember, though you view the C library, including malloc, as some kind of optional extra, we have an OS that is tightly coupled with C, so a program that doesn't use libc for the low-level stuff like memory management is very weird indeed. So most of the time, all of the memory in the brk segment a.k.a. "the heap" is being managed by malloc. However, the reverse isn't true because...

mmap, newer than brk (mmap dates from the 80's; brk is from the 70's), offers a lot of options. When you want to map a file into memory, or allocate several discontiguous blocks of memory instead of just adding some space at the end of the original data block, you use mmap. The shared library loader uses mmap to map each library's text and data. Modern malloc implementations such as glibc's typically use mmap for large requests and brk for small ones. Linux also has mremap, which can resize a mapping and, if necessary, move it to a new virtual address while keeping the same physical pages, allowing realloc to avoid an expensive copy.

If you look at /proc/$PID/maps on Linux, you'll see a memory region labeled [heap]. That's the brk segment. There's only one of these per process. (I've seen some examples where the maps file showed 2 of them, but they were contiguous and had identical attributes, so really equivalent to a single region. I don't know what causes the double listing.)

With all this background in mind, what would it mean to allocate an "extra heap"? You could request some memory from the system with mmap, giving you a region that is independent from malloc. You could then do your own malloc-like management of that region, handing it out in chunks of various sizes and keeping track of which parts are unused. But your new allocator won't be "the heap". Which doesn't actually mean anything, because the system doesn't know anything about heaps.

Thanks wumbley for the insight. What happens when a shared library that is statically linked with libc is loaded by an executable that is also statically linked with libc? Will there then be two heaps? If so, will memory allocated from one and deallocated from the other be a problem? — Abhishek Jain, Apr 06 '14 at 04:28
I don't know... static linking is not always well supported, and I think if you've gone so far that you end up with 2 copies of the same library linked into your running code, you have made a serious strategic error. — , Apr 06 '14 at 22:13
