-5

I'm a computer science student, and this has been puzzling me. As I understand it, all memory used by a process is tracked by either the stack or the heap. I understand the stack very well, right down to the assembly level; I feel that I know how it works and why it works the way it does. The heap, however, I know much less about.

I understand how allocations and deallocations happen on the stack, so I understand their cost. I do not understand how allocations and deallocations happen on the heap. I know heap allocations are slower than stack allocations because I have been told so, and because it makes sense that the heap must be more complicated, but I do not know exactly why. Why is the heap called the heap? Is it a heap of addresses of free blocks of memory?

Take C. In C, you interact with the heap mainly through malloc (and calloc, etc) and free. I understand the effect of calling these methods, but I have no idea what they actually do internally; the heap is a black box to me, so I don't intuitively understand the cost of my actions.

I understand the possibility that the implementation of malloc (for example) might vary depending on any number of things, but I find it difficult to believe it could vary too wildly. The fundamentals of the heap have to be constant across most implementations, otherwise it wouldn't have such a specific name. Any pointers (ha)?

Edit: This is not a duplicate of an earlier thread. I've read the thread in question and it does not answer my questions.

"The size of the heap is set on application startup, but can grow as space is needed (the allocator requests more memory from the operating system)."

I have said that I know this in my question; I understand how to get memory from the heap, I want to know what the heap itself is doing to get that memory (beyond "it requests it". Does this mean that the heap is tracked entirely by the OS? If so, what's a basic outline of how the OS tracks it? I understand it varies, but the fundamentals have to be the same).

"The stack is faster because the access pattern makes it trivial to allocate and deallocate memory from it (a pointer/integer is simply incremented or decremented), while the heap has much more complex bookkeeping involved in an allocation or deallocation."

My entire question is about what that complex bookkeeping actually is.

DJV
  • 3
  • 1
  • 3
    Possible duplicate of [What and where are the stack and heap?](http://stackoverflow.com/questions/79923/what-and-where-are-the-stack-and-heap), which is the number 1 Google search result on *What is the heap in memory?*. Clearly you put a lot of effort into researching this before posting here. – Ken White May 14 '16 at 01:49
  • You can force yourself learn it by programming in C and making your own `malloc`. – Niklas Rosencrantz May 14 '16 at 01:53
  • 2
    Your edit changes nothing. What *the heap itself is doing* is nothing; the heap is an area of memory. It doesn't move, it's not a living creature, and it contains no code. The memory manager handles the heap. – Ken White May 14 '16 at 02:01
  • @KenWhite Neither the heap nor the stack is just an area of memory; they are both systems of bookkeeping for areas of memory. The stack is a living creature, it consists of the code to move the stack pointer around. The heap also has to be a living creature to find things to return to malloc. How does it accomplish this task? – DJV May 14 '16 at 02:03
  • 1
    No. The stack and heap are both areas of memory that are allocated and managed by a **memory manager** in conjunction with the OS. They take care of the bookkeeping, and take care of allocating *things to return to malloc* - those *things* are blocks of memory. You really need to ask your instructor for additional help, because your understanding of memory isn't as complete as you seem to believe. And absolutely **nothing** in a computer is **living and breathing**. You may need help in your biology classes as well. Considered hiring some tutors? :-) – Ken White May 14 '16 at 02:08
  • 2
    Start with [this](https://en.wikipedia.org/wiki/C_dynamic_memory_allocation) and [this](https://en.wikipedia.org/wiki/Memory_management). Then read about [buddy allocation](https://en.wikipedia.org/wiki/Buddy_memory_allocation) since that article takes a mid-level look at the internals of one memory management scheme. Also note that the bookkeeping for the stack is much simpler than the bookkeeping for the heap, because the stack does not suffer from [fragmentation](https://en.wikipedia.org/wiki/Fragmentation_(computing)). In summary, how the heap works is too broad for a SO question. – user3386109 May 14 '16 at 02:16
  • @KenWhite I think we are in a semantic argument. If you ask me to request 50 bytes of stack memory, I can do it at the assembly level - move the stack pointer up 50 bytes. If you ask me to request 50 bytes from the heap, I couldn't tell you where they come from. It's my understanding that every process has a virtual memory space, some of which is allocated for the stack and some of which is allocated for the heap. The exact layout of these things is down to the OS. I know what any given OS has to do for the stack. What should one do for the heap? – DJV May 14 '16 at 02:16
  • 2
    Another quick Google search (*heap management*) turns up [this 1993 MS article](https://msdn.microsoft.com/en-us/library/ms810603.aspx), complete with Windows NT screen captures. It seems your post is intended to get us to do your basic Google searches for you, no? – Ken White May 14 '16 at 02:29

2 Answers2

1

To start off with, stack and heap are just read/write memory. There is no distinction between the two. A stack is simply a stack because it is managed as a stack. A heap is simply a heap because it is managed as a heap. The memory is allocated from the operating system in the same way.

One difference, however, is that a stack must be contiguous in memory. A heap does not [necessarily][ need to be continuous.

Because of the ordering of allocations and deallocations and the contiguous memory in a stack, allocations are single instruction.

 SUB #NUMBEROFBYTESYOUWANT, SP

Why is the heap called the heap?

There's probably a story on that but that's like asking why a cat is called a cat.

Is it a heap of addresses of free blocks of memory?

No. Memory becomes a heap when it is controlled by a heap manager. The problem with your question is that heaps can be managed in different ways. You find scores of malloc/free implementations on the internet that allocate memory in different ways.

I understand the effect of calling these methods, but I have no idea what they actually do internally; the heap is a black box to me, so I don't intuitively understand the cost of my actions.

The simplest would be a malloc that always returns a block of a fixed size. Let's say 1K bytes. The heap manager just maintains a list of free blocks and malloc just picks one off the list. If you do malloc with something greater than 1024, it fails. If there are no free blocks in the list, the heap manager calls an operating system service to map more memory to the process. The heap manager then puts that block of memory under management. When the application calls free, the block is put back on the list of those availble.

That's a simple example. If you request 5 bytes, you get a block that is 1028 bytes underneath.

A malloc implementation could:

  • Manage blocks that are powers of 2 bytes in size and maintain a separate free block list for each size of block.

  • Manage the entire heap as a pool of memory that can be chopped up on demand.

(And there are hundreds of other ways it could and has been done).

"The size of the heap is set on application startup, but can grow as space is needed (the allocator requests more memory from the operating system)."

The operating system allocates memory in pages (usually around 512 to 4K bytes, depending upon the system). The purpose of malloc (or its equivalent in other languages) is to allocate memory in other sizes.

In regard to other points you made:

That may be true on some systems but the first half is not always correct. The heap can be initialized upon the first call to malloc.

Does this mean that the heap is tracked entirely by the OS?

The operating system knows nothing about the heap. It just knows that it has allocated memory to the process. The process can do whatever it want to with the heap.

My entire question is about what that complex bookkeeping actually is.

Again that depends entirely upon the heap manager that you link to the application. If there is a way to do such "bookkeeping" it has already been done.

user3344003
  • 20,574
  • 3
  • 26
  • 62
-1

The heap is just another abstract data type in RAM that malloc splits into chunks and delivers pointers to. It is not physically different from the stack.

http://compgroups.net/comp.lang.asm.x86/heap/59347

In x86 and most other systems, both the stack and the heap are just memory in RAM and hopefully primarily from cache.

Niklas Rosencrantz
  • 25,640
  • 75
  • 229
  • 424