10

I understand there are three, not two, areas of memory in C++: the stack, the heap, AND the area for statically assigned "features". I have two questions:

  1. Why is the heap so much slower than the stack? Surely it should just be one extra level of indirection?

  2. Does the area of memory allocated for static "features" (variables, functions, classes) provide faster performance than the heap?

tshepang
user997112

6 Answers

14

A couple of side notes first. The correct terminology is automatic rather than stack, and dynamic rather than heap. The other side note is that with C++11 there are now four rather than three types of memory: C++11 adds thread-local memory to the mix.

Automatic memory is fast because it is implemented using the call stack on most machines. All it takes is adjusting the stack pointer by the right amount and, voilà, memory is allocated. Dynamic memory requires a whole lot more work under the hood. The requisite memory might not be attached to the process, and getting that to happen requires going through the OS. Even if memory is available, the dynamic memory management machinery still has to find it and mark it as in use.
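
To make the contrast concrete, here is a minimal sketch (the function name is my own) of what each kind of allocation asks of the machine:

    #include <memory>

    void automatic_vs_dynamic()
    {
        int a[64];                             // automatic: part of this function's stack frame,
                                               // reserved by adjusting the stack pointer on entry
        std::unique_ptr<int[]> p(new int[64]); // dynamic: the allocator must find (or request
                                               // from the OS) a suitable block at run time
        a[0] = 1;
        p[0] = 1;
    }   // 'a' vanishes with the stack frame; the unique_ptr releases the heap block here

    int main()
    {
        automatic_vs_dynamic();
    }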

Static memory is "allocated" as a part of the compilation and linking process. When you define a static variable in some source file, the compiled code contains special instructions for the linker to reserve space for that variable. The compiler also converts your C/C++ code to machine code. The linker combines all of those different chunks of data and code and resolves addresses to form the executable binary image. When you run your program, that binary image is loaded into (virtual) memory. The memory for that static variable exists as soon as the program starts executing.
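
As an illustration (the file and variable names are hypothetical), both variables below have static storage duration and get their space reserved in the binary image rather than at run time:

    // counters.cpp -- hypothetical file name
    int g_hits = 0;                // static storage duration: space reserved in the
                                   // binary image by the compiler/linker; exists for
                                   // the whole run of the program

    int next_ticket()
    {
        static int ticket = 0;     // also static storage duration, even though it is
                                   // defined inside a function; initialized once
        return ++ticket;
    }

    int main()
    {
        g_hits = next_ticket();    // no run-time allocation happens for g_hits or ticket
        return g_hits;
    }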

As far as performance is concerned, it's best not to worry too much about performance ahead of time. While static memory is fast, there are lots and lots of drawbacks. The last thing you want to do is to make all of your data static.

David Hammen
  • Though the final paragraph may be confusing. Compilation and linking certainly does not allocate memory on the target system. I appreciate that you used quotations to indicate that it's not real _allocation_ per se, but might it be clearer to talk about how the static memory is part of the process memory, along with executable image, that is loaded when the program itself is read into memory on the target system? – Lightness Races in Orbit Jan 05 '13 at 14:17
  • @David Hammen- regarding the final point, do you have a view on the resultant performance? (also thank you for the terminology advice). – user997112 Jan 05 '13 at 14:31
  • the usual terminology is "stack" and "heap". in the same way that some people try to avoid using those words, one could try to avoid such misunderstandings about the word "object", say, but it would be just as futile: those who are unable to understand that a word can mean different things in different contexts, will just encounter other difficulties -- it merely shifts the problem by a tiny amount, but creating a new larger problem for most everybody ELSE. – Cheers and hth. - Alf Jan 05 '13 at 14:36
  • re "Automatic memory is fast because it is implemented using the call stack in most operating systems." automatic memory is by definition part of the call stack (e.g. the term *stack unwinding* refers to this stack, regardless of implementation). i think you meant the machine's hardware-supported call stack. – Cheers and hth. - Alf Jan 05 '13 at 14:44
  • @LightnessRacesinOrbit - I updated my answer to reflect your comment. Thanks. – David Hammen Jan 05 '13 at 14:54
3

1) Why is the heap so much slower than the stack? Surely it should just be one extra level of indirection?

Because allocating memory on the stack means increasing¹ the stack pointer sp by N, where N is the number of bytes. Changing sp in that way is fast. With the heap you have to do a lot more: finding a free slot or requesting memory from the OS, for example, are expensive operations.

1. or decreasing, depending on which way the stack grows!
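
A rough sketch of where the cost lies (the function names are mine, and a real compiler may optimize much of this away):

    #include <cstdlib>

    int use_stack()
    {
        int buf[256];              // the whole frame, buf included, is "allocated" by
                                   // one adjustment of the stack pointer on entry
        buf[0] = 42;
        return buf[0];
    }

    int use_heap()
    {
        // the allocator has to search its free lists, and possibly ask the OS for
        // more pages, before it can hand back a pointer
        int* buf = static_cast<int*>(std::malloc(256 * sizeof(int)));
        if (!buf) return -1;
        buf[0] = 42;
        int r = buf[0];
        std::free(buf);            // deallocation has bookkeeping costs as well
        return r;
    }

    int main()
    {
        return use_stack() == use_heap() ? 0 : 1;
    }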

2) Does the area of memory allocated for static "features" (variables, functions, classes) provide faster performance than the heap?

If you mean static variables, then yes, they're faster than the heap, as they're statically allocated and they exist until the end of the program. However, there is no such thing as a static class, and memory allocated for functions cannot be compared in this context (that part of the question doesn't make complete sense to me).

Nawaz
3

When comparing the speed of stack, heap or static allocation areas, there are two different speeds to compare.

First, there is the speed of access. This is comparable for each of the three areas, although local (stack allocated) variables can have a slight edge because they are easier for the compiler to cache in a CPU register. Otherwise, it is essentially just memory access.

Secondly, there is the speed of allocation. This is where the big differences arise.
Statically allocated objects get their memory reserved at program startup (or library load time when they reside in a dynamically loaded library), so their allocation time is immeasurably short as far as the program is concerned.
Stack allocated objects are also cheap to allocate, because they follow a predictable allocation pattern (last allocated == first deallocated) that the compiler can easily take into account. In addition, if an application has multiple threads, it will also have multiple stacks (each stack reserved for one thread), so the threads don't have to compete with each other for access to the stack memory.
Heap allocated objects are the difficult ones, because the heap is used for everything that does not nicely fit within the earlier groups. Also, there is typically only one heap for the entire application, so thread synchronisation is needed to allocate memory from the heap. And, because the allocation/deallocation patterns are fairly random, measures have to be taken to ensure not too much heap memory gets 'lost' due to fragmentation of the heap. This all takes some time that gets paid when doing an allocation or deallocation.
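
A small sketch of that last point (the function name is mine), assuming a conventional allocator; many modern allocators reduce the contention with per-thread caches. Each thread gets its own stack, but both threads draw from the same heap:

    #include <thread>
    #include <vector>

    void worker()
    {
        int scratch[128];              // on this thread's own stack: no other thread
        scratch[0] = 0;                // touches it, so no locking is needed

        std::vector<int> v;            // every growth of v allocates from the shared,
        for (int i = 0; i < 1000; ++i) // process-wide heap, so the allocator must
            v.push_back(i);            // coordinate between the two threads internally
    }

    int main()
    {
        std::thread t1(worker);
        std::thread t2(worker);
        t1.join();
        t2.join();
    }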

Bart van Ingen Schenau
2

Although the "heap is slower" statement is too broad to be meaningful, certain aspects associated with heap allocation are definitely slower than their stack counterparts. Unlike the stack, which cannot get fragmented because you cannot remove things from its middle, the heap is subject to fragmentation. Your program can de-allocate objects in arbitrary order, creating "holes" in the heap. When you request more memory, the allocator must search for a suitable "hole" to fulfill your request.
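
A toy illustration of how holes appear (real allocators are far more sophisticated, so take this only as a sketch of the search problem):

    #include <cstdlib>

    int main()
    {
        // allocate three blocks, then free the middle one: the heap now contains a
        // "hole" that the allocator must track and try to reuse later
        void* a = std::malloc(64);
        void* b = std::malloc(64);
        void* c = std::malloc(64);

        std::free(b);                  // hole between a and c

        void* d = std::malloc(48);     // may land in that hole, leaving an even
                                       // smaller fragment that is harder to reuse
        std::free(a);
        std::free(c);
        std::free(d);
    }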

In addition, stack-based operations are highly optimized by the hardware. Finally, objects allocated in automatic storage (the official name for what is commonly called "the stack") have a higher probability of being cached, due to their proximity to other locals that your program accesses.

As far as static storage goes, it applies only to data (variables and data members): static functions and static member functions simply re-use the keyword without re-using its meaning. Objects in static storage are allocated only once, so there is no cause for fragmentation.

Once the allocation is done, the speed of access to the objects in the three kinds of storage (static, dynamic, and automatic) goes at pretty much the same speed.

Sergey Kalinichenko
  • I don't see how performance is subjective (as in, varying by person). It differs with different implementations, sure, but for a given implementation, it can be measured objectively, and for similar implementations, reasoning and formal models can give objective indicators of performance. –  Jan 05 '13 at 14:14
  • @delnan When someone says "heap is so much slower" he or she means something in particular, for example, the allocation speed, deallocation speed, access speed, and so on. You can measure and compare speeds of each individual aspect. However, one cannot say that "heap is slower than the stack" implying that a program that employs a heap would be measurably slower than a similar program that employs the stack. That is what I meant by the subjectivity of the "heap is slower" statement. – Sergey Kalinichenko Jan 05 '13 at 14:18
  • Ah, you have a point there. That doesn't fall under subjective for me though, that falls under "too broad a statement to be meaningful". –  Jan 05 '13 at 14:20
  • @dasblinkenlight So with the heap, if your data gets fragmented this could increase the chances of page faults, cache misses and also TLB misses? – user997112 Jan 05 '13 at 14:33
  • @user997112 Fragmentation slows down the allocation, not the access. It generally takes longer to find a suitable chunk of memory when there's lots of de-allocated chunks to choose from. Once allocated, the memory behaves as usual: you would not necessarily get more cache misses (although the probability of getting a "free ride" due to an unrelated nearby access would definitely go down). – Sergey Kalinichenko Jan 05 '13 at 16:38
1

1) It isn't a matter of indirection, it's a matter of memory management. Allocating memory on the stack is just a matter of moving the stack pointer up, while allocating on the heap involves looking for a memory segment of the right size and taking fragmentation into account.

2) Static memory gets allocated before the program reaches its main entry point, so there is no real runtime overhead. The speed of access to already-allocated memory doesn't really depend on where the memory was allocated.
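
A minimal sketch of the second point (the names are mine): the code that reads the memory is identical no matter which storage the pointer refers to, so, caching effects aside, any difference comes from allocation rather than access.

    int sum3(const int* p)             // the access code is the same regardless of
    {                                  // where p happens to point
        return p[0] + p[1] + p[2];
    }

    int main()
    {
        static int s[3] = {1, 2, 3};   // static storage
        int a[3] = {4, 5, 6};          // automatic storage
        int* h = new int[3]{7, 8, 9};  // dynamic storage

        int total = sum3(s) + sum3(a) + sum3(h);
        delete[] h;
        return total == 45 ? 0 : 1;
    }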

StoryTeller - Unslander Monica
-1

One reason why the stack is so much faster than the heap is the principle of locality. Data stored on the stack is located at consecutive addresses, which means that if one variable is referenced, its neighboring data is automatically brought into the cache. Data stored on the heap (except for dynamic arrays) does not have this advantage.
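
A quick way to see the layout (the output is platform- and compiler-dependent, and the compiler may keep locals in registers entirely):

    #include <cstdio>

    int main()
    {
        int x = 1, y = 2, z = 3;       // locals typically sit next to each other
                                       // in the same stack frame
        int* p = new int(4);
        int* q = new int(5);           // separate heap blocks: they may or may not
                                       // end up near each other

        std::printf("stack: %p %p %p\n", (void*)&x, (void*)&y, (void*)&z);
        std::printf("heap:  %p %p\n", (void*)p, (void*)q);

        delete p;
        delete q;
    }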

Lightness Races in Orbit
Arani
  • Heap is not faster it is slower! – Alok Save Jan 05 '13 at 14:08
  • @Alok: Read the rest of the answer; it was obviously a typo. – Lightness Races in Orbit Jan 05 '13 at 14:13
  • @LightnessRacesinOrbit: Yes, it is and so I added a comment so OP can correct it. I suspected there was an impending edit queued by the OP. The downvote is not mine! – Alok Save Jan 05 '13 at 14:13
  • @Alok: Aha okay then - my misassumption – Lightness Races in Orbit Jan 05 '13 at 14:15
  • @LightnessRacesinOrbit It was me. I have an objection to the reasoning that it is locality of reference that is responsible for making the stack faster. Even if you consider it for a second, the reasoning won't hold, as we don't have 100-byte caches or something. If the cache was so small, probably then a simpler layout would cause a problem, as children of a node at a level can be at a far away location; the problem will amplify at lower levels. So I will wait for the corrections and undo the down vote. – axiom Jan 05 '13 at 14:18
  • @axiom: what's the relevance of a "100 byte cache"? – Oliver Charlesworth Jan 05 '13 at 14:19
  • -1 The rest is also nonsense. The stack is also in RAM and the cache doesn't care whether some memory region is part of "the stack" or of "the heap" or anything else. There's a *possible* difference: That the stack is virtually guaranteed to be in cache most of the time because it's so heavily used, whereas you have more potential to foul up cache behavior with dynamically allocated memory (e.g., most linked lists). But anyone familiar with low-level optimization will tell you that it's perfectly possible and viable to lay out heap-allocated data so that it works well with the cache. –  Jan 05 '13 at 14:20
  • @delnan I have never mentioned that the heap is not in RAM. All I have said is that accessing data on the stack is likely to bring other useful data into the cache, which is unlikely in a heap. Where's the objection to this point? – Arani Jan 05 '13 at 15:39
  • @Arani More or less, but even with the "(except for dynamic arrays)" bit the whole answer reads a lot like "stack is inherently faster than heap due to cache", which is just nonsense. There's a plethora of data structures specifically optimized for cache utilization, with a whole branch of research dedicated to those data structures (check out cache-oblivious data structures), and whole design paradigms partially aimed at getting the most out of the cache (check out data-oriented design). And no, none of these are limited to plain old arrays, though they're obviously a useful component. –  Jan 05 '13 at 18:35