
Due to optimization reasons I want my vector to exist in the cache, and having it on the stack greatly increases the odds of that. Is it possible to create a vector on the stack? I'm completely willing to re-implement my own variant of std::vector if necessary to make it work out. I'm also comfortable using inline assembly if necessary.

YSC
    If it's on the stack, then how could it resize? It would need a fixed size that is known at compile time. Perhaps you're looking for `std::array`. – Blaze May 03 '19 at 13:07
    About _"having [the vector] on the stack greatly increases the odds of [its content being in the cache]."_ This is speculation. Cache can load both data from the stack and heap. In fact, there is not even a strong correlation between where the memory is from a software point of view (automatic storage versus free store) and where it is physically located (cache, RAM, disk, etc.). – YSC May 03 '19 at 13:11
    The CPU cache does not care whether or not something is on the stack. – Nikos C. May 03 '19 at 13:13
    Just use a vector. It will be loaded into the cache. That's why it is the container you should reach for first. In many cases it will be faster than containers with better algorithmic guarantees, since they all suffer from cache misses where traversing a vector does not. – NathanOliver May 03 '19 at 13:16
  • Have you measured how big the overhead of malloc/new is? In most cases that overhead is not the real problem but the cache friendliness is. So you have to write your algorithms in such a way that the operands properly fit into cache lines or at least into the L1/L2 caches. – user6556709 May 03 '19 at 13:17
  • You can always `reserve` space for a vector that is a multiple of the cache size. The allocator should allocate reasonably aligned memory as a result. Because of the way modern CPU caches work it shouldn't cause any major issues. – Mgetz May 03 '19 at 13:21
  • I don't believe the assertion that being on the stack makes it more likely to be in the cache. I suppose that, since you have other variables on the stack, if they are used first you are less likely to get a cache miss on first access. – Martin York May 03 '19 at 17:12

3 Answers


"Having [the vector] on the stack greatly increases the odds of [its content being in the cache]."

This is speculation. The cache can load data from both the stack and the heap; it does not care where the memory originally comes from. That is the whole purpose of the cache. All praise the Cache. In fact, there is not even a strong correlation between where the memory comes from from a software point of view (automatic storage versus free store) and where it is physically located (cache, RAM, disk, etc.).

Due to optimization reasons I want [...]

You want faster software, I guess. To get that, you need to establish what exactly makes it too slow for your taste. There are tools for that; a profiler is one of them. Once you've taken down all your algorithmic bottlenecks and are left with software that is still too slow, you can be fairly sure you face a data-oriented performance issue. That is when you ask yourself: how can I lay out memory in a predictable way, so that the CPU cache and the branch predictor help me?

YSC
    I cannot emphasize enough how important profiling is; it's literally the only way to be sure of a performance issue. – Mgetz May 03 '19 at 13:27

The CPU cache doesn't know or care whether you use stack or heap memory; it works with raw memory addresses, subdivided into cache lines (e.g. 64 bytes). In addition, the virtual memory subsystem operates on pages (e.g. 4 KB), which can also be a source of slowdown. What matters, then, is staying at or around the same memory locations, e.g. by re-using memory.

Yes, stack memory is often a safe bet when it comes to caching, since the top of the stack is usually "hot", i.e. already cached. But the stack is not meant for storing large or dynamically-sized objects; doing so would move the top out of the hot zone and defeat the purpose.

std::vector can be cache-friendly. Just make sure to reserve memory beforehand; this way you avoid costly reallocations and data movement. And use a cache-friendly memory allocator such as jemalloc (the default on FreeBSD) or ptmalloc (the default in glibc on Linux). And of course: profile, profile, profile.

rustyx

If you want your vector to live on the stack you don't need to re-implement the vector; you "merely" need to supply a custom allocator that hands out stack memory instead of heap memory.

You can look at this to get an idea.

Surt