9

Let's suppose we have a method that creates and uses possibly very big vector<foo>s. The maximum number of elements is known to be maxElems.

To the best of my knowledge, standard practice as of C++11 is:

vector<foo> fooVec;
fooVec.reserve(maxElems);
//... fill fooVec using emplace_back() / push_back()

But what happens if we have a scenario where the number of elements is going to be significantly less in the majority of calls to our method?

Is there any disadvantage to the conservative reserve call other than the excess allocated memory (which supposedly can be freed with `shrink_to_fit()` if necessary)?

mrclng
  • 9
    If you don't know the actual number of elements there, you usually simply don't use `.reserve` at all, and rely on exponential growth of capacity which is a good default. – milleniumbug Jul 26 '17 at 15:47
  • On Linux, RSS memory will not be wasted; only virtual memory will be reserved, due to overcommit. – ks1322 Jul 26 '17 at 15:50
  • 2
    It's not **standard** practice, although it's sometimes appropriate. – Pete Becker Jul 26 '17 at 15:54
  • The only disadvantage is the extra allocation, as you surmise. How that affects your program is well explained in the answers. – Toby Speight Jul 27 '17 at 09:37
  • 1
    If the maximum size is a rare event, using a `reserve(typical_size)` might be a compromise that gives you better performance *on average*. Assuming that this is what you need to improve. – Bo Persson Jul 28 '17 at 09:20

5 Answers

12

Summary

There is likely to be some downside to using a too-large reserve, but how much depends both on the size and context of your reserve() call and on your specific allocator, operating system and their configuration.

As you are probably aware, on platforms like Windows and Linux a large allocation generally does not allocate any physical memory or page-table entries until the memory is first accessed, so you might imagine a large, unused allocation to be "free". Sometimes this is called "reserving" memory without "committing" it, and I'll use those terms here.

Here are some reasons this might not be as free as you'd imagine:

Page Granularity

The lazy commit described above only happens at page granularity. If you are using (typical) 4096-byte pages, it means that if you usually reserve 4,000 bytes for a vector that will usually contain elements taking up 100 bytes, the lazy commit buys you nothing! At least the whole 4096-byte page has to be committed either way, so you don't save any physical memory. So it isn't just the ratio between the expected and reserved sizes that matters: the absolute reserved size determines how much waste you'll see.

Keep in mind that many systems are now using "huge pages" transparently, so in some cases the granularity will be on the order of 2 MB or more. In that case you need allocations on the order of 10s or 100s of MB to really take advantage of the lazy allocation strategy.
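To make the arithmetic concrete, here's a minimal sketch of the page rounding involved; the 4096-byte page size and the 4,000/100-byte figures are just the illustrative numbers from above:

// Sketch: page-granularity arithmetic for an over-sized reserve().
// The page size and byte counts are the illustrative figures from the text.
#include <cstddef>
#include <iostream>

int main() {
    const std::size_t page_size     = 4096; // assumed (typical) page size
    const std::size_t reserved_size = 4000; // bytes requested via reserve()
    const std::size_t typical_size  = 100;  // bytes typically used

    auto pages_needed = [&](std::size_t bytes) {
        return (bytes + page_size - 1) / page_size; // round up to whole pages
    };

    // Both print 1: the whole page gets committed either way, so the
    // lazy-commit behavior saves no physical memory at this scale.
    std::cout << "pages for the reservation: " << pages_needed(reserved_size) << '\n';
    std::cout << "pages for typical usage:   " << pages_needed(typical_size) << '\n';
}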

Worse Allocation Performance

Memory allocators for C++ generally try to allocate large chunks of memory (e.g., via sbrk or mmap on Unix-like platforms) and then efficiently carve them up into the small chunks the application is requesting. Getting these large chunks of memory via, say, a system call like mmap may be several orders of magnitude slower than the fast-path allocation within the allocator, which is often only a dozen instructions or so. When you ask for large chunks that you mostly won't use, you defeat that optimization and will often go down the slow path.

As a concrete example, let's say your allocator asks mmap for chunks of 128 KB which it carves up to satisfy allocations. You are allocating about 2K of stuff in a typical vector, but reserve 64K. You'll now pay an mmap call for every other reserve call, but if you just asked for the 2K you ultimately needed, you'd make about 32 times fewer mmap calls.
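If you want to see whether this matters for your workload, a micro-benchmark along these lines is one way to check. This is only a rough sketch: the 64K / 2K sizes mirror the example above, the iteration count is arbitrary, and the actual difference depends entirely on your allocator:

// Sketch: many short-lived vectors that reserve far more than they use,
// versus vectors that reserve only what they need. Sizes are illustrative.
#include <chrono>
#include <cstddef>
#include <iostream>
#include <vector>

static void run(std::size_t reserve_bytes, std::size_t used_bytes, const char* label) {
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < 10000; ++i) {
        std::vector<char> v;
        v.reserve(reserve_bytes);               // possibly much larger than needed
        for (std::size_t j = 0; j < used_bytes; ++j)
            v.push_back(static_cast<char>(j));  // actual usage
    }
    auto stop = std::chrono::steady_clock::now();
    std::cout << label << ": "
              << std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count()
              << " ms\n";
}

int main() {
    run(64 * 1024, 2 * 1024, "reserve 64K, use 2K");
    run( 2 * 1024, 2 * 1024, "reserve  2K, use 2K");
}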

Dependence on Overcommit Handling

When you ask for a lot of memory and don't use it, you can get into the situation where you've asked for more memory than your system supports (e.g., more than your RAM + swap). Whether this is even allowed depends on your OS and how it is configured, and no matter what, you are in for some interesting behavior if you subsequently commit more memory simply by writing to it: arbitrary processes may be killed, or you might get unexpected errors on any memory write. What works on one system may fail on another due to different overcommit tunables.

Finally, it makes managing your process a bit harder, since the "VM size" metric reported by monitoring tools won't bear much relationship to what your process may ultimately commit.
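On Linux specifically, you can at least check which overcommit policy you are running under before relying on huge, mostly-unused reservations. A minimal, Linux-only sketch (the /proc path is standard, but the whole idea only applies to Linux):

// Sketch (Linux-only): read the kernel's overcommit policy, since it decides
// whether an over-sized reserve() fails up front or blows up later on write.
#include <fstream>
#include <iostream>

int main() {
    std::ifstream f("/proc/sys/vm/overcommit_memory");
    int mode = -1;
    if (f >> mode) {
        // 0 = heuristic overcommit, 1 = always overcommit, 2 = never overcommit
        std::cout << "vm.overcommit_memory = " << mode << '\n';
    } else {
        std::cout << "could not read overcommit setting (non-Linux or restricted /proc)\n";
    }
}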

Worse Locality

Allocating more memory than you need makes it likely that your working set will be more sparsely spread out in the virtual address space. The overall effect is a reduction in locality of reference. For very small allocations (e.g., a few dozen bytes) this may reduce the within-same-cache-line locality, but for larger sizes the main effect is likely to be to spread your data onto more physical pages, increasing TLB pressure. The exact thresholds will depend a lot on details like whether hugepages are enabled.

BeeOnRope
1

What you cite as standard C++11 practice is hardly standard, and probably not even good practice.

These days I'd be inclined to discourage the use of reserve, and let your platform (i.e. the C++ standard library optimised to your platform) deal with the reallocation as it sees fit.

That said, calling reserve with an excessive amount may well be effectively benign, because modern operating systems only give you the memory if you actually use it (Linux is particularly good at that). But relying on this could cause you trouble if you port to a different operating system, whereas simply omitting reserve is less likely to.

Bathsheba
  • 3
    Some answers that have been posted to SO quite recently suggest that e.g. `reserve()` + `emplace_back()` is significantly faster than `push_back()` or `emplace_back()` without preallocation. Examples: https://stackoverflow.com/a/44088863/3564091 https://stackoverflow.com/a/32200517/3564091 – KjMag Jul 26 '17 at 16:05
  • Indeed, `emplace_back()` can obviate value copies. – Bathsheba Jul 26 '17 at 16:06
  • As I said, it is best practice *to my knowledge*... I am referring to things I read, such as @Andrew's answer in [this](https://stackoverflow.com/a/32200517/7416115) question (already posted by @KjMag). If this is incorrect or inaccurate it would be great if you could elaborate why! – mrclng Jul 26 '17 at 16:11
0

You have 2 options:

You don't call reserve and let the vector's default implementation figure out the capacity on its own, which uses exponential growth.

Or

You call reserve(maxElems) and shrink_to_fit() afterwards.


The first option is less likely to give you a std::bad_alloc (even though modern OSes will probably never throw this if you don't touch the last block of the reserved memory).

The second option is less likely to involve multiple reallocations: it will most likely make 2 calls, the reserve and the shrink_to_fit() (which might be a no-op, depending on the implementation, since it's non-binding), while option 1 might make significantly more. Fewer calls = better performance.
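For illustration, the two options could look something like this; maxElems and the int payload are just placeholders for whatever your method actually stores:

// Sketch of the two options described above. The element type and the
// fill loop are placeholders.
#include <cstddef>
#include <vector>

std::vector<int> option1_grow_naturally(std::size_t n) {
    std::vector<int> v;                  // no reserve: rely on exponential growth
    for (std::size_t i = 0; i < n; ++i)
        v.push_back(static_cast<int>(i));
    return v;
}

std::vector<int> option2_reserve_then_shrink(std::size_t n, std::size_t maxElems) {
    std::vector<int> v;
    v.reserve(maxElems);                 // one up-front, possibly excessive allocation
    for (std::size_t i = 0; i < n; ++i)
        v.push_back(static_cast<int>(i));
    v.shrink_to_fit();                   // non-binding request to drop the excess capacity
    return v;
}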

Hatted Rooster
0

But what happens if we have a scenario where the number of elements is going to be significantly less in the majority of calls to our method?

The allocated memory simply remains unused.

Is there a downside to a significant overestimation in a reserve()?

Yes, at least a potential downside: the memory that was allocated for the vector cannot be used for other objects.

This is especially problematic in embedded systems, which usually do not have virtual memory and have little physical memory to spare.

Concerning programs running inside an operating system: if the operating system does not overcommit memory, this can still easily cause the program's virtual memory allocation to reach the limit given to the process.

Even in an overcommitting system, particularly gratuitous overestimation can in theory result in exhaustion of the virtual address space. But you need pretty big numbers to achieve that on 64-bit architectures.


Is there any disadvantage to the conservative reserve call other than the excess allocated memory (which supposedly can be freed with `shrink_to_fit()` if necessary)?

Well, this is slower than initially allocating exactly the right amount of memory, but the difference might be marginal.

eerorika
0

If you are on Linux, reserve will call malloc, which only allocates virtual memory, not physical memory. Physical memory will be used when you actually insert elements into the vector. That's why you can considerably overestimate the reserve size.

If you can estimate the maximum vector size, you can reserve it just once at the start to avoid reallocations, and no physical memory will be wasted.
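One way to see this for yourself on Linux is to compare the process's virtual size (VmSize) with its resident set (VmRSS) around a large reserve. A rough sketch; the 512 MB figure is arbitrary and the /proc interface is Linux-specific:

// Sketch (Linux-only): print VmSize (virtual) and VmRSS (resident) around a
// large reserve() to show that only virtual memory grows until you write to it.
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

static void print_mem(const char* when) {
    std::ifstream status("/proc/self/status");
    std::string line;
    while (std::getline(status, line)) {
        if (line.rfind("VmSize:", 0) == 0 || line.rfind("VmRSS:", 0) == 0)
            std::cout << when << " " << line << '\n';
    }
}

int main() {
    print_mem("before reserve:");
    std::vector<char> v;
    v.reserve(512 * 1024 * 1024);   // 512 MB of capacity, not yet touched
    print_mem("after reserve: ");
    v.assign(v.capacity(), 'x');    // now actually write to the memory
    print_mem("after writing: ");
}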

ks1322