6

When std::vector gets full, new memory is allocated. From what I read, the new capacity grows in a geometric progression (but this is irrelevant to the question), then the old information is copied in the new memory region, and the old one is freed.

Based on this assumption, my questions are:

  1. Why don't compilers try to see if there's enough contiguous free-to-use memory at the end of our std::vector, allocate just a portion at the end of our std::vector, and don't waste time copying?

  2. Did people try to implement this, but it was decided it's not worth to do it? (on the average/always)

  3. Are there are any other, more subtle, reasons why this is not happening?

jotik
  • 17,044
  • 13
  • 58
  • 123
Alin Galatan
  • 258
  • 2
  • 9
  • 5
    What makes you think they don't do this where it's possible? (It's only possible for vectors of POD, but if `std::vector` uses `realloc` in that case, then it's doing exactly what you said. The `realloc` function is allowed to grow the block in place.) – David Schwartz Apr 28 '16 at 23:04
  • My question was based on the assumption that this is not happening. For example, in the post http://stackoverflow.com/questions/7899973/what-happens-under-the-hood-of-vectorpush-back-memory-wise someone says "It then copies all the values from the old memory buffer to the new memory buffer." – Alin Galatan Apr 28 '16 at 23:15
  • I come from the "pure math" realm, and I might tend to take every word ("all") ad-literam. – Alin Galatan Apr 28 '16 at 23:16
  • It's easier to call reserve than to write a perfect memory allocator. – user3528438 Apr 28 '16 at 23:21
  • Your question is based on the assumption that this is not happening, and then you're asking why? A pure mathematician should be able to see that this is not a well-formed question. – user207421 Apr 28 '16 at 23:21
  • 1
    There is an implied "acts as if" before all of that. The answer is describing what the effects are, not precisely how they're achieved. If optimizations are possible that produce the same effect, he's not saying their prohibited or not done. – David Schwartz Apr 28 '16 at 23:21
  • *"Why don't compilers try to allocate..."* -- The compiler only compiles the program, it is not involved in any way in the program's execution. The reallocation (if needed) happens at runtime; it is done by the program itself (by some code that already exists in the standard C or C++ library). The correct question is "Why the heap management routines don't try to allocate...". – axiac Apr 28 '16 at 23:24
  • 1
    I am not an expert on `std::vector`, but I *think* your assumption is incorrect (and that the post http://stackoverflow.com/questions/7899973/what-happens-under-the-hood-of-vectorpush-back-memory-wise is mildly misleading). Assuming that `std::vector` is ultimately based on the malloc/free heap, then if the vector decides it needs more space, it is likely to call `realloc` to do so, and of course `realloc` will extend in place if possible. (Mind you, when there's a lot of dynamic allocation and geometric extension going on, growing in place is rarely possible.) – Steve Summit Apr 28 '16 at 23:24
  • 1
    @SteveSummit It can only do that for POD. Otherwise, it must use the copy constructor or move constructor. Consider, for example, an object derived from `shared_from_this` or an object that registers its address in a global container. – David Schwartz Apr 28 '16 at 23:37
  • 2
    @DavidSchwartz Is it the case that `realloc` most be avoided for non-POD data because by the time you discover that `realloc` has moved the block, it's too late? Or is it something else? – Steve Summit Apr 28 '16 at 23:56
  • 2
    @SteveSummit That's correct. – David Schwartz Apr 29 '16 at 00:30
  • 1
    @DavidSchwartz Thanks for the confirmation. Yeah, I've had that problem even in C. Presumably someone, somewhere has implemented a non-copying (and of course wickedly nonstandard) `realloc` variant that does *not* copy and free the old pointer in the case that it has to move, so that you can. – Steve Summit Apr 29 '16 at 00:39
  • 2
    "Presumably someone, somewhere has implemented a `realloc` variant that does not copy and free the old pointer in the case that it has to move". This was asked about in http://stackoverflow.com/questions/13845013/realloc-variants and someone pointed to the proposed `try_realloc` in this draft: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1527.pdf – Steve Summit Apr 29 '16 at 03:47
  • 1
    @DavidSchwartz vectors use the C++ allocator interface, and that does not have any `realloc`-like functionality – M.M Apr 29 '16 at 09:02
  • @M.M: There's nothing stopping `std::vector` from knowing that `std::allocator` has a `_Realloc()` method which can do such reallocations. You couldn't detect that optimization in portable code. – MSalters Apr 29 '16 at 11:41
  • @MSalters `reserve()`, or an insert operation, which would make the vector greater than the currently reserved amount, must use a new allocation and copy/move existing elements. You could detect violation of this by adding output to an element's move-constructor. (Might also be possible to detect by something like saving `&T[0]` as `uintptr_t`) – M.M Apr 29 '16 at 12:31
  • @M.M: I don't think that's an actual requirement. There's a big-O limit but O(1) is a subset of O(N). – MSalters Apr 29 '16 at 14:52
  • @MSalters it's nothing to do with orders... the standard says that when the vector needs more space it must get it from its allocator, and the only function in the allocator to get space is to obtain a new block – M.M Apr 30 '16 at 03:48

1 Answers1

3

It is a combination of your points 2) and 3).

First it was reasoned (I cannot say how much measuring has been done at the time) that the benefits were rare and not so great. You can only (significantly) grow memory iff no allocation happened after the origninal allocation, and the cost of growing a vector is amortized.

However many have pointed out that even this scenario is not so rare, and can significantly improve performance and prevent memory fragmentation. So there was a proposal in 2006.

This is were the more subtle reasons kicked in. The first problem is that Containers do not allocate memory by themselfs, but use an Allocator to do so. So as a first step the allocator interface would need to be changed. This is difficult because, as others have noted, realloc can only be used if the contained type is trivial. To be generally useful we would need a different low level function to grow memory (one that reallocates only if it can be done inplace, see the proposal for details). The problem there is that not all platforms provide such a function, and so we would need Posix or the C standard to provide one first.

Fabio Fracassi
  • 3,791
  • 1
  • 18
  • 17