Does std::vector have to move objects when growing capacity? Or, can allocators "reallocate"?

Question

A different question inspired the following thought:

Does std::vector<T> have to move all the elements when it increases its capacity?

As far as I understand, the standard behaviour is for the underlying allocator to request an entire chunk of the new size, then move all the old elements over, then destroy the old elements and then deallocate the old memory.

This behaviour appears to be the only possible correct solution given the standard allocator interface. But I was wondering, would it make sense to amend the allocator to offer a reallocate(std::size_t) function which would return a pair<pointer, bool> and could map to the underlying realloc()? The advantage of this would be that in the event that the OS can actually just extend the allocated memory, then no moving would have to happen at all. The boolean would indicate whether the memory has moved.

(std::realloc() is maybe not the best choice, because we don't need do copy data if we cannot extend. So in fact we'd rather want something like extend_or_malloc_new(). Edit: Perhaps a is_pod-trait-based specialization would allow us to use the actual realloc, including its bitwise copy. Just not in general.)

It seems like a missed opportunity. Worst case, you could always implement reallocate(size_t n) as return make_pair(allocate(n), true);, so there wouldn't be any penalty.

Is there any problem that makes this feature inappropriate or undesirable for C++?

Perhaps the only container that could take advantage of this is std::vector, but then again that's a fairly useful container.

Update: A little example to clarify. Current resize():

pointer p = alloc.allocate(new_size);

for (size_t i = 0; i != old_size; ++i)
{
  alloc.construct(p + i, T(std::move(buf[i])))
  alloc.destroy(buf[i]);
}
for (size_t i = old_size; i < new_size; ++i)
{
  alloc.construct(p + i, T());
}

alloc.deallocate(buf);
buf = p;

New implementation:

pair<pointer, bool> pp = alloc.reallocate(buf, new_size);

if (pp.second) { /* as before */ }
else           { /* only construct new elements */ }

I don't think it needs a pair, you can simply compare to the pointer that was passed in. As long as reallocate understands proper move semantics I can't think of a problem. — Mooing Duck, Nov 03 '11 at 23:35
@Xeo: sometimes. If it can't, it allocates one of the proper size and does a bitwise copy. Obviously the C++ version wouldn't do a bitwise copy. — Mooing Duck, Nov 03 '11 at 23:35
I just thought about that -- the bitwise copy would be *undesirable* when the memory goes to a new place. So we'd also need a new underlying implementation. A simple modification of `realloc()` should be readily available in any C standard library, though. — Kerrek SB, Nov 03 '11 at 23:36
@Mooing: That bitwise copy is the problem though, C++ objects aren't just simple byte blobs. — Xeo, Nov 03 '11 at 23:37
@KerrekSB: If you're offloading the copy to the vector, then I can't think of any reason for it to be missing. — Mooing Duck, Nov 03 '11 at 23:38
Unless `reallocate` understand move semantics, in the case that the memory can't be grown you've just lost the source from which to copy/move the elements. — K-ballo, Nov 03 '11 at 23:39
@Kerrek: Forget my comment and answer, I forgot how allocators actually work for a moment. :) — Xeo, Nov 03 '11 at 23:39
@KerrekSB Is a bitwise copy really undesirable in case `realloc` moves the memory? It'd create identical copies of the existing objects, as long as you don't invoke the destructors of the objects in the old memory chunk before freeing it, everything would be kosher, wouldn't it? — Praetorian, Nov 03 '11 at 23:45
@MooingDuck: On your first comment: The only possiblity is if the *grow* function of the allocator were to fail in the event of not being able to *grow*, and leave the memory as it was before (no bitwise copy). By the time you get to compare the pointers of `realloc`, the damage is done. — David Rodríguez - dribeas, Nov 03 '11 at 23:45
@David: `grow` is arguably a much better name for the feature! — Kerrek SB, Nov 03 '11 at 23:46
@Praetorian: No, bitwise copy is useless in general (though we can discuss typetrait-based specialisations!) since the copy in the new memory is meaningless garbage in general. — Kerrek SB, Nov 03 '11 at 23:47
@Praetorian: There are different issues with the bitwise copies... consider for example, that there might be internal pointers, for example I have used an implementation of the `NullObject` pattern where the object held a *null-object* and a pointer to the current object that could either refer to a dynamically allocated *real-object* or to the *null-object* member. In the cases where the object is *null*, the pointer references another member of the same object. In that case, a bitwise copy will leave dangling pointers. — David Rodríguez - dribeas, Nov 03 '11 at 23:49
@Praetorian: Imagine objects that have pointers to internal members, or have child node that point at them. If you bitwise copy it somewhere, that child node and pointer are now pointing at invalid memory — Mooing Duck, Nov 03 '11 at 23:49
This is why at least half the time people use `std::vector`, they should be using `std::deque` instead. (By my estimate.) — Nemo, Nov 04 '11 at 00:25
@Nemo: Hehe... just recently I read someone "famous" declare that, too (I forget who and where, though)... quite right, `deque` is an intriguing little container. — Kerrek SB, Nov 04 '11 at 00:26
`std::deque` is one of the most unfortunate containers. It is really good at what it does. And you almost never need what it does. A geometrically growing circular buffer would have been a much better candidate for a std::container than std::deque. The circular buffer has much better performance and far less complexity. But it doesn't guarantee the stability of references like `std::deque` and `std::list` do. But in my experience, the circular buffer solves most push-pop queue problems better than std::deque and when it doesn't, `std::list` is the correct alternative. — Howard Hinnant, Nov 04 '11 at 02:57
Related: [Is it guaranteed that C++ standard library containers call the replacable new functions?](https://stackoverflow.com/questions/46823224/is-it-guaranteed-that-c-standard-library-containers-call-the-replacable-new-fu). If not, `std::vector` could use calloc/realloc/free under the hood. But unfortunately it seems that would only be possible as part of whole-program optimization for programs that don't replace `::operator new`. I don't understand why the `new` API is so dumb, and doesn't even have a `calloc` to save the zeroing of already-zeroed fresh pages from the OS. — Peter Cordes, Oct 19 '17 at 06:50
@PeterCordes: Sure you can replace the global allocation functions, but you only get to choose one of `malloc` or `realloc` -- and if you use `realloc`, there's no room in the interface to *signal* the caller that no moving happened, so the vector implementation doesn't even know that it should be looking out for that. In fact, you'd break `vector` horribly, because it might attempt to copy over itself and then call destructors. I think without a change in the interface you can't get anywhere (even if it's only in the non-code part, such as "the returned pointer may be the same"). — Kerrek SB, Oct 19 '17 at 10:45
@KerrekSB - that's not what Peter is getting at: he's not saying that one should replace `operator new` on a global basis. Rather he's suggesting that `vector` implementations could potentially specialize their allocators to use `realloc` to move trivially copyable objects (potentially expanding the allocation in-place without movement) or `calloc` to initialize zero-initialized objects (although there isn't a trait to detect when a class has an-equivalent-to-zero-initialization constructor, it seems). — BeeOnRope, Oct 19 '17 at 18:42
So what @PeterCordes is pointing out is that one possible issue with the above optimization is that it would change observable behavior for programs with replaced `operator new`: such programs would never see calls into their replaced operator, potentially invaliding the optimization. — BeeOnRope, Oct 19 '17 at 18:44
@BeeOnRope: Good point about no trait for `calloc`. But gcc already supports optimizing a loop that stores all zeros into `memset(0)`, and also supports optimizing malloc+memset(0) into calloc. So this part might happen without source changes other than using malloc (or even just telling the compiler than `malloc` and `new` were compatible.) — Peter Cordes, Oct 19 '17 at 19:13
@BeeOnRope: ugh, compiler support isn't currently as good as I'd thought. gcc needs an explicit `memset` to optimize `malloc` to `calloc`, not an auto-generated one. https://godbolt.org/g/u17eUq clang doesn't do it at all. — Peter Cordes, Oct 19 '17 at 19:29

score 47 · Accepted Answer · edited Jun 24 '13 at 21:38

47

When std::vector<T> runs out of capacity it has to allocate a new block. You have correctly covered the reasons.

IMO it would make sense to augment the allocator interface. Two of us tried to for C++11 and we were unable to gain support for it: [1] [2]

I became convinced that in order to make this work, an additional C-level API would be needed. I failed in gaining support for that as well: [3]

edited Jun 24 '13 at 21:38

wallyk

56,922
16
83
148

answered Nov 03 '11 at 23:49

Howard Hinnant

206,506
52
449
577

Sweet, thanks! I'm sorry to hear you failed to garner further support -- it seems to be to be totally trivial to add a C interface function: Just take `realloc()` and remove the part that copies the memory. It wouldn't even have to become part of the C standard; it could just be something that the C++ library implements... – Kerrek SB Nov 03 '11 at 23:55
2

Custom C++ allocators would benefit from a C `realloc-but-don't-move` function. That was the principle motivation for going to the C committee with this. – Howard Hinnant Nov 03 '11 at 23:59
2

Well, there's always the possibility that *other*, user-defined classes could use an allocator that has such a "may grow" interface. Shame that it got rejected. Is it OK to write an allocator that does *more* than the prescribed allocator interface, though? – Kerrek SB Nov 04 '11 at 00:03
Yes. But `std::containers` won't be smart enough to use your extended interface. But maybe boost::intrusive containers would. The author of that library is the same person that wrote N2045. – Howard Hinnant Nov 04 '11 at 00:05
Can't you just mmap memory? I mean, the vector allocates some memory, and mmaps it, and works on the map. Once you ran out of memory, you allocate more, and mmap it after the previous mmap block. This probably only works on unix, and does increase the size of the vector (you have to store one pointer per allocation to be able to destroy the vector properly) but it essentially means that on Linux, BSD and MacOS vector wouldn't ever need to copy/reallocate its elements. Maybe its just a dumb idea but a magic ring buffer works similar to this on unix without problems. – gnzlbg Mar 07 '16 at 10:44
@gnzlbg: This is exactly what you would do if the allocator had an API to do this, and `vector` was aware of that API. As it is today, `vector` only knows how to ask for a new buffer. `vector` doesn't know how to deal with the case where the allocator might return a bigger buffer that starts at the same address as the old buffer, because it is going to try to move elements from the old buffer to the new. It is certainly possible to write such an "expanding buffer" API, but to-date the committee has chosen not to. – Howard Hinnant Mar 07 '16 at 13:33
I see, thanks Howard. I recall playing a couple of years ago with a vector that basically requested a lot of virtual memory upfront (a couple of GBs), but only committed that memory as it grew, basically eliminating reallocations and making strong exception safety a bit easier. It did not support allocators though. And well if you are using mmap for this, you can basically, inside the vector, have two pointers with two different addresses pointing to the same object. This did not leak through the vector API but it is probably borderline undefined behavior. I found it fun. – gnzlbg Mar 07 '16 at 18:08
1

@gnzlbg - you don't actually need to maintain a pointer per `mmap` to be able to deallocate it. I guess you assumption is that your need to pass the same pointer back to `munmap` that you got from `mmap` (like `new` and `delete`) but it doesn't work that way: you can just pass the whole range of pages you want to delete and they are freed even if they come from different `mmap` calls. It isn't even an error if some of them are not mapped at all! So you just need the pointer to the start of the area (already part of vector) and the capacity (already part of vector)... – BeeOnRope Oct 19 '17 at 06:44
Do you have a link to how your proposal was received by wg14 et al? I found - https://groups.google.com/a/isocpp.org/g/std-proposals/c/fTDoVNUrkJU/m/aysn3o5KDwAJ – Bruce Adams May 09 '22 at 09:32
Also note for other readers `sizeof_alloc()` exists in glibc and is called `malloc_usable_size()`. – Bruce Adams May 09 '22 at 09:34
Your link is as good as it gets. – Howard Hinnant May 10 '22 at 13:52

David Rodríguez - dribeas · Answer 2 · 2011-11-04T00:06:27.367

12

In most cases, realloc will not extend the memory, but rather allocate a separate block and move the contents. That was considered when defining C++ in the first place, and it was decided that the current interface is simpler and not less efficient in the common case.

In real life, there are actually few cases where reallocis able to grow. In any implementation where malloc has different pool sizes, chances are that the new size (remember that vector sizes must grow geometrically) will fall in a different pool. Even in the case of large chunks that are not allocated from any memory pool, it will only be able to grow if the virtual addresses of the larger size are free.

Note that while realloc can sometimes grow the memory without moving, but by the time realloc completes it might have already moved (bitwise move) the memory, and that binary move will cause undefined behavior for all non-POD types. I don't know of any allocator implementation (POSIX, *NIX, Windows) where you can ask the system whether it will be able to grow, but that would fail if it requires moving.

edited Nov 04 '11 at 00:06

answered Nov 03 '11 at 23:36

David Rodríguez - dribeas

204,818
23
294
489

3

I think he meant a new reallocation function, not compatible with malloc.h `realloc`, that doesn't move contents, nor free the old block, when resizing in place isn't possible. – Ben Voigt Nov 03 '11 at 23:42
2

Also, `VirtualAlloc` and `mmap` allow you to request the address immediately following your block, so you'll either extend the block or fail. But I too don't know of any heap-based allocators that support the "grow if possible" operation. – Ben Voigt Nov 03 '11 at 23:44
@Ben: Yes, indeed. I'd need 1) an interface addition for the standard library allocator, and 2) an underlying implementation that grows if possible but does a naked `malloc` if not, without moving. – Kerrek SB Nov 03 '11 at 23:44
1

@BenVoigt: Clearly such a feature must already exist: I could just rip `realloc` apart and kill the part that does the bitwise moving, couldn't I? `realloc++` :-) – Kerrek SB Nov 03 '11 at 23:45
I don't understand this part `Note that while realloc can sometimes grow the memory without moving, by the time it completes it might have already moved (bitwise move) the memory`. Can you please explain that in more detail? – Praetorian Nov 03 '11 at 23:47
@David: I think you make a good point: Such a feature would just not be very widely applicable. The only containers that could use it are `vector` and `string`, and as you say, `vector`'s geometric growth means that the reallocation feature would rarely be more efficient. – Kerrek SB Nov 03 '11 at 23:50
@KerrekSB: The point is not mine... I read it, I think that in *[Design and Evolution of C++](http://www.amazon.com/Design-Evolution-C-Bjarne-Stroustrup/dp/0201543303)* or some *history* article on the language :) I just don't have a reference to provide now. – David Rodríguez - dribeas Nov 03 '11 at 23:52
@Praetorian: The premise is that *bitwise copy* will cause undefined behavior, if not always in some cases. Now, if you call *realloc* it can either *grow* or *move*. If it has *grown* then you are fine (but this is the least probable outcome), if it has *moved*, then your objects are already in an *undefined state* from which you might not be able to recover (there is no *un-realloc* to bitwise move back to the original location to a *defined* state from which to *malloc* and *copy construct*) – David Rodríguez - dribeas Nov 03 '11 at 23:56
Thanks, I understand the sentence now. At first when I read `by the time it completes it might have already moved` I thought you were saying *by the time realloc completes a grow operation without moving, it might've moved the memory* which didn't make any sense :-) – Praetorian Nov 03 '11 at 23:59
2

Linux provides `mremap`, which you can use to extend an existing anonymous memory map (or fail while trying). – Nemo May 06 '14 at 22:16
You're right that `realloc` would be problematic (only usable in a template specialization for trivially-copyable types, although that does cover a lot of real-world `std::vector` objects). An allocator interface like `try_realloc` would make a lot of sense (making it usable for non-trivially-copyable types), and be trivial to implement with a `return nullptr` stub on implementations where a proper implementation isn't ready yet. – Peter Cordes Sep 21 '21 at 14:11
The other showstopper problem is C++'s guarantee that user-replaceable `new` will be called by `std::vector`'s default allocator if it grows at all, and that could be overridden in another compilation unit where this template can't see it. [Is it guaranteed that C++ standard library containers call the replaceable new functions?](https://stackoverflow.com/q/46823224). That's also a problem for optimizing zero-init vectors to use `calloc`, or a new/delete compatible equivalent which again C++ neglected to provide. – Peter Cordes Sep 21 '21 at 14:11

score 0 · Answer 3 · answered Nov 03 '11 at 23:42

Yep, you're right that the standard allocator interface doesn't provide optimizations for memcpy'able types.

It's been possible to determine whether a type can be memcpy'd using boost type traits library (not sure if they provide it out of the box or one would have to build a composite type discriminator based on the boost ones).

Anyway, to take advantage of realloc() one would probably create a new container type that can explicitly take advantage of this optimization. With current standard allocator interface it doesn't seem to be possible.

No, no, I don't care about memcpyability. I'm fully prepared to invoke my allocator's construction and destruction sequence. I just want to give the allocator the chance to be lazy! — Kerrek SB, Nov 03 '11 at 23:43

Does std::vector have to move objects when growing capacity? Or, can allocators "reallocate"?

3 Answers3

Linked

Related

Does std::vector *have* to move objects when growing capacity? Or, can allocators "reallocate"?

3 Answers3

Linked

Related

Does std::vector have to move objects when growing capacity? Or, can allocators "reallocate"?