Why use std::vector instead of realloc?

Question

Here, in this question, it's stated that there is no realloc-like operator or function in C++. If you wish to resize an array, just use std::vector instead. I've even seen Stroustrup saying the same thing here.

I believe it's not hard to implement one. There should be a reason for not implementing one. The answers only say to use std::vector but not why it's not implemented.

What is the reason for not implementing realloc-like operator or function and preferring to use std::vector instead?

`What is the reason for not implementing realloc-like operator or function and preferring to use std::vector instead?`. What realloc does can be achieved by using delete and new operators — kesarling He-Him, Aug 14 '20 at 07:34
`vector` is readily available, full featured, and well tested. Why reinvent the wheel? — Retired Ninja, Aug 14 '20 at 07:34
Does this answer your question? [How do you 'realloc' in C++?](https://stackoverflow.com/questions/3482941/how-do-you-realloc-in-c) — kesarling He-Him, Aug 14 '20 at 07:49
_"I believe it's not hard to implement one."_ — How would you deal with _non-trivially-copyable_ types during reallocation? — Daniel Langr, Aug 14 '20 at 07:50
@StackExchange123 Why duplicate the functionality of `std::vector` this way? Note that it's not as easy as you write. You need to care about exception safety, and types can have copy, non-throwing move, or throwing move constructors, each of which you need to handle separately, etc. – Daniel Langr, Aug 14 '20 at 09:01
@DanielLangr Thanks for noting some of the obstacles. As for why to duplicate it: in case the features of `std::vector` are not needed, it's better to avoid its overhead. – StackExchange123, Aug 14 '20 at 09:11
@StackExchange123 Which overhead? A vector is typically implemented by 3 pointers, which takes 24 bytes on a 64bit arch. You would need at least a pointer plus the information about size, which is 16 bytes. Is that additional 8 bytes such a large overhead for you? — Daniel Langr, Aug 14 '20 at 12:58
@d4rk4ng31 -- there is one thing that `realloc` can do that `delete` and `new` can't do. If there is free memory immediately after the block that's being realloc'ed, `realloc` can simply adjust the size of the memory block, without copying any data. `delete` and `new` will always copy the data. — Pete Becker, Aug 14 '20 at 13:26
@d4rk4ng31 -- I don't know what you mean by a memory block being "trivial". Underneath it all, it's just raw memory, and `realloc` can sometimes expand a memory block, while `new` and `delete` never can. It has been suggested a few times to add a `renew` operator that can do the same sort of thing: expand a memory block in place if there's room, and then initialize the objects that occupy the newly-added memory. – Pete Becker, Aug 14 '20 at 15:54

lubgr · Answer 1 · 2020-08-14T07:44:54.253

What is the reason for not implementing realloc-like operator or function and preferring to use std::vector instead?

Save time. Don't chase bugs in your own code for a problem that has long been solved.
Idiomatic C++ and readability.
Get answers to your questions easily and quickly.
Customize the realloc part by an allocator.

I believe it's not hard to implement one

That heavily depends on what you need from the template you intend to write. For a general-purpose std::vector-like one, have a look at the source code (libcxx's 3400-line vector header is here). I bet you will revise your initial assumption on the low complexity of such a construct.

score 2 · Answer 2 · answered Aug 14 '20 at 07:34

There are several advantages.

Vector keeps track of its size and capacity, which means you don't have to do this yourself.
Because the current size is part of the vector object itself, you can pass a vector (by reference or by value) without needing an additional size parameter. This is especially useful when returning a vector as the caller doesn't need to receive the size through some side-channel.
When reallocating, vector adds more capacity than the newly added element(s) strictly require. This sounds wasteful but saves time, as fewer reallocations are needed.
Vector manages its own memory; using vector lets you focus on the more interesting parts of your program instead of the details of managing memory, which are relatively uninteresting and tricky to get exactly right.
Vector supports many operations that arrays don't natively support, such as removing elements from the middle and making copies of an entire vector.
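
As a rough illustration of the points about size tracking and over-allocation, here is a minimal sketch. The helper names `count_reallocations` and `sum` are made up for this example, and the exact growth factor is implementation-specific, so the number of reallocations will differ between standard libraries.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how often a std::vector<int> changes capacity while `n` elements
// are appended one at a time. (Hypothetical helper, for illustration only.)
std::size_t count_reallocations(std::size_t n) {
    std::vector<int> v;
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {  // storage was reallocated
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    assert(v.size() == n);  // the vector tracks its own size
    return reallocations;
}

// The size travels with the vector, so no separate length parameter is needed.
long long sum(const std::vector<int>& v) {
    long long total = 0;
    for (int x : v) total += x;
    return total;
}
```

Calling `count_reallocations(1000)` should report far fewer than 1000 reallocations, because capacity grows in larger steps than the size does.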

score 1 · Answer 3 · answered Aug 14 '20 at 07:39

realloc's expectation that there might be sufficient free space after the current allocation just does not fit well with modern allocators and modern programs.

(There are many more allocations going on, many allocation sizes go to a dedicated pool for that size, and the heap is shared between all the threads in a program.)

In most cases, realloc will have to move content to a completely new allocation, just like vector does. But unlike vector<T>, realloc does not know how to move elements of type T, it only knows how to copy plain data.
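
The difference described above can be sketched as follows. `grow_with_realloc` and `grow_strings` are hypothetical helpers written for this example, not standard functions; the `static_assert` is one possible way to stop `realloc` from being applied to types that cannot be relocated as plain bytes.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <type_traits>
#include <vector>

// Grows a malloc'ed array with realloc. The static_assert rejects element
// types for which a byte-wise relocation would be undefined behaviour.
template <typename T>
T* grow_with_realloc(T* old_block, std::size_t new_count) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "realloc may only relocate trivially copyable types");
    void* p = std::realloc(old_block, new_count * sizeof(T));
    return static_cast<T*>(p);  // nullptr on failure; old_block is then still valid
}

// std::vector relocates elements through their move/copy constructors,
// so it also works for std::string.
std::vector<std::string> grow_strings() {
    std::vector<std::string> v{"alpha", "beta"};
    v.reserve(v.capacity() + 100);  // forces a reallocation
    v.push_back("gamma");
    assert(v.size() == 3);
    return v;
}
```

With this sketch, `grow_with_realloc<int>` compiles, while `grow_with_realloc<std::string>` is rejected at compile time.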

answered Aug 14 '20 at 07:39

peterchen


64-bit systems have lots of virtual address-space, and large allocations will often have gotten some fresh pages from the OS (e.g. via `mmap`). Some OSes even have system calls like Linux's [`mremap(MREMAP_MAYMOVE)`](https://man7.org/linux/man-pages/man2/mremap.2.html) which is a page-granularity realloc, avoiding copying even if there isn't room for this mapping to grow at the current location. (Just updating the page tables so the same physical pages are mapped at a new virtual address, with extra space at the end). Not worth the TLB shootdowns for small size, but worth it for a few GiB. – Peter Cordes Sep 21 '21 at 13:12

kesarling He-Him · Answer 4 · 2020-08-14T08:07:24.173

Well, as the other answers have nicely explained the reasons for using vectors, I will simply elaborate on why realloc was not implemented. For this, you need to take a look at what realloc actually does. It changes the size of an allocation, conceptually by using malloc() and free(). You need to understand that, although it seems to simply grow the block, it does not necessarily grow it in place: when there is no room, it allocates another block of the required size, copies the raw bytes over, and frees the old block (that explains the name realloc).

Take a look at the following lines:

int* iarr = (int*)malloc(sizeof(int) * 5);
// iarr = (int*)realloc(iarr, sizeof(int) * 6);  // this is completely discouraged: on failure the old block is leaked
// what you're supposed to do here is:
int* iarr2 = (int*)realloc(iarr, sizeof(int) * 6);  // copies the old data to the new block implicitly
// this not only saves the previous state, but also allows you to check if realloc succeeded
if (iarr2 != NULL) iarr = iarr2;  // on success the old pointer value must no longer be used

In C++, this can (if you must) be achieved by writing:

int* iarr = new int[5];
int* iarr2 = new int[6];
for(int i = 0; i < 5; i++) {
    iarr2[i] = iarr[i];
}
delete[] iarr;

The only use of realloc was to increase memory capacity. Because C arrays do not grow automatically, C had to provide a mechanism to do so. Most C++ containers implement that mechanism internally, which makes the reason for having a realloc in the first place moot.
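
For comparison, the manual `new[]`, copy, `delete[]` sequence above collapses to a single `resize` call with `std::vector`. This is a minimal sketch; `grow_to_six` is a hypothetical name chosen for the example.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Same effect as the manual new[]/copy/delete[] sequence: five ints grow to six.
std::vector<int> grow_to_six() {
    std::vector<int> iarr(5);                 // five value-initialised (zero) ints
    std::iota(iarr.begin(), iarr.end(), 0);   // fill with 0, 1, 2, 3, 4
    iarr.resize(6);                           // existing elements are kept, the new one is 0
    assert(iarr.size() == 6);
    return iarr;                              // storage is released automatically later
}
```

No explicit `delete[]` is needed: the vector frees its storage when it goes out of scope.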

edited Aug 14 '20 at 08:07

answered Aug 14 '20 at 07:44

kesarling He-Him


I believe `realloc` expands the allocated memory if possible. if not, then use `free` and `malloc`. – StackExchange123 Aug 14 '20 at 07:51
@StackExchange123 But if it cannot be expanded then `realloc()` copies the raw data from the old to the new allocation, which only works for trivially-copyable types. E.g. `realloc()` on an array of `std::string` is undefined behavior unless the allocation can be expanded without a copy. I guess `std::vector` could SFINAE its reallocation on `std::is_trivial` but then it would need to use `malloc()` and `free()` instead of C++ allocators. – cdhowie Aug 14 '20 at 07:53
@cdhowie, Also, it is not compatible with new and delete. I think the real question needs to be, `why is realloc not compatible with new and delete?` – kesarling He-Him Aug 14 '20 at 07:55
@cdhowie Why not just copy the bytes in the array? That is, no calls to the destructors or the constructors, and the content of the array will not be aware that the memory is moved. If the new size is smaller, call the destructor for the last elements; if it's bigger, call the constructor for the new elements. – StackExchange123 Aug 14 '20 at 07:58
@StackExchange123, it's not that easy – kesarling He-Him Aug 14 '20 at 07:59
@StackExchange123 As I said, that only works for trivial types. Copying the raw memory of a `T` to a new location and then using that new location as a `T` is undefined behavior when `T` is not trivial. – cdhowie Aug 14 '20 at 08:00
@cdhowie Do you mean because some other pointers may be pointing to the old array? I believe this is the case with `realloc` and whoever uses it will be aware of this problem. Or is it just undefined with no obvious reason? – StackExchange123 Aug 14 '20 at 08:02
`some other pointers may be pointing to the old array?`, that will be implementation defined. It's not a good idea to have dangling pointers though – kesarling He-Him Aug 14 '20 at 08:03
@StackExchange123 It's the case because the standard says it is so. One can easily create a copyable type that, for example, stores a pointer to one of its own members. After having its raw memory copied, the object will have a dangling pointer but _shouldn't_ if it was properly copied. Objects are certainly permitted to take pointers to their own members, and moving them around in memory without their knowledge doesn't work correctly. – cdhowie Aug 14 '20 at 08:04
@StackExchange123 Basically if you're moving a non-trivial object from one location to another you _must_ either copy-construct or move-construct the target from the source. Doing this gives the object a chance to fix up anything that can't be directly copied. – cdhowie Aug 14 '20 at 08:07
Saying `realloc` "was not implemented" is not strictly true. It is part of the `C++` standard in the `C` compatibility libraries and, as such, any standard conforming `C++` compiler must (and does) implement it. The real question is why is it not used? – Galik Aug 14 '20 at 08:07
@cdhowie If I have a pointer to `some_vector[5]`, how would `std::vector` be more helpful than `realloc`? Why does the copy constructor even have to get called? We're not copying data now. We're just expanding an area (and as a consequence, it may be moved) and the old elements should remain untouched. We don't want the elements to know that there was a move if it happened. – StackExchange123 Aug 14 '20 at 08:25
Also, the same problem will happen in C code if there was a pointer pointing to an array and the array is moved, yet `realloc` is still there and can be used, but whoever uses it is responsible for the consequences. I get that it's because the standard says so. But the reason is not convincing enough for me. – StackExchange123 Aug 14 '20 at 08:25
@Galik Yes it's a part of it but can't be used with `new` and `delete`. I'm asking why they didn't make one like `realloc` for c++ as `new` and `delete`. – StackExchange123 Aug 14 '20 at 08:26
@StackExchange123 As I've mentioned, what... three times now? The problem is that for non-trivial types you __can't just move them around in memory__ and that is __exactly what `realloc()` will do if it is unable to expand the allocation in-place.__ If a new allocation is needed then non-trivial types __absolutely must__ be move-constructed in the new location instead of copying their raw data there. – cdhowie Aug 14 '20 at 08:27
@StackExchange123 One of the reasons `std::vector` is better in this regard is that when a reallocation happens, it will move-construct all of the new elements from the old and then properly destroy the old elements. This guarantees defined and correct behavior. And that's why `realloc()` is not used in C++. It's _incredibly dangerous_ to a naive user, and it doesn't work on arrays allocated with `new[]` anyway. – cdhowie Aug 14 '20 at 08:35
@cdhowie I'm already done with this part, as I've mentioned, and there is no need to mention it for the third time. I was commenting on the part where you were explaining why it's undefined behaviour. – StackExchange123 Aug 14 '20 at 08:35
@StackExchange123 This can work in C because _every_ type is "trivial" in C. Of course, user-defined structures may make semantic assumptions that no longer hold, but that's not a problem that the C standard cares about. If you construct a non-trivial object in C++, you must destroy it before freeing the memory. If you use a region of memory as a non-trivial object in C++, it must have been constructed there in the first place. `realloc()` violates both of these rules, end of story. Obviously you can do what you want, but if you tread in the waters of UB you will find many dragons. – cdhowie Aug 14 '20 at 08:38
@cdhowie Ok, so memory can't be moved. Why is there nothing that behaves like `std::vector` would? i.e. something that allocates new memory, copies the elements, destroys the old elements and returns the new array pointer. What if I don't need the extra overhead of `std::vector`? I know it can be done by hand easily if wanted, but the same applies for `realloc` (except that I'm not sure how to expand the memory if possible). It's just for convenience. – StackExchange123 Aug 14 '20 at 08:40
@StackExchange123 Probably because the overhead of vector is minimal and its utility far exceeds that overhead. :) There isn't a C++-standard safe version of realloc because people would rather use vector than work with a kind-of-vector-like-thing. – cdhowie Aug 14 '20 at 08:41
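
The point made in this thread about objects that store a pointer to one of their own members can be sketched as follows. `SelfRef` and `invariant_survives_reallocation` are hypothetical names invented for this example. The sketch deliberately never copies the object byte-wise, since that would be undefined behaviour. It only shows that the type is not trivially copyable and that `std::vector` keeps its invariant intact across a reallocation.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// A type that stores a pointer to one of its own members.
struct SelfRef {
    int value;
    int* self;  // invariant: always points at this object's `value`

    explicit SelfRef(int v) : value(v), self(&value) {}
    // Copying re-points `self` at the new object instead of copying the address.
    SelfRef(const SelfRef& other) : value(other.value), self(&value) {}
    SelfRef& operator=(const SelfRef& other) {
        value = other.value;  // `self` already points at our own `value`
        return *this;
    }
    bool consistent() const { return self == &value; }
};

// The user-provided copy constructor makes the type non-trivially copyable,
// so relocating it with realloc or memcpy would be undefined behaviour.
static_assert(!std::is_trivially_copyable<SelfRef>::value,
              "SelfRef must not be moved byte-wise");

// std::vector relocates through the copy constructor, so the invariant holds
// even after the storage has been reallocated.
bool invariant_survives_reallocation() {
    std::vector<SelfRef> v;
    v.emplace_back(1);
    v.reserve(v.capacity() + 100);  // forces a reallocation
    v.emplace_back(2);
    assert(v.size() == 2);
    for (const SelfRef& s : v) {
        if (!s.consistent()) return false;
    }
    return true;
}
```

A byte-wise relocation would instead leave `self` pointing into the old, freed block.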
