Is it safe to "dissolve" c++ arrays on the heap?

Question

I am currently implementing my own vector container and I encountered a pretty interesting Issue(At leas for me). It may be a stupid question but idk.

My vector uses an heap array of pointers to heap allocated objects of unknown type (T**). I did this because I wanted the pointers and references to individual elements to stay same, even after resizing.

This comes at performance cost when constructing and copying, because I need to create the array on the heap and each object of the array on the heap too. (Heap allocation is slower than on the stack, right?)

T** arr = new *T[size]{nullptr};

and then for each element

arr[i] = new T{data};

Now I wonder if it would be safe, beneficial(faster) and possible, if instead of allocating each object individually, I could create a second array on the heap and save the pointer of each object in the first one.Then use (and delete) these objects later as if they were allocated separately.

=> Is allocating arrays on the heap faster than allocating each object individually?

=> Is it safe to allocate objects in an array and forgetting about the array later? (sounds pretty dumb i think)

Link to my github repo: https://github.com/LinuxGameGeek/personal/tree/main/c%2B%2B/vector

Thanks for your help :)

What do you mean by "unknown type T"? Do you mean that it is a template type parameter? — eerorika, Dec 16 '20 at 11:27
I'm really not sure what you mean by 'forgetting'. But if I understand your scheme correctly, doesn't allocating the elements themselves in an array (the second array) contradict your earlier requirement that 'pointers and references to individual elements to stay same, even after resizing'. If you resize then you are going to have to reallocate the second array, right? — john, Dec 16 '20 at 11:27
Sorry for the inconvenience, yeah unknown type = template type parameter :) If I resize I would create a new array of pointers with the new size and copy the pointers from the first one to the second one + delete excessive object or create new ones. :) — DominikRzecki, Dec 16 '20 at 11:28
What you are trying to come is a use of placement new allocation for a deque-like container. It's a viable optimization, but usually its done to reduce allocation calls and memory fragmentation, e.g. on some RT or embedded systems. The array maybe even a static array in that case. But if you also require that instances of T would occupy adjacent space, that's a contradicting requirement, resorting them would kill any performance gains. — Swift - Friday Pie, Dec 16 '20 at 11:36
Note that if you use placement new you should not use `delete` on created objects, you have to call destructor directly. placement new overload is not a true `new` as far as `delete` concerned. You may or may not cause error but you certainly will cause an crash if you used static array and you will cause heap corruption when deleting element that got same address as dynamically allocated array beginning. — Swift - Friday Pie, Dec 16 '20 at 11:39
`std::vector` + memory pool is pretty much unbeatable. Just use that instead. — nada, Dec 16 '20 at 12:02
@nada It doesn't give you the stable references to elements that OP wants. — eerorika, Dec 16 '20 at 12:04

score 5 · Answer 1 · edited Dec 16 '20 at 12:20

First a remark, you should not think of the comparison heap/stack in terms of efficiency, but on object lifetime:

automatic arrays (what you call on stack) end their life at the end of the block where they are defined
dynamic arrays (whay you call on heap) exists until they are explicitly deleted

Now it is always more efficient to allocate a bunch of objects in an array than to allocate them separately. You save a number of internal calls and various data structure to maintain the heap. Simply you can only deallocate the array and not the individual objects.

Finally, except for trivially copyable objects, only the compiler and not the programmer knows about the exact allocation detail. For example (and for common implementations) an automatic string (so on stack) contains a pointer to a dynamic char array (so on heap)...

Said differently, unless you plan to only use you container for POD or trivially copyable objects, do not expect to handle all the allocation and deallocation yourself: non trivial objects have internal allocations.

eerorika · Answer 2 · 2020-12-16T12:14:16.687

Heap allocation is slower than on the stack, right?

Yes. Dynamic allocation has a cost.

Is allocating arrays on the heap faster than allocating each object individually?

Yes. Multiple allocations have that cost multiplied.

I wonder if it would be ... possible, if instead of allocating each object individually, I could create a second array on the heap and save the pointer of each object in the first one

It would be possible, but not trivial. Think hard how you would implement element erasure. And then think about how you would implement other features such as random access correctly into the container with arrays that contain indices from which elements have been erased.

... safe

It can be implemented safely.

... beneficial(faster)

Of course, reducing allocations from N to 1 would be beneficial by itself. But it comes at the cost of some scheme to implement the erasure. Whether this cost is greater than the benefit of reduced allocations depends on many things such as how the container is used.

Is it safe to allocate objects in an array and forgetting about the array later?

"Forgetting" about an allocation seems like a way to say "memory leak".

You could achieve similar advantages with a custom "pool" allocator. Implementing support for custom allocators to your container might be more generally useful.

P.S. Boost already has a "ptr_vector" container that supports custom allocators. No need to reinvent the wheel.

score 1 · Answer 3 · answered Dec 16 '20 at 11:44

1

I did this because I wanted the pointers and references to individual elements to stay same, even after resizing.

You should just use std::vector::reserve to prevent reallocation of vector data when it is resized.

Vector is quite primitive, but is is highly optimized. It will be extremely hard for you to beat it with your code. Just inspect its API and try its all functionalities. To create something better advanced knowledge of template programing is required (which apparently you do not have yet).

answered Dec 16 '20 at 11:44

Marek R

32,568
6
55
140

1

Note that this is generally applicable only if no elements are added nor removed nor re-ordered after the reference to an element taken (except push / pop from back). – eerorika Dec 16 '20 at 11:47
1

if someone often adds/removes items not at back of vector then most probably should use something else then vector. – Marek R Dec 16 '20 at 11:53
1

Perhaps that's why OP is writing their own container. – eerorika Dec 16 '20 at 11:55
It is possible, I've posted my answer to correct some incorrect OPs assumptions. Other things looks like covered by other answer(s). – Marek R Dec 16 '20 at 12:04
"primitive" and "optimized" are not contradictory, no need for a "but". If one were to use "advanced template programming" - the result would likely be something that doesn't behave like a vector. It may be better for some things, but it will be worse for others. – einpoklum Dec 16 '20 at 19:18

Swift - Friday Pie · Answer 4 · 2020-12-16T11:56:24.537

What you are trying to come up with is a use of placement new allocation for a deque-like container. It's a viable optimization, but usually its done to reduce allocation calls and memory fragmentation, e.g. on some RT or embedded systems. The array maybe even a static array in that case. But if you also require that instances of T would occupy adjacent space, that's a contradicting requirement, resorting them would kill any performance gains.

... beneficial(faster)

Depends on T. E.g. there is no point to do that to something like strings or shared pointers. Or anything that actually allocates resources elsewhere, unless T allows to change that behaviour too.

I wonder if it would be ... possible, if instead of allocating each object individually, I could create a second array on the heap and save the pointer of each object in the first one

Yes it is possible, even with standard ISO containers, thanks to allocators. There is concern of thread safety or awareness if this "array" appears to be shared resource between multiple writer and reader threads. You might want to implement thread-local storages instead of using shared one and implement semaphores for crossover cases.

Usual application for that is to allocate not on heap but in statically allocated array, predetermined. Or in array that was allocated once at start of program.

Note that if you use placement new you should not use delete on created objects, you have to call destructor directly. placement new overload is not a true new as far as delete concerned. You may or may not cause error but you certainly will cause an crash if you used static array and you will cause heap corruption when deleting element that got same address as dynamically allocated array beginning

Brandon · Answer 5 · 2020-12-16T11:41:42.953

This comes at performance cost when constructing and copying, because I need to create the array on the heap and each object of the array on the heap too.

Copying a POD is extremely cheap. If you research perfect forwarding you can achieve the zero cost abstraction for constructors and the emplace_back() function. When copying, use std::copy() as it is very fast.

Is allocating arrays on the heap faster than allocating each object individually?

Each allocation requires you to ask the operating system for memory. Unless you are asking for a particularly large amount of memory you can assume each request will be a constant amount of time. Instead of asking for a parking space 10 times, ask for 10 parking spaces.

Is it safe to allocate objects in an array and forgetting about the array later? (sounds pretty dumb i think)

Depends what you mean by safe. If you can't answer this question on your own, then you must cleanup the memory and not leak under any circumstance.

An example of a time you might ignore cleaning up memory is when you know the program is going to end and cleaning up memory just to exit is kinda pointless. Still, you should clean it up. Read Serge Ballesta answer for more information about lifetime.

`Each allocation requires you to ask the operating system for memory.` not quite true. Heap functionality is more complex and system calls are minimized. Note that system can provide memory in quite large chunks called pages. — Marek R, Dec 16 '20 at 11:48
`Each allocation requires you to ask the operating system for memory.` <- Only if your allocator is extremely simplistic, and even then, as @MarekR notes, not necessarily. Typically, the code behind `new` will go to some lengths to reduce the frequency of system calls. — einpoklum, Dec 16 '20 at 19:20

Is it safe to "dissolve" c++ arrays on the heap?

5 Answers5