8

This is a question about the interaction of stack memory and heap memory and the particular case of going from stack to heap via the std::array and std::vector classes.

In principle std::array<T> can be seen as a pointer to the first elements, plus some compile time information about the size of the array. Would it be possible to have std::vector<T> constructor that takes into account that fact and tries to move contents of the array into the vector just by copying the pointer.

A use case would be, that one has a function that returns a std::array<double, >

std::array<double, 20> fun(){...};

but one later decides to assign it to a std::vector without the necessity of copying element by element.

std::vector<double> v = fun(); // not working code

Right now one has to do

std::array<double, 20> tmp = fun();
std::vector<double> v(tmp.begin(), tmp.end());

Which actually does some redundant work which wouldn't be necessary if this were possible std::vector<double> v(std::move(tmp)); \\ not working code.

The memory layout of std::vector and std::array is the same, so that is not and obstacle.

I understand that the main obstacle could be that std::array elements are in the stack while std::vector elements are in the heap. It is clear that even if one writes the move constructor for std::vector still the memory from the stack will be irrevocably destructed.

So I guess that this question can also be read as:

Is there a way to move memory from the stack to the heap (whatever that means) and if that can be combined with a move constructor?

Or if std::vector can have in principle a move constructor from a std::array?

MWE:

#include<array>
#include<vector>

std::array<double, 20> fun(){return {};} // don't change this function

int main(){
    std::array<double, 20> arr = fun(); // ok
    std::vector<double> v(arr.begin(), arr.end()); // ok, but copies and the allocation is duplicated
    std::vector<double> v2 = fun(); // not working, but the idea is that the work is not duplicated
}
alfC
  • 14,261
  • 4
  • 67
  • 118
  • What if you _move_ the array into a vector and then _push_back()_ a new element? – Antonio Pérez Dec 01 '15 at 11:29
  • @AntonioPérez, good point, but I guess `std::vector` will behave as it usually does and reallocate (reserve) more memory (possible somewhere else in memory). The optimization will be lost but that is not the case of the question. – alfC Dec 01 '15 at 11:31
  • Maybe it would be possible in a limited way for `vector` classes using the "small vector optimization". http://www.boost.org/doc/libs/1_60_0/doc/html/container/non_standard_containers.html#container.non_standard_containers.small_vector . AFAIK that still excludes `std::vector` which doesn't use this optimzation (but `std::basic_string` does). Maybe also a certain specialization of `std::vector` with a particular (fake) allocator like in one of the comments below. – alfC Jan 14 '16 at 02:16

2 Answers2

8

It seems like you want to tell std::vector to use the std::array data as its underlying buffer, at least until it needs to do some reallocation.

std::vector does not have an interface for this. It is supposed to manage its internal buffer itself, so memory is allocated, tracked and deleted in a unified manner. If you could provide a buffer to use, you would also need to supply information about how it was allocated, if it might get destroyed on leaving a scope, etc. This is error prone and ugly, so is not available.

What you can do is construct the std::vector with std::move_iterators to move the contents out of the std::array. Of course, this won't make a difference for arithmetic types, but for logically large objects which are cheap to move it can avoid a lot of data copying:

std::array<BigThing, 20> a = fun();
std::vector<BigThing> b { std::make_move_iterator(a.begin()),
                          std::make_move_iterator(a.end())) };
TartanLlama
  • 63,752
  • 13
  • 157
  • 193
  • 3
    Well you can use a custom allocator, like Howard Hinnant's `short_alloc`. The main problem I see is that `vector` manages its own size, and you need hacks to initialize it with an already filled array (like `resize` plus a custom allocator::construct that doesn't perform any initialization when asked for value-init). – dyp Dec 01 '15 at 11:35
  • @dyp, yes, I though about the custom allocator that will *simulate* memory allocation from the provided array. The problem (maybe unavoidable) is that at any point (depending on the scope) `std::array` itself can disappear. (I didn't mention the custom allocator hack to keep the question simple). One option could be to somehow make the the array destructor call "disappear", but even if that were possible it would violate some constrains in how the stack behaves (see @Bartek's answer). – alfC Dec 01 '15 at 11:43
  • I didn't know about `move_iterator` http://en.cppreference.com/w/cpp/iterator/move_iterator – alfC Dec 01 '15 at 11:44
  • @dyp is that possible without breaking the contract of an allocator? – TartanLlama Dec 01 '15 at 11:45
  • @TartanLlama, what would be the contract of an allocator? (Honest question: can one have a allocator that doesn't allocate but simply returns an existing pointer?) – alfC Dec 01 '15 at 11:46
  • @alfC there are a bunch of requirements on how allocators should behave outlined in `[allocator.requirements]` in the spec. I'm not at all familiar with the requirements, hence the question. – TartanLlama Dec 01 '15 at 11:47
  • I remember there was at least one SO question about a faster resize by using default-init instead of value-init. Currently, the contract of allocator::construct is under-specified anyway, I think it is valid if you "return" an already constructed object. – dyp Dec 01 '15 at 11:49
  • 1
    Found it: http://stackoverflow.com/a/21028912/ see also: http://stackoverflow.com/a/15975738/ – dyp Dec 01 '15 at 11:55
5

Is there a way to move memory from the stack to the heap (whatever that means) and if that can be combined with a move constructor?

I personally love the "whatever that means" bit. Let's ponder on this for a while. Moving something from stack to heap would suddenly mean that that part of the stack suddenly becomes marked as a heap-allocated region and subject to regular destruction.

The problem with that is that stack is contiguous and is destroyed by, well popping things off it. You can't just say "hey, leave this memory bit out" - any consecutive stack allocation and deallocation would need to jump "over" that part.

To illustrate:

|                      |
|----------------------|
| stack block 1        |
|----------------------|
| your vector          |
|----------------------|
| stack block 2        |
|----------------------|
|-                    -|

If you wanted to unwind those two blocks, you'd need to first decrease the stack pointer by the size of the block 2 pointer, then by the size of your vector and the block 1. This really isn't something that can happen.

Hence, the only feasible solution here is to make a copy into the heap memory region. However, those copies are much faster than a lot of people expect. Even if the vector has a few megabytes, the memory controller can just swap around some pages, I suppose, and not have to physically send electrical signals corresponding to the bits of the data.

Besides, any resize of the vector would need to cause a reallocation either way. Since the array occupies precisely as much memory as it needs, even adding a single element would trigger the copy you're trying to avoid so much.

Bartek Banachewicz
  • 38,596
  • 7
  • 91
  • 135
  • Great answer. I guess everything boils down to "stack and heap can only communicate by copying". An interesting detail is that in the use case the memory from the array is at the top of stack while transferred (copied or moved) to the `std::vector. – alfC Dec 01 '15 at 11:37
  • Yes, if the vector grows later the optimization will be lost because the memory will have to be reallocated. But only then, if it stays the same then it is still ok. – alfC Dec 01 '15 at 11:38
  • 1
    _"Even if the vector has a few megabytes, the memory controller can just swap around some pages, I suppose, and not have to physically send electrical signals corresponding to the bits of the data."_ I don't think so. By the time the vector starts copying bits the pages will already have been allocated by `malloc` (or similar) and the OS has no idea when reserving those pages that they will eventually end up containing the same data as some part of the stack. You're also making assumptions about being aligned to page boundaries, and fitting several megabytes on the stack. – Jonathan Wakely Dec 01 '15 at 11:45
  • 1
    It's also possible to have non-contiguous stacks (search for "split stacks" or "segmented stacks") but it wouldn't really help in this case. – Jonathan Wakely Dec 01 '15 at 11:49