12

Let's say I have a function that fills a std::vector with data:

void getData(std::vector<int> &toBeFilled) {
  // Push data into "toBeFilled"
}

Now I want to pass this data to another function, which frees the data when it is finished:

void useData(int* data)
{
  // Do something with the data...
  delete[] data;
}

Both functions (getData and useData) are fixed and cannot be changed. This works fine when copying the data once:

{
  std::vector<int> data;
  getData(data);
  int *heapData = new int[data.size()];
  memcpy(heapData, data.data(), data.size()*sizeof(int));
  useData(heapData);
  data.clear();
}
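The copy above can be kept exception-safe by holding the new buffer in a std::unique_ptr<int[]> until the moment it is handed over. This is only a sketch: `copyToHeap` is a hypothetical helper name, and the `useData` shown here is a stand-in that mirrors the fixed function from the question.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

// Stand-in for the fixed consumer from the question: it takes ownership of
// the array and frees it with delete[].
void useData(int* data) {
  assert(data != nullptr);
  // Do something with the data...
  delete[] data;
}

// Hypothetical helper (not part of the original code): copies the vector's
// elements into a new[]-allocated array that is owned by a unique_ptr until
// the caller releases it.
std::unique_ptr<int[]> copyToHeap(const std::vector<int>& src) {
  std::unique_ptr<int[]> buffer(new int[src.size()]);
  std::copy(src.begin(), src.end(), buffer.get());
  return buffer;
}

// Hypothetical wrapper showing the hand-over: release() gives up ownership,
// so the delete[] inside useData is the only deallocation of the array.
void copyAndHandOver(const std::vector<int>& src) {
  useData(copyToHeap(src).release());
}
```

The copy itself remains; only the ownership handling changes, so nothing leaks if an exception is thrown between the allocation and the call to useData.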

However, this memcpy operation is expensive and not really required, since the data is already on the heap. Is it possible to directly extract and use the data allocated by the std vector? Something like (pseudocode):

{
  std::vector<int> data;
  getData(data);
  useData(data.data());
  data.clearNoDelete();
}

Edit:

The example may not make much sense as written, since it would be possible to simply free the vector after the call to useData. However, in the real code, useData is not a function but a class that receives the data, and this class lives longer than the vector...

Jan Rüegg
  • 4
    I'm afraid you can't do that. No way the vector can be emptied without its memory being released... – jpo38 Oct 23 '14 at 09:48
  • 24
    What sort of mad API is this?! – Lightness Races in Orbit Oct 23 '14 at 09:49
  • 1
    @LightnessRacesinOrbit +1 to that, to be sure. And Jan, you don't *know* that data is on the heap. The standard only mandates it is contiguous and random-accessable (and a few other things). Like `std::string`, it would not be unheard of for a small-item-count vector with a reasonable small object-static buffer for placement-new, resorting to fully-dynamic once that page is deemed too small. Such an implementation would blow up *severely* under the auspices of the usage you seek. – WhozCraig Oct 23 '14 at 09:49
  • @Lightness, the "useData" function is actually an image class that takes a raw pointer for its data. This makes it possible to take data coming from anywhere without making a copy of it, and freeing it in the destructor of the image class... – Jan Rüegg Oct 23 '14 at 09:50
  • Instead of having a vector<int>, you could create a vector<int*>; you'd add an extra indirection that should solve your problem. The objects will be destroyed by useData. Be really careful not to use the pointers after calling useData though. – dau_sama Oct 23 '14 at 09:54
  • @dau_sama: No, because now his data is not in the proper format for `useData` and he still needs to reconstruct them in contiguous form. – Lightness Races in Orbit Oct 23 '14 at 09:56
  • Are you bound to just `std::vector<int>`, or could you provide a vector with a different allocator (i.e. a `std::vector` with a custom allocator as its second template argument)? – Angew is no longer proud of SO Oct 23 '14 at 09:58
  • @Angew: Hmm... that's a good idea with the allocator. If it's possible to use a vector with a custom allocator with the function getData, without changing the signature of getData, then it should be fine... but will this work? – Jan Rüegg Oct 23 '14 at 10:00
  • @JanRüegg If the signature is exactly as you gave it, then unfortunately a custom allocator is not an option. – Angew is no longer proud of SO Oct 23 '14 at 10:07
  • @JanRüegg: It's not, because the Allocator is part of the `std::vector` type (as a template parameter) and you've said `useData` cannot be changed. [This question](http://stackoverflow.com/q/26521012/560648) is posted with interestingly good timing. – Lightness Races in Orbit Oct 23 '14 at 10:11
  • 1
    Well, you could always new() that vector so that its destructor is never implicitly called. I take it that the image is big enough to exclude stack allocation of the vector's data. I'd also assume (but one would need to verify) that useData()'s delete[] (as I assume) is compatible with std::vector's allocation. One could never call the vector's destructor though because it would attempt to free that memory again. Therefore the dynamically allocated vectors would be a memory leak. If you call useData() 26 times a second that may become a problem. – Peter - Reinstate Monica Oct 23 '14 at 10:18
  • Well, actually you could use a custom allocator for the vector itself which would take care of the memory leak (assuming one could re-use the space taken up by the vector -- not its data! -- after its use). – Peter - Reinstate Monica Oct 23 '14 at 10:20
  • 3
    @WhozCraig: The current standard does not permit "small vector optimization" because the move operations must not invalidate iterators and moving a small vector would require a copy. – MadScientist Oct 23 '14 at 12:18
  • It is possible for vector to allocate more memory than asked, and use the beginning of that memory to store size and capacity. In that case, `data()` returns a pointer in the middle of the allocated region, and passing it to `delete` may crash before even reaching the destructor. – Marc Glisse Oct 24 '14 at 10:33

2 Answers

26

No.

The API you're using has a contract that states it takes ownership of the data you provide it, and that this data is provided through a pointer. This basically rules out using standard vectors.

Vector will always assuredly free the memory it allocated and safely destroy the elements it contains. That is part of its guaranteed contract and you cannot turn that off.

You have to make a copy of the data if you wish to take ownership of it, or move each element out into your own container. Alternatively, start with your own new[] in the first place (ugh), though you can at least wrap all this in some class that mimics std::vector and can become non-owning.
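The last option could look roughly like the sketch below: a small growable buffer whose storage always comes from new[], with a `release()` member that gives up ownership. The class name `ReleasableIntBuffer` and its interface are hypothetical and only illustrate the idea. It helps only where the producer can be written against such a type, which the fixed getData signature in the question does not allow.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical class (illustrative only): a minimal vector-like int buffer
// backed by new[], so its storage can be handed to an API that calls delete[].
class ReleasableIntBuffer {
public:
  ReleasableIntBuffer() = default;
  // Non-copyable; declaring these also suppresses the implicit move operations.
  ReleasableIntBuffer(const ReleasableIntBuffer&) = delete;
  ReleasableIntBuffer& operator=(const ReleasableIntBuffer&) = delete;

  void push_back(int value) {
    if (size_ == capacity_) grow();
    assert(size_ < capacity_);
    storage_[size_++] = value;
  }

  std::size_t size() const { return size_; }
  const int* data() const { return storage_.get(); }

  // Gives up ownership of the array. The buffer becomes empty, and whoever
  // receives the pointer is responsible for calling delete[] on it.
  int* release() {
    size_ = 0;
    capacity_ = 0;
    return storage_.release();
  }

private:
  // Doubles the capacity (starting at 8) and copies the existing elements.
  void grow() {
    const std::size_t newCapacity = capacity_ == 0 ? 8 : capacity_ * 2;
    std::unique_ptr<int[]> bigger(new int[newCapacity]);
    std::copy(storage_.get(), storage_.get() + size_, bigger.get());
    storage_ = std::move(bigger);
    capacity_ = newCapacity;
  }

  std::unique_ptr<int[]> storage_;
  std::size_t size_ = 0;
  std::size_t capacity_ = 0;
};
```

With such a type, a call like `useData(buffer.release());` hands the array over without a final copy, and the buffer's destructor has nothing left to free. If `release()` is never called, the std::unique_ptr member frees the array as usual.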

Lightness Races in Orbit
  • Hmm... I thought maybe there is some way to do "swap(data, heapData)" or similar... – Jan Rüegg Oct 23 '14 at 09:51
  • 2
    *"Vector will always assuredly free the memory it allocated and safely destroy the elements it contains. ...you cannot turn that off."* - you can placement `new` an empty `vector` over the original `vector` such that the destructor - when run - doesn't know about the earlier allocation; the real problem here is the default allocator doesn't use `new[]`, even if it did `.data()` might not yield the same value, so `delete[]` on the `.data()` value is unsafe, and writing a custom allocator to fix all that involves unavoidable inefficiencies + a `reinterpret_cast` to call `getData()`. – Tony Delroy Oct 23 '14 at 12:22
  • 4
    @TonyD: That's verrrry UB. I consider it a false solution. – Lightness Races in Orbit Oct 23 '14 at 14:27
  • 1
    @TonyD - the real problem here is that `std::vector` owns its data. Period. Sure, you can hack around with undefined behavior and get something that seems to work, but if you do that, you're on your own. – Pete Becker Oct 23 '14 at 15:46
  • 1
    "I consider it a false solution" - I was actually explaining that despite the fact many parts of the problem can be solved, there isn't a portable/Standard solution: consider ".data() might not yield the same value" + need for "a `reinterpret_cast`. As Pete says - those are consequences of the `vector`'s ownership, but still different from your narrower and false claim that "you cannot turn [vector's freeing the memory and destroying the elements] off". – Tony Delroy Oct 23 '14 at 16:34
7

Here's a horrible hack which should allow you to do what you need, but it relies on Undefined Behaviour doing the simplest thing it can. The idea is to create your own allocator which is layout-compatible with std::allocator and type-pun the vector:

template <class T>
struct CheatingAllocator : std::allocator<T>
{
  using typename std::allocator<T>::pointer;
  using typename std::allocator<T>::size_type;

  void deallocate(pointer p, size_type n) { /* no-op */ }

  // Do not add ANY data members!!
};


{
  std::vector<int, CheatingAllocator<int>> data;
  getData(reinterpret_cast<std::vector<int>&>(data)); // type pun, `getData()` will use std::allocator internally
  useData(data.data());
  // data actually uses your own allocator, so it will not deallocate anything
}

Note that it's as hacky and unsafe as hacks go. It relies on the memory layout not changing and it relies on std::allocator using new[] inside its allocate function. I wouldn't use this in production code myself, but I believe it is a (desperate) solution.


@TonyD correctly pointed out in the comments that std::allocator is quite likely to not use new[] internally. Therefore, the above would most likely fail on the delete[] inside useData(). The same @TonyD also made a good point about using reserve() to (hopefully) prevent reallocation inside getData(). So the updated code would look like this:

template <class T>
struct CheatingAllocator : std::allocator<T>
{
  using typename std::allocator<T>::pointer;
  using typename std::allocator<T>::size_type;

  pointer allocate(size_type n) { return new T[n]; }

  void deallocate(pointer p, size_type n) { /* no-op */ }

  // Do not add ANY data members!!
};


{
  std::vector<int, CheatingAllocator<int>> data;
  data.reserve(value_such_that_getData_will_not_need_to_reallocate);
  getData(reinterpret_cast<std::vector<int>&>(data)); // type pun, `getData()` will use std::allocator internally
  useData(data.data());
  // data actually uses your own allocator, so it will not deallocate anything
}
Angew is no longer proud of SO
  • Thanks a lot for this (very creative) solution :D But you're right, I'll probably have to go with some sort of big refactoring instead... – Jan Rüegg Oct 23 '14 at 10:26
  • There are a couple of problems with this... first, if `getData()` does anything to trigger a `resize`, the deactivated `deallocate` will leak the old memory region after data is copied to the new region. Secondly, allocators aren't expected to use `new[]`, and the memory from your custom one's inherited `allocate` function therefore can't be `delete[]`-ed even if the same pointer happens to be yielded by `.data()`. If the custom allocator invokes `new[]` itself, and also no-ops `.construct` and `.destroy`, I think you're left with the `reinterpret_cast` being the only source of undefined behaviour - pretty good. – Tony Delroy Oct 23 '14 at 12:06
  • 1
    @TonyD I think there's a lot more UB in there. `getData()` will *not* invoke `CheatingAllocator`, it will (presumably) invoke all operations on `std::allocator`, as it believes to operate on `std::vector<int, std::allocator<int>>`. That's why I said it relies on `std::allocator` using `new[]` internally (which, given the header-only nature of `std`, can at least be verified). – Angew is no longer proud of SO Oct 23 '14 at 12:16
  • 1
    Solid point, though in some cases a prior `reserve()` might be a practical way to avoid allocator use by `getData()`. I don't think any sane implementation of `std::allocator` would use `new[]` internally - that would mean a `reserve` had to default construct all elements, and `capacity()` rather than `size()` destructors running during cleanup, also hard to reconcile with the need for distinct `.construct` and `.destroy` members. – Tony Delroy Oct 23 '14 at 16:39