3

I have a std::vector<byte> object and I want to extract its data without copying. It may contain megabytes of data, so copying would cost performance. Is it possible to extract the data from the vector and make it forget about the data, so that it doesn't free the memory on destruction? Hope for your help! Thanks in advance!

P.S.: "extract" in this case means just getting a raw pointer to the data and making the vector forget about it (i.e., not freeing the memory on destruction).

zenno2
  • You mean _all_ the data, or just some of it? You can just swap the contents with an empty vector if you want to take everything ... – Useless Jul 12 '21 at 14:24
  • @Useless Since he used the term *extract*, I think he means a part of the data. – Afshin Jul 12 '21 at 14:25
  • 1
    That's why I asked for clarification. – Useless Jul 12 '21 at 14:26
  • 1
    @Useless, I mean all the data, but I want to extract it as a raw pointer, not into another vector – zenno2 Jul 12 '21 at 14:30
  • 3
    There is no way to acquire the memory used by a vector. – Jarod42 Jul 12 '21 at 14:35
  • Why not just move around a vector? Moving is cheap. – Marek R Jul 12 '21 at 14:36
  • 1
    your question reads a little bit like "I want to copy elements of a vector, but I don't want to copy elements of the vector" ;). Consider whether you need copies in the first place. When you pass iterators instead of containers, algorithms can seamlessly work on subranges of the whole container without ever making a single copy. – 463035818_is_not_an_ai Jul 12 '21 at 14:37
  • Since `std::vector`s have contiguous memory, making it forget a part in the middle isn't possible. You could perhaps use a `std::list` instead. – Ted Lyngmo Jul 12 '21 at 14:39
  • @463035818_is_not_a_number, I get a vector from a library that I can't change, but I don't want to operate with vectors in my code, I want to get a raw pointer to the data and operate with it – zenno2 Jul 12 '21 at 14:41
  • 1
    @zenno2 You can use iterators. Perhaps `std::span` would help too. – Ted Lyngmo Jul 12 '21 at 14:46
  • 2
    That's all no reason against my advice. A raw pointer is an iterator. You can get pointers to subranges in a vector. It isn't obvious why you want to copy or extract something – 463035818_is_not_an_ai Jul 12 '21 at 14:46
  • Does this answer your question? [Can I std::move() an element out of a std::vector?](https://stackoverflow.com/questions/23118391/can-i-stdmove-an-element-out-of-a-stdvector) – francesco Jul 12 '21 at 15:22

4 Answers

4

No, it is not possible to extract part of the data from a vector, as far as I know.

It is not compatible with the structure of a vector, which stores its data in one contiguous block of memory. Because std::vector memory is contiguous, moving part of it elsewhere would require shifting the remainder to keep it contiguous, which is a significant cost in itself.

I personally suggest passing the main vector by pointer/reference and using the required parts directly as needed.

If you need to move the whole contents of a std::vector into another vector, you can just use std::move(). You can also use std::swap() to swap the contents of two vectors.
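
As a rough sketch of the "use the required parts directly" suggestion: the vector keeps ownership, and the rest of the code only sees a raw pointer and a length. The names ByteView, view_of and sum_bytes are made up for illustration, and `byte` is assumed to be an alias for unsigned char because the question does not say which byte type is used.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: the question does not specify `byte`; unsigned char is used here.
using byte = unsigned char;

// Non-owning view: a raw pointer plus a length. The vector still owns the memory,
// so the view is only valid while the vector is alive and not reallocated.
struct ByteView {
    const byte* data;
    std::size_t size;
};

// Hypothetical helper: view `count` bytes starting at `offset`, without copying.
inline ByteView view_of(const std::vector<byte>& v, std::size_t offset, std::size_t count) {
    assert(offset <= v.size() && count <= v.size() - offset);
    return ByteView{v.data() + offset, count};
}

// Example consumer that works purely on the raw pointer and length.
inline unsigned long sum_bytes(ByteView view) {
    unsigned long total = 0;
    for (std::size_t i = 0; i < view.size; ++i) {
        total += view.data[i];
    }
    return total;
}
```

With C++20, std::span (mentioned in the comments on the question) plays the same role as the hand-written ByteView.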

Afshin
  • well, one can move a range of elements and then erase those elements from the vector, I think this would fit what OP calls "extract" – 463035818_is_not_an_ai Jul 12 '21 at 14:31
  • 1
    @463035818_is_not_a_number moving a `std::byte` is a copy though, so in this case that ability doesn't help :( – NathanOliver Jul 12 '21 at 14:32
  • @463035818_is_not_a_number but after the erase, the remaining elements of the main vector are copied/moved. So I'm not sure there is much benefit in doing so. I think passing the main vector by pointer/reference and using the required part is better. – Afshin Jul 12 '21 at 14:35
  • 1
    @NathanOliver oh i missed the `std::byte` detail. – 463035818_is_not_an_ai Jul 12 '21 at 14:35
2

I have a std::vector<byte> object and I want to extract data from it without copying

You can move the entire contents of a vector ... into a different vector. Or you can swap (the contents of) two vectors.

std::vector<byte> v = get_a_big_vector();
std::vector<byte> w = std::move(v); // now w owns the large allocation
std::vector<byte> x;
std::swap(x, w); // now x owns the large allocation, and w is empty

That's it. You can't ask a vector to release its storage, and you can't somehow "take" just a portion of a contiguous allocation without affecting the rest.

You can move-assign some sub-range of elements, but that's only different from copying if the elements are some kind of object with state stored outside the instance (e.g., a long std::string).

If you really need to take just a sub-range and let the rest be deallocated, then a vector isn't really the right data type. Something like a rope is designed for this, or you can just split your single contiguous vector into a vector of 1 MB (or whatever) chunk indirections. This is actually something like a deque (although you can't steal chunks from std::deque either).
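
A minimal sketch of the "vector of chunk indirections" idea, under stated assumptions: ChunkedBuffer and steal_chunk are hypothetical names, `byte` is assumed to be unsigned char, and each chunk is a separately allocated array held by a std::unique_ptr so that one chunk can be handed over without touching the others.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Assumption: the question does not specify `byte`; unsigned char is used here.
using byte = unsigned char;

// Hypothetical container: the data is split into fixed-size chunks,
// each owned through its own unique_ptr.
struct ChunkedBuffer {
    std::size_t chunk_size;
    std::vector<std::unique_ptr<byte[]>> chunks;

    ChunkedBuffer(std::size_t chunk_count, std::size_t bytes_per_chunk)
        : chunk_size(bytes_per_chunk) {
        chunks.reserve(chunk_count);
        for (std::size_t i = 0; i < chunk_count; ++i) {
            // Each chunk is value-initialized, so its bytes start at zero.
            chunks.push_back(std::unique_ptr<byte[]>(new byte[bytes_per_chunk]()));
        }
    }

    // Hands chunk `i` to the caller without copying. The slot is left empty
    // (nullptr) and the caller becomes responsible for calling delete[].
    byte* steal_chunk(std::size_t i) {
        return chunks.at(i).release();
    }
};
```

The trade-off is that the data as a whole is no longer contiguous; only each individual chunk is.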

Useless
1

The Pointer Heist

This method uses a union to steal the data pointer from the std::vector and then prevents the vector's destructor from being called.

Releasing the memory correctly is the tricky part, and it is not covered here, but you need functionality similar to what std::vector provides.

I strongly recommend you use std::move to change ownership instead.
But perhaps this technique can be used for other purposes as well?

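#include <utility>
#include <vector>

template<typename TElement>
TElement* the_pointer_heist(std::vector<TElement>& victim)
{
    // A union does not destroy its members automatically,
    // so the destructor of `target` never runs.
    union Theft
    {
        std::vector<TElement> target;
        ~Theft() {} // intentionally empty: the buffer is never freed here
    } place_for_crime = {std::move(victim)};
    // The buffer is leaked unless the caller manages to free it correctly.
    return place_for_crime.target.data();
}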
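
For comparison, here is a sketch of the std::move route recommended above that still yields a raw pointer and has a well-defined way to release the memory. It is not the technique shown above: the vector is moved into a heap-allocated "keeper" vector, which continues to own the buffer until it is deleted. DetachedBuffer, detach and release are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical handle: raw pointer and size for everyday use, plus the
// heap-allocated vector that still owns the memory behind the scenes.
template <typename T>
struct DetachedBuffer {
    T* data;
    std::size_t size;
    std::vector<T>* keeper;
};

// Moves the contents of `source` into a heap-allocated vector (no element copy)
// and returns a raw pointer to the same buffer. `source` is left empty.
template <typename T>
DetachedBuffer<T> detach(std::vector<T>& source) {
    std::vector<T>* keeper = new std::vector<T>(std::move(source));
    return DetachedBuffer<T>{keeper->data(), keeper->size(), keeper};
}

// Frees the buffer by destroying the keeper vector, then clears the handle.
template <typename T>
void release(DetachedBuffer<T>& buffer) {
    delete buffer.keeper;
    buffer = DetachedBuffer<T>{nullptr, 0, nullptr};
}
```

The original vector can be destroyed at any point after detach without affecting the data; the memory lives until release is called on the handle.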
Betaloid
-2

I think the best way is to use an object-oriented approach. You can wrap the byte data in a class together with other information, such as a flag that marks it as skipped or forgotten:

class Data
{
public:
   Data(byte d)
   {
       data = d;
       forget = false;
   }
   byte data;
   bool forget;
};

Then just add pointers to Data objects to the vector, like

vector<Data*> data;
data.push_back(new Data(1));    
data.push_back(new Data(2));
// and so on

You can extract data without copying by just getting a pointer to a specific element of the vector:

Data *d = data[index];
d->forget = true;

You can use the forget flag to make it forgettable. Of course, you have to check the forget flag yourself when searching the vector. You can use std::find_if with a lambda expression for this purpose.

Keep in mind that you have to free the memory when the data is no longer used.

Marco Beninca
  • 1
    This requires changing the `std::vector` which OP does not have control over, so they can't use this. Besides that, this solution makes the byte data non-contiguous (each byte is separated by a `bool` and potentially by padding). And this doesn't perform any extraction. The point is to take ownership of the underlying data, to prevent `std::vector` from freeing it when it is destroyed. Finally, this solution proposes using a raw owning pointer. If dynamic object creation was useful here, it should use `std::unique_ptr` instead. But it isn't, you don't allocate each byte individually. – François Andrieux Jul 12 '21 at 14:54