I need to find a way to intentionally leak (take ownership of) the internal pointer of a std::vector so that the buffer outlives the original container and can later be freed manually.

Why? I'm working on a networked application using the C ENet library that needs to send large amounts of packets in a short amount of time.

I create network messages by writing the data to a std::vector<unsigned char>.

Then, to create a "packet", I use the enet_packet_create function, which takes a pointer to the byte array to be sent and its size. By default, the function copies the given array into a buffer that it allocates on the heap. There is also a "no allocate" option, which only stores the pointer and size and leaves freeing the data to the creator via a callback function. That is exactly what I'm trying to achieve: the data is already in the vector, ready to be used, so there is no need for a potentially costly second copy.

0x400921FB54442D18
  • *I need to find a way to intentionally leak the internal pointer of a std::vector so that its lifetime surpasses the one of the original container and so that it can be later deleted manually.* -- Sounds like the ultimate [XY Problem](http://xyproblem.info/). – PaulMcKenzie Oct 16 '19 at 14:06
  • Unless ENet is bugged, you should just use `new[]` and `delete[]` in this case. – Quentin Oct 16 '19 at 14:10
  • I would consider extending the vector’s lifetime instead. – molbdnilo Oct 16 '19 at 14:10
  • What exactly stops you from keeping the underlying `std::vector` around while its `data()` pointer is in use by `enet_packet_create`? – Max Langhof Oct 16 '19 at 14:11
  • @PaulMcKenzie 3/4 of my question explains X. – 0x400921FB54442D18 Oct 16 '19 at 14:11
  • And we are telling you that based on your description of `X`, the `Y` you're gunning for is the wrong approach. If you're hellbent on using the footgun you've thought of, your question is answered here: https://stackoverflow.com/questions/56127946/can-i-force-stdvector-to-leave-a-memory-leak?rq=1. But I would advise you to reconsider. – Max Langhof Oct 16 '19 at 14:14
  • According to documentation, `ENetPacket` has a `void* userData` member that may be freely used by the application. Can't you simply store a pointer to the `std::vector` in there and delete it in the callback? – walnut Oct 16 '19 at 14:18
  • Yes, but you started off with the 'Y' -- even the title of your post is 'Y' oriented and raises eyebrows by anyone browsing the C++ question list. – PaulMcKenzie Oct 16 '19 at 14:21
  • @uneven_mark great point, but that forces me to wastefully dynamically allocate the vector too. – 0x400921FB54442D18 Oct 16 '19 at 14:21
  • @MaxLanghof Sure, I could write it on my own with `new` and `delete`, but why reinvent the wheel when this problem has been already efficiently solved in the std library? – 0x400921FB54442D18 Oct 16 '19 at 14:22
  • @0x400921FB54442D18 Then I don't see any way other than using raw dynamic allocated arrays. `std::vector` does not allow you to transfer ownership of the underlying allocation freely. – walnut Oct 16 '19 at 14:24
  • @0x400921FB54442D18 I see that you have considered the available alternatives. Avoiding dynamic allocation of the vector is (plausibly) a valid reason/constraint, and I think if you mentioned that in the question then there would be less raised eyebrows ;) Writing your own dynamic array that you can steal/release the allocated array from may be the next best alternative then. As stated, `std::vector` provides no such interface. – Max Langhof Oct 16 '19 at 14:24
  • @PaulMcKenzie Rust's Vec and Box have a way to leak the memory in this fashion, so I thought C++ could have something similar. I guess not. – 0x400921FB54442D18 Oct 16 '19 at 14:24
  • If the overhead of dynamically allocating vectors is a problem you can resort to the typical solution of a vector pool containing persistent vectors which are allocated once (possibly statically, if you have an upper limit) and are never freed. "Freeing" here would be returning them to the pool for further use, without actually de-allocating any memory. – Peter - Reinstate Monica Oct 16 '19 at 14:28
  • @0x400921FB54442D18 -- Vector's memory is controlled entirely by vector and whatever allocator was used. There is no "give me ownership of your memory" or "here, own this memory I created temporarily" concept for vector. Your best bet is to keep a vector alive and just reuse it when you need to. Probably your code would have a slight speed improvement by holding onto a single vector and just issuing `resize()` calls instead of going through the gauntlet of creating a vector from scratch each time. – PaulMcKenzie Oct 16 '19 at 14:31

4 Answers

You don't need to leak anything. Just use the userData field of the ENetPacket structure to store the to-be-deleted std::vector, and just delete it in the callback:

#include <cstdint>
#include <vector>
#include <enet/enet.h>

// Called by ENet once the packet is no longer needed.
void myCallback(ENetPacket *pkt) {
    // Recover the vector stored in userData; deleting it frees the buffer too.
    delete static_cast<std::vector<std::uint8_t> *>(pkt->userData);
}

void sendData() {
    // Create the vector on the heap so that it is not destroyed when this function returns; its life is extended until the callback is called.
    auto *data = new std::vector<std::uint8_t>;
    // Fill data here
    ENetPacket *pkt = enet_packet_create(data->data(), data->size(), ENET_PACKET_FLAG_NO_ALLOCATE);
    pkt->userData = data;
    pkt->freeCallback = myCallback;
}

The userData void pointer is a common way to carry opaque user data into callbacks, so that the user of the library can recover the context in which the callback was called.

It can point to anything (it is a void*): a state-holding structure for more complex logic in the callback, or just a pointer to data that needs to be freed, as in your case.


From your comments, you say that you don't want to dynamically allocate the vector.

Just remember that the data inside the vector is itself dynamically allocated (unless a custom allocator is used), and the ENetPacket structure is dynamically allocated too (the flag only tells ENet not to allocate the data, not the structure).


Finally, if you know (or can precompute) the size of the data, a different approach would be to create the packet passing a NULL data pointer.

The function enet_packet_create will create the data buffer, and you can just fill the data directly in the packet buffer, without needing a different buffer and then copying it to the packet.

LoPiTaL

This approach is not possible, even if vector<T> provided an interface to let you abscond with its memory. Let's get into why.

Your problem exists because the site where you're going to free the memory is not given arbitrary data. It is only given a pointer to the memory to be freed. If this were not the case, then you'd just pass a pointer to the vector<T> itself to this location, or otherwise smuggle in a vector<T> object itself.

In order to abscond with a vector<T>'s memory and successfully free it, you would have to play by vector<T>'s rules. Which means:

  1. You have to respect the size/capacity distinction. Not all of the memory allocated for a vector<T> actually contains live Ts. So you have to know how many live Ts there are in that memory, so that you can call their destructors properly (we'll get to an issue with that later).

    Now sure, for the very specific case of unsigned char, calling destructors is irrelevant, since they're trivial. But vector<T>'s interface needs to be uniform; if you can abscond with a vector<unsigned char>'s memory, then you must be able to abscond with any vector<T> in the same way. So any absconding interface must provide not just a pointer to the data, but also the size and capacity so that you can properly destroy the members of the container.

  2. You have to respect the Allocator. Remember: the template is vector<T, Allocator>, where Allocator is the type that does the memory allocation/deallocation, as well as creating/destroying the actual Ts in the vector. And since you're allowed to provide specific objects of a particular Allocator instance, any absconding interface must store that specific Allocator object (or copy/move thereof) so that the allocation can be freed.

    Again, the specific case of vector<unsigned char> doesn't care, because the default allocator std::allocator just uses ::operator new/delete to allocate/deallocate memory, and direct placement-new/destructor calls to create/destroy the Ts. But again, a general absconding interface must work with any T and any Allocator. So it must account for all of that.

Which means that, at the end of the day, when you abscond with a vector's memory, that interface must provide an object that stores a pointer to the allocation, the number of live elements in that allocation, the size of that allocation (since the Allocator interface requires that), and the Allocator instance (or copy/move thereof) to use to destroy/deallocate the object.

In short, absconding with a vector<T, Allocator>'s memory means creating a vector<T, Allocator>.

Which you can't do, as stated above. You have arrived at an inherently contradictory situation.
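To make that bookkeeping concrete, here is a minimal sketch of what a released allocation would have to carry. `ReleasedBuffer` and `make_buffer` are hypothetical names invented for illustration; `std::vector` offers nothing of the kind, and the buffer is built by hand here rather than taken from a vector.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical type: it only lists the state that a "released" vector
// allocation would need in order to be cleaned up correctly.
template <class T, class Allocator = std::allocator<T>>
struct ReleasedBuffer {
    using Traits = std::allocator_traits<Allocator>;

    typename Traits::pointer data;  // start of the allocation
    std::size_t size;               // number of live T objects
    std::size_t capacity;           // number of T slots that were allocated
    Allocator alloc;                // allocator that produced the storage

    // Destroy the live elements, then hand the storage back to the allocator.
    void destroy_and_deallocate() {
        for (std::size_t i = 0; i < size; ++i)
            Traits::destroy(alloc, std::addressof(data[i]));
        Traits::deallocate(alloc, data, capacity);
        data = nullptr;
        size = capacity = 0;
    }
};

// Hypothetical helper: builds such a buffer by hand with std::allocator,
// holding `size` copies of `value` in storage for `capacity` elements.
template <class T>
ReleasedBuffer<T> make_buffer(std::size_t capacity, std::size_t size, const T& value) {
    assert(size <= capacity);
    using Traits = std::allocator_traits<std::allocator<T>>;
    std::allocator<T> alloc;
    T* p = Traits::allocate(alloc, capacity);
    for (std::size_t i = 0; i < size; ++i)
        Traits::construct(alloc, p + i, value);
    return ReleasedBuffer<T>{p, size, capacity, alloc};
}
```

The four members are the same state a vector keeps, which is the point: the free site needs all of them, not just the pointer.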

There are two solutions:

  1. Change your code so that you can smuggle in a vector<T> to the location that frees the memory. This could be done via some global/class-scoped/etc map from pointer-to-data to a vector<unsigned char>*. Or some other mechanism. You'll have to figure it out, because it depends on specific aspects of the system that you have not presented (this is the definition of the XY Problem).

  2. Stop using vector<unsigned char>. Instead, just heap-allocate an array of unsigned char, which you can destroy just fine.

Nicol Bolas

I need to find a way to intentionally leak the internal pointer of a std::vector

Only way to leak the internal buffer of std::vector is to leak the vector itself. Example:

std::vector<T>* ptr = new std::vector<T>;
ptr = nullptr; // memory leaked successfully

But leaking memory is not a good idea in general.

I did not literally mean to create a memory leak, the memory needs to be freed.

In this case, the only solution is to make sure that the lifetime of the std::vector is longer than the usage of the buffer. A vector always releases the buffer it owns on destruction, and there is no way to extract ownership from it except into another vector.

One way to achieve that is this:

// stored somewhere with guaranteed longer lifetime than any packet
std::unordered_map<unsigned char*, std::vector<unsigned char>> storage;

void foo()
{
    std::vector<unsigned char> vec;
    // fill vec here
    unsigned char* ptr = vec.data();
    storage[ptr] = std::move(vec);
    auto destroy_callback = [](unsigned char* ptr) {
        storage.erase(ptr);
    };
    // pass ptr and destroy_callback into some async API
}
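The example above relies on the pointer taken from `vec.data()` still being valid after the vector is moved into the map. With the default allocator, moving a vector hands over the buffer instead of copying it, so the address is preserved. A small check of that assumption (`buffer_survives_move` is a hypothetical helper name):

```cpp
#include <utility>
#include <vector>

// Returns true when the buffer address survives both a move construction
// and a move assignment, i.e. the bytes are handed over rather than copied.
bool buffer_survives_move() {
    std::vector<unsigned char> source(1024, 0xAB);
    const unsigned char* before = source.data();

    // Move construction: the new vector takes over the allocation.
    std::vector<unsigned char> constructed(std::move(source));

    // Move assignment into an empty vector, as storage[ptr] = std::move(vec) does.
    std::vector<unsigned char> assigned;
    assigned = std::move(constructed);

    return assigned.data() == before && assigned.size() == 1024;
}
```

This holds for `std::allocator`; a custom allocator that does not propagate on move assignment may force an element-wise move instead.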

You could use a pool allocator to avoid redundant allocations for each packet.

Example adapted from this answer (now that this question has shifted from leaking to transferring ownership, this is close to a duplicate). There's also an alternative suggestion in another answer to that same question, which uses a custom allocator that "steals" the ownership.

eerorika
  • This creates a memory leak of `sizeof(std::vector)`. – 0x400921FB54442D18 Oct 16 '19 at 14:08
  • I did not literally mean to create a memory leak, the memory needs to be freed. Perhaps saying that I want to take ownership of the containers memory would be better. – 0x400921FB54442D18 Oct 16 '19 at 14:09
  • @0x400921FB54442D18: In order to take ownership of that memory, you'd need to keep something around that would allow you to free it, right? Like a pointer to the memory? So why can't you just keep the *vector itself* around instead of a pointer? – Nicol Bolas Oct 16 '19 at 14:15
  • @NicolBolas Where do I store the vector so that it can be freed? ENetPacket needs `unsigned char* data` and `size_t dataLength` (which is the raw data to be sent to the other side) alongside a callback that frees `data`. – 0x400921FB54442D18 Oct 16 '19 at 14:17
  • @0x400921FB54442D18 `std::vector` just isn't designed to do that. – François Andrieux Oct 16 '19 at 14:18
  • @0x400921FB54442D18 A more or less straightforward solution would be a map from `unsigned char*` to `std::vector`, with the vector value being moved in (so no copy/reallocation happens during construction of the map entries). But that's certainly not the only way. – Max Langhof Oct 16 '19 at 14:18
  • Maybe you could keep the `std::vector` in a `std::map<unsigned char*, std::vector<unsigned char>>`. Then the freeing callback could look up the vector in the map and erase it. – François Andrieux Oct 16 '19 at 14:19
  • @0x400921FB54442D18: "*@NicolBolas Where do I store the vector so that it can be freed?*" That's between you, your networking library, and the architecture of your system. – Nicol Bolas Oct 16 '19 at 14:26

The following is not an answer! It's yet another attempt to convince you to rethink your approach, but it's too long for a comment. (Having said that, I must say that I love this kind of hack when it's just for fun, but I hate it even more strongly when it goes into production code.)

According to the OP, the motivation for using the "no alloc" option is to avoid allocating memory and copying bytes inside enet_packet_create. This raises the question: why use a vector?

If you create a vector without fixing its capacity up front (with reserve or resize) and instead let it grow as you add elements, then each time the capacity increases the vector allocates memory and copies bytes, which is exactly what you want to avoid.
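A rough way to see this is to count how often the capacity changes while appending. This is only a sketch; `count_reallocations` is a hypothetical helper, and the growth policy (and so the exact count without `reserve`) depends on the standard library implementation.

```cpp
#include <cstddef>
#include <vector>

// Appends `count` bytes and returns how many times the capacity changed,
// which is how many times the vector had to move to a new buffer.
std::size_t count_reallocations(std::size_t count, bool reserve_first) {
    std::vector<unsigned char> bytes;
    if (reserve_first)
        bytes.reserve(count);  // one allocation up front

    std::size_t reallocations = 0;
    std::size_t last_capacity = bytes.capacity();
    for (std::size_t i = 0; i < count; ++i) {
        bytes.push_back(static_cast<unsigned char>(i));
        if (bytes.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = bytes.capacity();
        }
    }
    return reallocations;
}
```

With `reserve_first` set, the function returns 0 because `push_back` never exceeds the reserved capacity; without it, the vector grows several times on the way to a large message.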

Perhaps you know from the beginning what the final size of the vector will be. In that case, you can avoid all copies and memory allocations (but one) by reserving that size up front. But then why not simply use new[] and delete[], as Quentin has suggested? You wouldn't have to steal memory since it would be yours. Even better, you can create a unique_ptr<unsigned char[]> (consider make_unique<unsigned char[]>), call its release method just before calling enet_packet_create to "steal" the memory, and later call delete[] to free it.
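A minimal sketch of that last suggestion, using only the standard library. `build_message` and `free_bytes` are hypothetical names; in particular, `free_bytes` is a stand-in for the free callback, since ENet's real callback receives an `ENetPacket*` and would pass the packet's data pointer to `delete[]`.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical stand-in for the free callback: the bytes came from new[],
// so they must be released with delete[].
void free_bytes(unsigned char* data) {
    delete[] data;
}

// Builds a message of known size in a unique_ptr-owned array. If anything
// throws while filling it, the array is freed automatically. release() then
// hands the raw pointer to the caller, who becomes responsible for it.
unsigned char* build_message(const char* text, std::size_t size) {
    std::unique_ptr<unsigned char[]> buffer = std::make_unique<unsigned char[]>(size);
    std::memcpy(buffer.get(), text, size);
    return buffer.release();
}
```

The pointer returned by `build_message` is what would be passed to the packet-creation call with the no-allocate flag; ownership stays with the caller until `free_bytes` runs.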

Cassio Neri