C++ dynamic loading of classes: Why is a "destroy" function needed?

Question

This page examines and gives a very clear example of how to dynamically load and use a class. There is something I have a hard time understanding, though:

I understand why the "create" function is needed, but why is a "destroy" function needed? Why is declaring the interface destructor as pure virtual not enough?

I made an identical example with the exception of:

virtual ~polygon() = 0;

The destructor for triangle is:

triangle::~triangle() {
    std::cout << "triangle Dtor is called" <<std::endl;
}

then when I use:

delete poly;

the message is indeed shown (GCC 5.4.0 under linux).
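For reference, the experiment described above can be reproduced in a single file. This is a minimal sketch and not the code from the linked page: the `area()` member, the `dtor_log` vector, and the `delete_through_base()` helper are assumptions added for illustration. It only shows that the destructor is found through the vtable; it says nothing about which deallocation function ends up releasing the memory.

```cpp
#include <iostream>
#include <string>
#include <vector>

// Hypothetical log that records the order in which destructors run.
std::vector<std::string> dtor_log;

class polygon {
public:
    virtual ~polygon() = 0;            // pure virtual: polygon stays abstract
    virtual double area() const = 0;   // illustrative member, not from the question
};

// A pure virtual destructor still needs a definition, because every derived
// destructor calls the base destructor implicitly.
polygon::~polygon() { dtor_log.push_back("polygon"); }

class triangle : public polygon {
public:
    ~triangle() override {
        std::cout << "triangle Dtor is called" << std::endl;
        dtor_log.push_back("triangle");
    }
    double area() const override { return 0.5; }
};

// Deletes through the interface pointer and returns the destructor order.
std::vector<std::string> delete_through_base() {
    dtor_log.clear();
    polygon* poly = new triangle;
    delete poly;   // virtual dispatch: ~triangle() runs first, then ~polygon()
    return dtor_log;
}
```

Calling `delete_through_base()` from `main` prints the message from the question, which matches the observation above for the single-binary case.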

I tried to look for other examples, but they all mention and use the "destroy" function. There was no example that simply used a pure virtual destructor, which makes me believe I'm missing something here. So what is it?

The background of not wanting to use a destroy function is that I want to hold the allocated object in a shared_ptr and not care about its lifetime later. Working with a "destroy" function will be tricky, so I need to know whether it's necessary.

"working with a "destroy" function will be tricky" - why? If it's tricky, then you're probably doing something wrong. — UKMonkey, Oct 18 '16 at 13:09
Possible duplicate of [Question about pure virtual destructor](http://stackoverflow.com/questions/999340/question-about-pure-virtual-destructor) — StoryTeller - Unslander Monica, Oct 18 '16 at 13:09
Are you sure you want to specifically load a class from a dynamic object? Not just create an instance of this class? `dynamic_memory` in c++ and `dlopen` are not related to each other (just wanna make sure I understand what you want to do) – Hayt, Oct 18 '16 at 13:10
@Hayt "dynamic_memory in c++ and dlopen are not related to each other" But "memory management" and "shared_ptr" are related, right? — Adrian Colomitchi, Oct 18 '16 at 13:13
There's no concept of "dynamically loading a class". Classes exist at compile time only. You're simply loading a compatible implementation. — xaxxon, Oct 18 '16 at 13:19
@UKMonkey because it will require some restructuring of other things that are not dynamically loaded, but the answer from Hayt below solves the problem I guess — Mystic Odin, Oct 18 '16 at 13:24
@StoryTeller the question is not why do I need an implementation of a pure virtual destructor, it's about why do I need a destroy function if the actual destructor is called and the abstract interface is pure virtual, this is not a duplicate — Mystic Odin, Oct 18 '16 at 13:25
@Hayt I tagged dynamic-loading not dynamic memory, I'm trying to use a class (of course by instantiation) from a shared object, the page section is named "loading classes", I did not think the terminology may be wrong — Mystic Odin, Oct 18 '16 at 13:33
@AdrianColomitchi I think the "destroy" function is fairly related to memory management. — Mystic Odin, Oct 18 '16 at 13:34
I was just asking because in your question you did not mention `dlopen` etc but just in the link you provided. So when the link is dead it will be not useful anymore and people may not click the link and think you just switched the words up. If this is about `dlopen` etc. maybe mention it in the text of your question. — Hayt, Oct 18 '16 at 13:37
Whoever is voting to close this as a duplicate of [Question about pure virtual destructor](https://stackoverflow.com/questions/999340/question-about-pure-virtual-destructor), that's not remotely a duplicate. This touches on virtual destructors, but it's asking why they're insufficient for cross-binary `delete`, not why they're needed in the first place. — ShadowRanger, Oct 18 '16 at 14:03

score 5 · Accepted Answer · edited Oct 18 '16 at 16:45

Read a little further in the same link:

You must provide both a creation and a destruction function; you must not destroy the instances using delete from inside the executable, but always pass it back to the module. This is due to the fact that in C++ the operators new and delete may be overloaded; this would cause a non-matching new and delete to be called, which could cause anything from nothing to memory leaks and segmentation faults. The same is true if different standard libraries are used to link the module and the executable.

The key point is that new and delete may be overloaded, and may therefore do something different in the caller's code than in the shared object's code. If you use delete from inside the binary, it will call the destructor (resolved to the shared object's implementation) and then deallocate the memory, but that deallocation might not match the behavior of the delete operator in the shared object. Maybe new in the shared object did not allocate any memory, in which case you may get a segmentation fault. Maybe new does something more than allocate the memory for that object, in which case not calling the matching delete in the shared object causes a leak. There is also the possibility of different heap handling between the shared object and the binary.
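To make the mismatch concrete, here is a minimal single-file sketch. It is a simulation under stated assumptions, not the linked page's code: the namespace `module` stands in for the shared object, and `live_blocks` is hypothetical bookkeeping owned by the module's allocator. The module allocates with `std::malloc` and constructs with placement new, so only its own `destroy` knows how to undo both steps.

```cpp
#include <cstdlib>
#include <new>

class polygon {
public:
    virtual ~polygon() {}
    virtual double area() const = 0;   // illustrative member
};

// "Module" side: in a real setup this would live in the shared object.
namespace module {

int live_blocks = 0;   // hypothetical bookkeeping of outstanding allocations

class triangle : public polygon {
public:
    double area() const override { return 0.5; }
};

// Allocation and construction are deliberately separate steps.
polygon* create() {
    void* raw = std::malloc(sizeof(triangle));
    if (!raw) throw std::bad_alloc();
    ++live_blocks;
    return new (raw) triangle;   // placement new into module-owned memory
}

// The matching release: destructor first, then the module's own deallocation.
void destroy(polygon* p) {
    if (!p) return;
    void* raw = dynamic_cast<void*>(p);   // address of the most derived object
    p->~polygon();                        // step 1: virtual destructor
    std::free(raw);                       // step 2: give memory back to malloc's heap
    --live_blocks;
}

} // namespace module
```

With this module, `delete p` in the executable would run the destructor correctly but then hand `malloc`'d memory to `operator delete`, which is undefined behaviour, and `live_blocks` would never be decremented. Calling `module::destroy(p)` keeps both steps on the side that performed the allocation.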

In any event, shared_ptr can be given a custom deleter fairly easily, using a lambda that calls the module's destroy function. True, it's mildly annoying that shared_ptr can't include the deleter in its template arguments, but you can write a simple wrapper that makes it less verbose to create one with a consistent deleter in all locations (no compiler available right now, forgive any typos):

// destroy_triangle is assumed to be the function pointer obtained via dlsym
std::shared_ptr<triangle> make_shared_triangle(triangle *t) {
    return std::shared_ptr<triangle>(t, [](triangle *p) { destroy_triangle(p); });
}
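If carrying the deleter in the type is preferred, `std::unique_ptr` allows that, and a `unique_ptr` with a custom deleter converts to a `shared_ptr` that keeps the same deleter. The sketch below assumes the `create_t`/`destroy_t` function types used elsewhere on this page. `create_impl`, `destroy_impl`, and `destroy_calls` are hypothetical stand-ins for what `dlsym` would return, so that the example is self-contained.

```cpp
#include <memory>

class polygon {
public:
    virtual ~polygon() {}
    virtual double area() const = 0;   // illustrative member
};

class triangle : public polygon {
public:
    double area() const override { return 0.5; }
};

typedef polygon* create_t();
typedef void destroy_t(polygon*);

// Hypothetical stand-ins for the functions exported by the module.
int destroy_calls = 0;
polygon* create_impl() { return new triangle; }
void destroy_impl(polygon* p) { ++destroy_calls; delete p; }

// In real code these pointers would come from dlsym.
create_t* create_triangle = &create_impl;
destroy_t* destroy_triangle = &destroy_impl;

// The deleter type is part of the smart pointer type.
using polygon_ptr = std::unique_ptr<polygon, destroy_t*>;

polygon_ptr make_unique_triangle() {
    return polygon_ptr(create_triangle(), destroy_triangle);
}
```

A caller that needs shared ownership can write `std::shared_ptr<polygon> sp = make_unique_triangle();` and the deleter travels along with the pointer.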

edited Oct 18 '16 at 16:45

Mystic Odin

answered Oct 18 '16 at 13:08

ShadowRanger

Yes, but how? I mean if `polygon` is compiled and linked in the binary or a shared object, and both the binary and `triangle` shared object use its same declaration/implementation wouldn't the deallocation be done there anyway based on the vtable resolving the `triangle` destructor? – Mystic Odin Oct 18 '16 at 13:18
@MysticOdin: The scenario described is where the executable is explicitly dynamically loading functions to create (and destroy) an object it doesn't have the implementation for, which may have a different `delete`. Consider a binary with one `new`/`delete` pair, and a shared object with another (the SO's we call `newweird`/`deleteweird`). The factory function in the SO calls `newweird`. If you just let a `shared_ptr` from the binary call `delete`, it's not using `deleteweird`; you've got a mismatch (allocated w/`newweird`, deleted w/`delete`) that can corrupt the heap or segfault your program. – ShadowRanger Oct 18 '16 at 13:47
Basically, `delete` does two things: 1. Invoke the destructor for the instance (can be found by vtable; I think this works fine) 2. Release the memory via `operator delete` (can have completely different definition in the shared object, either due to being replaced or because the shared object links a different version of the standard lib). #2 isn't visible to the original binary at all. – ShadowRanger Oct 18 '16 at 13:53
OK, I understand that the deallocation is basically glued to the end of the destructor (in case it has an implementation, otherwise it's actually put in front), and basically what you're saying is that this is wrong and allocation/deallocation could be overloaded and implemented weirdly. Resolving the destructor in the vtable does not resolve the deallocation, but only calls the destructor function, and the deallocation is done through the binary, not the shared object. Is this correct? – Mystic Odin Oct 18 '16 at 14:07
@MysticOdin: Exactly. – ShadowRanger Oct 18 '16 at 14:13
actually no, the deallocation is done as expected in the shared object's code, the thing which I failed to understand till now, is that `new` and `delete` are operators which may be overloaded to do something entirely different than calling the constructor/destructor and allocating/deallocating memory, if I use `delete` from inside the binary it will call the destructor and it will deallocate the memory but that might not be the behavior of `delete` operator in the shared object ... – Mystic Odin Oct 18 '16 at 14:33
... maybe `new` in the shared object did not allocate any memory and therefore I will have a possible segmentation fault, and maybe `new` is doing something more than allocate the memory for that object and by not calling the matching delete in the shared object there is a leak, with your permission I would like to add this clarification to your answer, is it ok? – Mystic Odin Oct 18 '16 at 14:34
@MysticOdin: Sure. Note: Even without unusual weird behaviors, `new`/`delete` could simply be using separate heaps. Even if they use the same logic for deallocation, so the code for one works on the other, if the heap involves explicit locking (rather than atomics), they'd be acquiring the lock for one heap, while manipulating the other, potentially causing major problems in threaded code. – ShadowRanger Oct 18 '16 at 14:40
Do you need the lambda in `make_shared_triangle`? Can't you use the function pointer `&destroy_triangle` ? – MSalters Oct 18 '16 at 15:41
@MSalters: Well, given it's runtime dynamically loaded, `destroy_triangle` is already a function pointer, so the `&` wouldn't be used. But yes, that should work. Usually, lambdas (or functor structs) are encouraged because they can be trivially inlined, while function pointers cannot (so using a lambda means a direct call to the function, passing the function itself means a call through pointer, which is slower), but in this case, the function being called is a function pointer, so after lambda inlining, it's still a call through pointer. Using lambdas is mostly a habit for when it matters. – ShadowRanger Oct 18 '16 at 16:41

Hayt · Answer 2 · 2016-10-18T14:00:27.007

2

If you really want to go by the example you linked to, you can supply a custom function that the smart pointer calls when it should delete its object.

// polygon is the interface type from the question
std::shared_ptr<polygon> object(create_object(), // create pointer
[=](polygon* ptr)
{
    destroy_object(ptr);
});

With this, the lambda is called instead of delete when the shared pointer needs to destroy its object.

Note: the lambda copies the function pointer to the destroy_object function ([=] does this). In the context of dynamic loading, this is valid as long as you don't call dlclose() while the shared pointer is still in use. If you call dlclose() before the last shared pointer is released, the deleter will call into unloaded code and cause errors.
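One way to avoid closing the library too early is to let the deleter own the library handle as well. The following is only a sketch: `create_object`, `destroy_object`, and `close_library` are hypothetical stand-ins for the results of `dlsym` and for `dlclose`, `fake_handle` stands in for the `void*` returned by `dlopen`, and the `events` log exists only to make the order visible.

```cpp
#include <memory>
#include <string>
#include <vector>

class polygon {
public:
    virtual ~polygon() {}
};

class triangle : public polygon {};

// Hypothetical event log used to observe the order of cleanup.
std::vector<std::string> events;

// Stand-ins (assumptions) for the module's functions and for dlclose.
polygon* create_object() { return new triangle; }
void destroy_object(polygon* p) { events.push_back("destroy"); delete p; }
void close_library(void*) { events.push_back("close"); }

std::shared_ptr<polygon> load_triangle() {
    static int fake_handle = 0;   // stands in for the handle returned by dlopen

    // Owns the "library"; close_library runs when the last copy goes away.
    std::shared_ptr<void> lib(&fake_handle, close_library);

    // The deleter holds a copy of lib, so the library outlives the object.
    return std::shared_ptr<polygon>(create_object(), [lib](polygon* p) {
        destroy_object(p);
    });
}
```

When the last `shared_ptr<polygon>` is released, the deleter runs first and the captured handle is released afterwards, so the destroy function is still loaded when it is called. With the real API, each object created this way would need its own `dlopen` call, which relies on `dlopen` reference-counting the library.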

edited Oct 18 '16 at 14:00

answered Oct 18 '16 at 13:16

Hayt

I think you meant `destroy_object(ptr);` – ShadowRanger Oct 18 '16 at 13:19
why would you make a copy of every in-scope element for your deleter lambda? That seems potentially very wasteful.. and invalid if anything visible isn't copyable. – xaxxon Oct 18 '16 at 13:27
@xaxxon yeah this was just me being lazy. I corrected this with just the needed function pointer. – Hayt Oct 18 '16 at 13:34
@xaxxon: In practice, [`[=]` will only capture the stuff it uses](http://stackoverflow.com/a/6181507/364696). Explicit capture might be slightly friendlier to people reading the code, but it's actually more error-prone in some ways; if you name all your captures, and later end up not using some of them, the compiler still has to capture the named captures; if you leave it implicit, the compiler always captures what it needs, no more, no less. – ShadowRanger Oct 18 '16 at 13:58
@ShadowRanger wow did not know that. But makes sense for lambdas. – Hayt Oct 18 '16 at 14:01
@ShadowRanger I strongly disagree with your "safety" comment. If you use = and accidentally shadow a variable name, it will suddenly become a copy. From a correctness perspective, this can be very impactful. It's much easier to have the compiler tell you you are capturing unused variables than it is for it to know if you meant to use something from outside or not when you say you do. – xaxxon Oct 18 '16 at 14:04
@ShadowRanger also, if you check out this example: https://godbolt.org/g/KgRQ3B you can take out the explicit capture of `b` and clearly see that the generated code is identical. – xaxxon Oct 18 '16 at 14:07
@xaxxon: If you actually shadow the name (by declaring a new variable in the lambda with the same name), I don't think it will capture (though I won't swear to it). In any event, I'm not really recommending one or the other; I agree accidental copy that's just overwritten is a problem, and with sufficient warnings enabled, compilers _should_ be able to catch unused capture variables. As for your example, as my link notes, named captures are _supposed_ to be copied unconditionally by the standard; compilers may only be able to optimize it out when they can be sure copying is side-effect free. – ShadowRanger Oct 18 '16 at 14:08
this escalated somehow Oo. I don't really care about which style I put in the answer. Both of them have their arguments for/against them. – Hayt Oct 18 '16 at 14:10
What do you think (question is to everybody) about adding `dlclose(handle);` to the body of the lambda in case this is the only object that would be used from this handle? – Mystic Odin Oct 19 '16 at 09:04
@MysticOdin if it is the only instance of it, yes, it works. If you have multiple instances of the class, you need a handle for each instance then. But that is the purpose of RAII. Let destruction take care of cleaning up. – Hayt Oct 19 '16 at 09:08

Adrian Colomitchi · Answer 3 · 2016-10-18T14:22:42.857

0

The background of not wanting to use a destroy function is that I want to use the allocated object in a shared_ptr and not care later about its lifetime, working with a "destroy" function will be tricky, therefore I need to know if it's necessary.

Then you will need to create your shared_ptr using an explicit Deleter (see form 4 of the constructor. Scroll down to examples).

template< class Y, class Deleter > shared_ptr( Y* ptr, Deleter d );

Something like:

std::shared_ptr<polygon> sh_ptr_val(
                     my_triangle, 
                     [](auto ptr) { destroy_triangle(ptr); }
                   );

[edited to address Hayt's first comment]

struct triangle_factory {
  // create() returns shared_ptr<polygon>, assuming create_t yields a polygon*
  // as in the linked example
  static std::shared_ptr<polygon> create() {
    std::shared_ptr<polygon> ret(
                         create_triangle(),
                         [](auto ptr) { destroy_triangle(ptr); }
                       );
    return ret;
  }
private:
   static create_t* create_triangle;
   static destroy_t* destroy_triangle;
};

// triangle is assumed to be the handle returned by dlopen
create_t* triangle_factory::create_triangle=(create_t*) dlsym(triangle, "create");
destroy_t* triangle_factory::destroy_triangle=(destroy_t*) dlsym(triangle, "destroy");

edited Oct 18 '16 at 14:22

answered Oct 18 '16 at 13:23

Adrian Colomitchi

if `destroy_triangle` is not global though you won't be able to call it. – Hayt Oct 18 '16 at 13:38
@Hayt I suspect triangle creation/packing in shared_ptr will happen inside a factory function/method. It will be good enough if destroy_triangle is known inside that place. – Adrian Colomitchi Oct 18 '16 at 14:04
there is an example linked in the question which shows the actual origin of the functions. `destroy_triangle` is a function pointer acquired by `dlsym(...)` – Hayt Oct 18 '16 at 14:06
@Hayt ...and the same place also creates triangles using a function pointer acquired by dlsym(...). So a `struct triangle_factory { static shared_ptr create() { /* do the stuff */}; private: static create_t* create_triangle; static destroy_t* destroy_triangle; }` should be "global" enough. – Adrian Colomitchi Oct 18 '16 at 14:13
