Using std::weak_ptr after DLL unload

Question

I have a problem with using std::weak_ptr to an object from another DLL after it was unloaded.
Here is the simplified situation: there is an application that loads 2 DLLs. The DLLs provide some objects that inherit a common interface. The application keeps a kind of cache of these objects (a std::vector of std::shared_ptrs), so that the DLLs may use each other's objects indirectly.
When a DLL unloads, it removes its objects from the cache.

But the problem comes if one DLL keeps a weak_ptr to another DLL's object, because when the destructor of that weak_ptr is called, it tries to delete the control block, and that causes an AV error, since the owner DLL is already unloaded.
I thought I might force the control block to be allocated in the application instead of the DLL, but I could not figure out how to do it. I've tried creating an Allocator on the application side and passing it to the shared_ptr constructor, but the control block still keeps some pointers to the DLL's classes which, again, cause AV errors.

The only thing that seems to work is to create raw pointers to objects in the DLLs and then pass them to the application, which then wraps them in shared_ptr. But that leads to some problems with enable_shared_from_this.

So, my question is - what is the best thing to do?

Avoid stl stuff that is known to cause issues inter-library. Pass raw pointers between dlls. — Michael Chourdakis, Nov 05 '21 at 19:20
Yeah, but that won't solve the problem with dangling pointers. All libraries are built at the same time and with the same toolchain, so there is no real problem apart from what I've described. — Yog.Muskrat, Nov 05 '21 at 19:47
Yes, even with the same compiler version there are issues. See also my question https://stackoverflow.com/questions/65382150/run-time-implementation-of-stdfunction — Michael Chourdakis, Nov 05 '21 at 21:29
Does it work if you pass a custom deleter, not just allocator? With what exact traceback do you get a failure - specifically, what unloaded function or object does it try to access? — o11c, Nov 05 '21 at 23:08
That said, there is a reason that most modern libcs decided not to implement `dlclose`. — o11c, Nov 05 '21 at 23:09
Hmm, another potential problem is if a weak symbol gets defined in multiple DLLs ... I have no idea what happens in that case. — o11c, Nov 05 '21 at 23:26
Yeah, I've tried providing custom deleter, but to no success. The problem is - even though the control block memory is allocated on the application side via the custom allocator, it still keeps pointers to DLLs methods, that are no longer available after it is unloaded. — Yog.Muskrat, Nov 06 '21 at 09:12

Yakk - Adam Nevraumont · Accepted Answer · 2021-11-06T06:04:29.417

I have solved variations of this.

In general, you should not keep pointers to data allocated in a dll around past the lifetime of that dll.

But you can solve this specific problem. Basically, replace every call to make_shared, and every creation of a shared_ptr from a raw pointer, in your program. The easiest method may be to ban shared_ptr entirely and write a wrapping subclass that does the redirect.

Then, make a "mysharedptr" DLL. It offers two functions: a DLL-safe make_shared, and DLL-safe shared_ptr creation (from a pointer).

The from-pointer case is easy. The header has a template function, which calls a "create void shared_ptr" function exported from that DLL and passes in a pointer to a deletion function. The DLL that provides the deletion function has to persist long enough; see below.

It then uses the aliasing constructor to return a shared_ptr to T that shares the void pointer's control block.

For the make_shared replacement, write a DLL-safe function that make_shares a buffer of one of various fixed sizes, like powers of two. The buffer also holds a function pointer that destroys the contents. Now in the template, call the fixed-buffer make_shared, then placement-construct in the buffer, then install a pointer to a destruction function (in that order).

Finally, to make the deleter and destruction functions dll safe, use ADL and tag dispatching.

using cleanup=void(*)(void*);
template<class T>struct tag_t{};
cleanup dll_safe_delete_for(tag_t<Bob>);
cleanup dll_safe_destroy_for(tag_t<Bob>);

Those two functions are exported by the DLLs the types come from, and they live in the namespaces of said types. The DLL-safe shared ptr code finds them via tag dispatching.

std::shared_ptr<void> safe_void_shared( void*, void(*)(void*) ); // export from "safe shared ptr util" dll
// API for dll safe shared ptrs:
template<class T>
std::shared_ptr<T> safe_shared( T* t ) {
  auto pvoid = safe_void_shared( t, dll_safe_delete_for(tag_t<T>{}) );
  return std::shared_ptr<T>( std::move(pvoid), t ); // aliasing ctor
}

then inside the safe shared ptr dll:

std::shared_ptr<void> safe_void_shared( void* ptr, void(*dtor)(void*) ){
  return {ptr, dtor};
}

For a type to be supported, its DLL needs to provide this:

namespace FooNS{
  struct some_type {/*blah*/};
  cleanup dll_safe_delete_for(tag_t<some_type>);// exported from dll
}
// in cpp in dll for some_type
cleanup FooNS::dll_safe_delete_for(tag_t<some_type>){
  return [](void* pvoid){if(pvoid) delete static_cast<some_type*>(pvoid);};
}

and done. Users just:

auto ptr=safe_shared<FooNS::some_type>( pSomeType );

and the dtor code lives in the some_type dll, while the control block code lives in the dll safe shared dll (different dlls). So your weak ptr can outlive the some_type dll.
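
To see the from-pointer path end to end, here is a minimal single-process sketch. It assumes everything lives in one binary, so `safe_void_shared` is only a stand-in for the function the utility DLL would export, and `demo::widget` and `demo::deletions` are hypothetical names added purely for illustration. It shows the tag dispatch finding the deleter via ADL and the aliasing constructor reusing the `void` control block; it does not exercise real DLL unloading.

```cpp
#include <cassert>
#include <memory>
#include <utility>

using cleanup = void(*)(void*);
template<class T> struct tag_t {};

// Stand-in for the export of the "safe shared ptr util" DLL: the control
// block is created here, so its code belongs to whoever compiles this function.
std::shared_ptr<void> safe_void_shared(void* ptr, cleanup dtor) {
  assert(dtor != nullptr);
  return std::shared_ptr<void>(ptr, dtor);
}

// Header-only glue: finds the deleter via ADL, then aliases the void control block.
template<class T>
std::shared_ptr<T> safe_shared(T* t) {
  auto pvoid = safe_void_shared(t, dll_safe_delete_for(tag_t<T>{}));
  return std::shared_ptr<T>(std::move(pvoid), t);  // aliasing ctor
}

namespace demo {  // hypothetical type standing in for a class owned by some DLL
  struct widget { int value = 7; };
  int deletions = 0;  // counts deleter calls, for illustration only

  cleanup dll_safe_delete_for(tag_t<widget>) {
    return [](void* p) { ++deletions; delete static_cast<widget*>(p); };
  }
}
```

A caller would write `auto sp = safe_shared(new demo::widget);`, and any `std::weak_ptr<demo::widget>` made from `sp` shares the control block created inside `safe_void_shared`.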

Similarly for make shared support, you have a header file with glue, a void based dll safe function that ensures the control block code is in a safe dll, and some fancy footwork to get a pointer to the destructor out of the class's personal dll.

// exported from dll
// creates an approx bytes sized buffer using make_shared, then emplaces, installs dtor and returns a pointer to the object
std::shared_ptr<void> emplace_shared_ptr( std::size_t bytes, std::function<void*(void*)> ctor, void(* dtor)(void*) );

template<class T, class...Args>
std::shared_ptr<T> make_dll_safe_shared( Args&&...args ){
  auto pvoid = emplace_shared_ptr(
    sizeof(T),
    [&](void* here){ return ::new(here) T(std::forward<Args>(args)...); },
    dll_safe_destroy_for(tag_t<T>{})
  );
  return std::shared_ptr<T>(std::move(pvoid), static_cast<T*>(pvoid.get()) );
}

Now the emplace void is a bit tricky.

template<std::size_t sz>
struct buffer{
  std::array<char, sz> data;
  void(*dtor)(void*)=nullptr; // set only after construction succeeded
  ~buffer(){ if (dtor) dtor(data.data()); }
};
template<std::size_t...mags>
std::shared_ptr<void> emplace_shared_ptr_impl( std::index_sequence<mags...>, std::size_t magnitude, std::function<void*(void*)> ctor, void(* dtor)(void*) ){
  using factory=std::shared_ptr<void>(*)( std::function<void*(void*)>, void(*)(void*) );
  // one factory per power-of-two buffer size; magnitude is the exponent
  static factory factories[]={
    [](std::function<void*(void*)> ctor, void(*dtor)(void*))->std::shared_ptr<void>{
      auto pbuff=std::make_shared<buffer<(std::size_t(1)<<mags)>>();
      void* pvoid=ctor(pbuff->data.data());
      if(!pvoid)return{};
      pbuff->dtor=dtor;
      return std::shared_ptr<void>( std::move(pbuff), pvoid );
    }...
  };
  return factories[magnitude](ctor, dtor);
}
  
std::shared_ptr<void> emplace_shared_ptr( std::size_t bytes, std::function<void*(void*)> ctor, void(* dtor)(void*) ){
  return emplace_shared_ptr_impl(std::make_index_sequence<40>{}, pow_of_2_at_least_as_big_as(bytes), ctor, dtor);
}
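
The helper `pow_of_2_at_least_as_big_as` is used above but never defined. One possible reading, given that its result indexes `factories` and entry `m` makes a buffer of `1<<m` bytes, is that it returns the exponent rather than the size. A minimal sketch under that assumption (the body is a guess, not the answer's code):

```cpp
#include <cstddef>
#include <limits>

// Assumed behavior: smallest exponent m such that (1 << m) >= bytes.
// The caller is still expected to check that m is smaller than the number
// of factories before indexing the table.
inline std::size_t pow_of_2_at_least_as_big_as(std::size_t bytes) {
  const std::size_t max_shift = std::numeric_limits<std::size_t>::digits - 1;
  std::size_t magnitude = 0;
  // Stop before the shift count could reach the width of size_t.
  while (magnitude < max_shift && (std::size_t{1} << magnitude) < bytes) {
    ++magnitude;
  }
  return magnitude;
}
```

With this reading, a 24-byte object lands in the 32-byte buffer (exponent 5), so at most about half of each buffer is wasted.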

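This certainly contains typos, and it omits efficiency tweaks and error checking, but that is the idea. With 40 factories (exponents 0 through 39), the maximum buffer size for the make code is 1<<39 bytes, about half a terabyte.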

The dtor/delete code lives in the dll of the type. It is fetched via the dll_safe_delete_for and dll_safe_destroy_for functions you are responsible for writing for each type you want to support.

The control block code lives in a different dll; namely, a special one that implements the void shared ptr functions above. It needs to outlive your weak ptrs.

I have used a variation of this. In my case, DLL B was wrapping objects from DLL A in shared ptrs (in template code, so it didn't even know which classes they were); DLL B was unloaded before A, and some shared ptrs that B made outlived it. Boom. The same trick above moved the actual shared ptr creation back into A. This case just has to do the trick twice, as we need the dtor to live in A, and the control block code to outlive A.

@einp sample code added. It certainly has typos. I advise getting the `safe_shared` working and understood before you even try reading `emplace_safe_shared` one. — Yakk - Adam Nevraumont, Nov 06 '21 at 05:57
Thank you, Adam, for such a detailed answer! Am I correct that `shared_ptr`'s aliasing ctor is the cornerstone here? That's a great idea (and the first time I understand the real use of this feature). — Yog.Muskrat, Nov 06 '21 at 09:23
@yog.musk Yes. The aliasing ctor reuses a control block. So we ensure that the control block is created in a specific DLL, which means a weak ptr destruction can occur so long as that specific DLL exists. The tag dispatching part allows a nice typesafe header only API and means supported types require code only local to them. If you want to support a type whose namespace you cannot invade, use the namespace of the dll safe shared ptr system. Otherwise, just put the dll safe free functions in the namespace of the type you want to support. — Yakk - Adam Nevraumont, Nov 06 '21 at 17:35

einpoklum · Answer 2 · 2021-11-05T23:30:21.490

(I can't quite understand Adam's suggestion; it's possible his solution is great... anyway, here is what I think:)

You are mixing up incompatible destruction schemes.

Pointers to functions in a DLL become invalid when you unload the DLL. That means it is senseless to put those pointers in an std::shared_ptr: Even if nobody is "using" a pointer to some function in the DLL, it might still be in use via other functions. Plus, the decision of when to unload it may be more complex than "nobody is currently holding pointers to functions from it". And nota bene: It doesn't matter what code you use for a shared pointer deleter - it is simply not the right abstraction.

Since std::shared_ptr is off the table, so is std::weak_ptr - which only points to a pointer managed by an std::shared_ptr.

What you really need, IMHO, is a library for working with DLLs, which allows you to dole out a different kind of weak pointer, one that tracks whether the DLL is still loaded or not.
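
As a rough illustration of that idea, here is a hedged sketch rather than a real library. `module_token` and `module_weak` are hypothetical names. The assumption is that the application creates one token per loaded DLL with `std::make_shared` in application code, so the token's control block belongs to the application, and resets it just before unloading that DLL. The handle tracks only whether the module is loaded, not the lifetime of the individual object, and it is not safe against an unload racing with a call.

```cpp
#include <memory>

// Hypothetical marker, one per loaded DLL, owned by the application.
struct module_token {};

// Hypothetical handle: yields the object only while its module is loaded.
template<class T>
class module_weak {
  T* object_ = nullptr;
  std::weak_ptr<module_token> module_;

public:
  module_weak() = default;
  module_weak(T* object, const std::shared_ptr<module_token>& module)
    : object_(object), module_(module) {}

  // Returns nullptr once the application has dropped the module's token.
  T* get() const { return module_.expired() ? nullptr : object_; }
};
```

Destroying a `module_weak` touches only the token's control block and never anything allocated by the unloaded DLL.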

Yes, the problem is that the control block still keeps pointers to the unloaded DLL's functions. From what I understand, these functions aren't supposed to be called if the object itself is already destroyed (that is, the `shared_ptr` counter is 0 and only a `weak_ptr` exists). But somehow they still are. – Yog.Muskrat Nov 06 '21 at 09:19
Is there a legit way to do that? From what I know, control block's internals are implementation specific and not publicly available. – Yog.Muskrat Nov 06 '21 at 09:49
No, the control block is a feature of how shared/weak ptr is implemented. It is where the reference counts go. @yog.musk the issue is that the dtor code of the control block comes from a template class, so it is generated in the dll where you make the shared ptr. When that dll goes away... – Yakk - Adam Nevraumont Nov 06 '21 at 17:40
@Yog.Muskrat: Ah, ok, now I understand your comment! Well, you won't have any control blocks, since you absolutely should not use `std::shared_ptr`'s for functions. If you do want to do reference counting, you would need to have _all_ of your references use the _same_ reference count for the entire DLL. So, you could have a shared_ptr to a loaded DLL class, and only save function pointers in that; or you could save function pointers elsewhere, but then, the loaded DLL class must have a reference count for all of the function pointers obtained from it. – einpoklum Nov 06 '21 at 17:45
