1

I use boost::intrusive_ptr a lot to keep instances of certain classes alive. At some point my program expects all boost::intrusive_ptrs to have been deleted, so that the under laying object is released.

A problem that I seem to frequently run into lately is that not all boost::intrusive_ptr are deleted and said object is still "alive". This is a bug.

I need to be able to detect where those pointers are. Getting a list of the source-file/line numbers of where they were created is good enough (for example, if if the debug output tells me that a pointer is still alive that was created here:

foo.m_ptr = foo::create<Foo>();

where m_ptr is a boost::intrusive_ptr<Foo> and/or boost::intrusive<Foo const> then I can figure out that the pointer that still wasn't destructed is Foo::m_ptr ;).

Of course, in this case, m_ptr was already created before - maybe even assigned. So what is important here is not to track the constructor of boost::intrusive_ptr but rather the place where the reference count is (last) incremented.

The above assignment function looks as follows:

intrusive_ptr<T>& operator=(intrusive_ptr<T> const& rhs)
{
    intrusive_ptr<T>(rhs).swap(*this);
    return *this;
}

Hence, this copy-constructs a temporary from rhs, incrementing the reference count of the object that rhs points to, then swaps that with *this so that the temporary now points to what *this pointed to before and the current object points to what rhs points to. Finally the temporary is destructed causing a decrement on the ref count of what *this pointed to.

Getting the call address from the intrusive_ptr_add_ref would hence require unwinding the stack twice (if not three times), possibly depending on optimization level.

As such I think it cannot be avoided to replace boost::intrusive_ptr with my own class, lets say utils::intrusive_ptr, which would implement the above function as follows:

[[gnu::noinline]] intrusive_ptr<T>& operator=(intrusive_ptr<T> const& rhs)
{
    intrusive_ptr<T>(rhs, __builtin_return_address(0)).swap(*this);
    return *this;
}

which uses [[gnu::noinline]] to be sure that __builtin_return_address(0) returns the return address of the current function. This return address is passed to the constructor to register it (the return address can easily be converted to to call location later on, ie when printing the list of still existing pointers).

At the same time it is required to remove the location that was stored for rhs. It seems to me that the best way to make sure of this is to literally store the location in the intrusive_ptr itself. This location then would also be swapped by swap and thus destructed when leaving this scope.

Registration of existing intrusive_ptrs can be done by their address: every constructor has an address (its this pointer) that will be unique and can be used as a key in a map. Upon destruction this key is simply removed again.

This way, using the map, there is a list of all existing intrusive_ptr objects, which store their own location.

Is my reasoning correct here? Or is there another way to do this?

EDIT:

I added a new class that is basically a copy of boost::intrusive_ptr, but stripped of anything boost and that only supports c++17 (and up). I didn't compile it yet, but this should serve as the basis(?). See https://github.com/CarloWood/ai-utils/blob/master/intrusive_ptr.h

Carlo Wood
  • 5,648
  • 2
  • 35
  • 47
  • For debugging purposes, I've used instrumentation to add `this` to a global `std::set` in the constructor (of a tracking class, then added to the struct/class as a member variable), and delete `this` in the destructor. Then check the set to see if they are all deleted when they ought to be. That could work independently and in conjunction with Boost's `intrusive_ptr`. – Eljay May 01 '21 at 19:33
  • This sounds like a system with a lot of complexity, and problems due to the high complexity, and an attempt to solve those problems with even more complexity. Going directly for the interesting technical part is often not the best solution (but very common). My advice is this: Try instead to take a step back and consider if the overall design could be simpler, in a way where the problems would not occur. E.g. could you restructure it so the amount of places the pointers exists is small enough to be tracked and cleared easily? – olm May 01 '21 at 20:11
  • @olm No. ;) ... Yes the program is extremely complex, or rather, the library that I am writing. The objective of that complexity is to enable the user to do (otherwise) very complex things in a relative simple way. Aka, all complexity is hidden in the library. I can't "do a step back" as this concerns a project that has been going for more than 10 years now. The complexity is intended, too. – Carlo Wood May 01 '21 at 20:20
  • @Eljay Such tracking already exists. The hard to track objects are 'Devices' (file, socket etc) and the library refuses to exit in an unclean manner (unless by assertion); aka, the main thread joins with all threads before exiting. One of the threads it joins with is the EventThread which won't exit until ALL Devices are done (output buffers empty and input devices closed). Now, I know that some device is still alive because that is the very reason that the EventThread doesn't exit and it prints (as debug output) why. But what I do not know is which smart pointer is still pointing to it. – Carlo Wood May 01 '21 at 20:26
  • @Eljay That is, I know the address of the Device base class and the filedescriptor it uses: the reason the EventThread can't exit is because that fd is still being monitored by epoll: it is in a kernel structure thus. It HAS to be removed. I can remove it forcefully (from epoll) at the moment I insist to exit, but that is hardly clean: other threads might still be running that access it. The clean way is to have the user close the device prior to exiting the application. Anyways, this seems off topic. The point is that what you describe lists the objects that are still alive, not the pointers – Carlo Wood May 01 '21 at 20:31
  • that are still pointing to them. Correct me if I am wrong. – Carlo Wood May 01 '21 at 20:31

0 Answers0