Design of (shared_ptr + weak_ptr) compatible with raw pointers

Question

Preamble

In C++11 there is the std::shared_ptr + std::weak_ptr combo. Despite being very useful, it has a nasty issue: you cannot easily construct a shared_ptr from a raw pointer. As a result, such smart pointers usually become "viral": people start to avoid raw pointers and references completely and use shared_ptr and weak_ptr exclusively all over the code, because there is no way to pass a raw reference into a function expecting a smart pointer.

On the other hand, there is boost::intrusive_ptr. It plays the same role as std::shared_ptr and can easily be constructed from a raw pointer, because the reference counter is contained within the object. Unfortunately, there is no weak_ptr companion to it, so there is no way to have non-owning references which you could check for being invalid. In fact, some believe that a weak companion for intrusive_ptr is impossible.
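To make the contrast concrete, here is a minimal sketch of the intrusive idea. It does not use Boost: `Counted`, `IntrusivePtr`, `Session` and `live_sessions_after_demo` are hypothetical names made up for this illustration, and the counter is not thread-safe. Because the count lives inside the object, two smart pointers built independently from the same raw pointer still share one count.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical base class: the reference count is stored in the object itself.
struct Counted {
    std::size_t refs = 0;
};

// Hypothetical, single-threaded intrusive pointer (not boost::intrusive_ptr).
template <class T>
class IntrusivePtr {
    T* p_;
public:
    explicit IntrusivePtr(T* p = nullptr) : p_(p) { if (p_) ++p_->refs; }
    IntrusivePtr(const IntrusivePtr& o) : p_(o.p_) { if (p_) ++p_->refs; }
    IntrusivePtr& operator=(IntrusivePtr o) { std::swap(p_, o.p_); return *this; }
    ~IntrusivePtr() { if (p_ && --p_->refs == 0) delete p_; }
    T* get() const { return p_; }
    T* operator->() const { return p_; }
};

struct Session : Counted {
    static int alive;            // number of live Session objects
    Session() { ++alive; }
    ~Session() { --alive; }
};
int Session::alive = 0;

// Builds two owners from the same raw pointer and reports how many
// Session objects are still alive once both owners are gone.
int live_sessions_after_demo() {
    Session* raw = new Session;
    {
        IntrusivePtr<Session> a(raw);
        IntrusivePtr<Session> b(raw);  // second owner, same embedded counter
        assert(raw->refs == 2);
    }
    return Session::alive;
}
```

Doing the same with two independent `std::shared_ptr<Session>(raw)` constructions would create two control blocks and delete the object twice.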

Now, there is std::enable_shared_from_this, which embeds a weak_ptr directly into your class, so that you can construct a shared_ptr from a pointer to the object. But there is a small limitation (at least one shared_ptr must already exist), and it still does not allow the obvious syntax: std::shared_ptr(pObject).
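A short sketch of what std::enable_shared_from_this does and does not give you. `Node` and `owners_after_recovering` are names invented for the example. The owner is recovered through a member function call, not through a `shared_ptr` constructor, and only because a `shared_ptr` already owns the object.

```cpp
#include <memory>

// Illustrative type; the name is an assumption made for this sketch.
struct Node : std::enable_shared_from_this<Node> {
    int value = 0;
};

// Recovers a second owner from a raw pointer and returns the owner count.
long owners_after_recovering() {
    std::shared_ptr<Node> owner = std::make_shared<Node>();
    Node* raw = owner.get();

    // Works only because `owner` already manages *raw. Without an existing
    // shared_ptr this call throws std::bad_weak_ptr since C++17 and is
    // undefined behavior before that.
    std::shared_ptr<Node> again = raw->shared_from_this();

    return owner.use_count();
}
```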

There is also std::make_shared, which allocates the reference counters and the user's object in a single memory chunk. This is very close to the concept of intrusive_ptr, but the user's object can be destroyed independently of the reference counting block. This concept has an inevitable drawback, though: the whole memory block (which can be large) is deallocated only when all weak_ptr-s are gone.
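The deferred deallocation can be observed with a counting allocator. This is only a sketch: it uses std::allocate_shared as a stand-in for std::make_shared (same single-chunk layout, but the allocator is visible), and `CountingAlloc`, `g_live_blocks`, `Big`, `Report` and `observe_block_lifetime` are all names invented here. The single-allocation behavior is what mainstream standard libraries do; the standard only recommends it.

```cpp
#include <cstddef>
#include <memory>
#include <new>

static int g_live_blocks = 0;  // chunks allocated and not yet freed

// Minimal C++11 allocator that counts outstanding allocations.
template <class T>
struct CountingAlloc {
    using value_type = T;
    CountingAlloc() = default;
    template <class U> CountingAlloc(const CountingAlloc<U>&) {}
    T* allocate(std::size_t n) {
        ++g_live_blocks;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) {
        --g_live_blocks;
        ::operator delete(p);
    }
};
template <class T, class U>
bool operator==(const CountingAlloc<T>&, const CountingAlloc<U>&) { return true; }
template <class T, class U>
bool operator!=(const CountingAlloc<T>&, const CountingAlloc<U>&) { return false; }

struct Big {
    static int alive;
    char payload[4096];
    Big() { ++alive; }
    ~Big() { --alive; }
};
int Big::alive = 0;

struct Report {
    bool object_destroyed;
    int blocks_while_weak_alive;
    int blocks_after_weak_reset;
};

Report observe_block_lifetime() {
    std::weak_ptr<Big> weak;
    {
        std::shared_ptr<Big> owner =
            std::allocate_shared<Big>(CountingAlloc<Big>());
        weak = owner;
    }  // last shared_ptr gone: ~Big() runs here
    Report r;
    r.object_destroyed = (Big::alive == 0);
    r.blocks_while_weak_alive = g_live_blocks;  // chunk still held by `weak`
    weak.reset();
    r.blocks_after_weak_reset = g_live_blocks;
    return r;
}
```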

Question

The main question is: how to create a pair of shared_ptr/weak_ptr, which would have the benefits of both std::shared_ptr/std::weak_ptr and boost::intrusive_ptr?

In particular:

1. shared_ptr models shared ownership over the object, i.e. the object is destroyed exactly when the last shared_ptr pointing to it is destroyed.
2. weak_ptr does not model ownership over the object, and it can be used to solve the circular dependency problem.
3. weak_ptr can be checked for being valid: it is valid when there exists a shared_ptr pointing to the object.
4. shared_ptr can be constructed from a valid weak_ptr.
5. weak_ptr can be constructed from a valid raw pointer to the object. A raw pointer is valid if there exists at least one weak_ptr still pointing to that object. Constructing a weak_ptr from an invalid pointer results in undefined behavior.
6. The whole smart pointer system should be cast-friendly, like the abovementioned existing systems.

It is OK for the solution to be intrusive, i.e. to ask the user to inherit once from a given base class. Holding the object's memory when the object is already destroyed is also OK. Thread safety is very good to have (unless it is too inefficient), but solutions without it are also interesting. It is OK to allocate several chunks of memory per object, though having one memory chunk per object is preferred.
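For reference, this is the circular dependency case from the requirements, written with the existing standard pointers. `Parent`, `Child` and `live_objects_after_scope` are illustrative names. The back link is a weak_ptr, so the pair is released when the outside owner goes away; any new design would have to preserve this behavior.

```cpp
#include <memory>

struct Child;

struct Parent {
    static int alive;
    std::shared_ptr<Child> child;   // owning link
    Parent() { ++alive; }
    ~Parent() { --alive; }
};

struct Child {
    static int alive;
    std::weak_ptr<Parent> parent;   // non-owning back link
    Child() { ++alive; }
    ~Child() { --alive; }
};

int Parent::alive = 0;
int Child::alive = 0;

// Returns how many Parent/Child objects survive after the only outside
// owner has gone out of scope.
int live_objects_after_scope() {
    {
        std::shared_ptr<Parent> p = std::make_shared<Parent>();
        p->child = std::make_shared<Child>();
        p->child->parent = p;       // would leak if this were a shared_ptr
    }
    return Parent::alive + Child::alive;
}
```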

http://en.cppreference.com/w/cpp/memory/shared_ptr/shared_ptr shows many constructors based on raw pointer, the link just points to a gotcha that making more shared_ptr's out of one raw pointer will have many different instances of shared_ptr, all wanting to delete the object when needed. You have to make one shared_ptr and then copy it. It may be a nuisance, but it doesn't make the concept unusable. — stefaanv, Dec 21 '16 at 12:19
@stefaanv: Without a doubt, the `std::shared_ptr` + `std::weak_ptr` are very usable, and I have used them successfully myself. However, this question is intended to call for a better design, which does *not* have this "nuisance". — stgatilov, Dec 21 '16 at 12:25
What's the difference in virality between an intrusive pointer target and a class derived from enable_shared_from_this? they both require knowledge of their lifetime management mechanisms. — Richard Hodges, Dec 21 '16 at 12:29
@stgatilov: I understood it, but I wanted to clarify, because reading the question without the link made it seem that this could not be done or caused a much greater problem. Indeed, it would be nice not to run into this problem. However, when working with shared_ptr, the shared_ptr should be created where lifetime management is set up, so immediately and not at some later point when a raw pointer is already in use. The raw pointer can be retrieved from the shared pointer when needed for passing to functions (not for storing or passing to threads). — stefaanv, Dec 21 '16 at 12:32
If you follow guideline from [gotw-91-solution-smart-pointer-parameters/](https://herbsutter.com/2013/06/05/gotw-91-solution-smart-pointer-parameters/), the virality of smart pointer is limited. — Jarod42, Dec 21 '16 at 12:46
@Jarod42: Having read the link, I must agree that `std::shared_ptr` becomes viral mostly because people make it such. In most cases accepting `const shared_ptr&` in function arguments should work. Raw reference could also work, but it gets viral in its own way: you can never get `shared_ptr` from it again. — stgatilov, Dec 21 '16 at 13:19
"_it can be used to solve the circular dependency problem._" how? — curiousguy, Dec 25 '16 at 01:59
@curiousguy: With a cycle of shared_ptr pointing to each other, memory leak happens when all pointers pointing from outside are destroyed. I meant that if some of the pointers in cycle are weak_ptr, such a problem must not occur (just as with `std::weak_ptr`). — stgatilov, Dec 25 '16 at 05:39
@stgatilov If you have a cycle, well you have a cycle. If you remove a link, you don't have a cycle. So your suggestion is not having a cycle. But weak_ptr doesn't help. It's the not having a cycle that helps. — curiousguy, Dec 26 '16 at 22:07

score 3 · Answer 1 · answered Dec 21 '16 at 12:41

Points 1-4 and 6 are already modelled by shared_ptr/weak_ptr.
Point 5 makes no sense. If lifetime is shared, then there is no valid object if a weak_ptr exists but a shared_ptr does not. Any raw pointer would be an invalid pointer. The lifetime of the object has ended. The object is no more.

A weak_ptr does not keep the object alive, it keeps the control block alive. A shared_ptr keeps both the control block and the controlled object alive.
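A small sketch of that distinction, with `Tracked`, `Observation` and `observe_after_last_owner_gone` as made-up names: once the last shared_ptr is gone the object is destroyed, and the surviving weak_ptr can only report that fact.

```cpp
#include <memory>

struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

struct Observation {
    bool object_destroyed;
    bool weak_expired;
    bool lock_is_null;
};

Observation observe_after_last_owner_gone() {
    std::shared_ptr<Tracked> owner(new Tracked);
    std::weak_ptr<Tracked> watcher = owner;  // shares the control block only

    owner.reset();  // last owner gone: ~Tracked() runs now

    // The control block is still alive (it backs `watcher`), the object is not.
    return Observation{Tracked::alive == 0,
                       watcher.expired(),
                       watcher.lock() == nullptr};
}
```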

If you don't want to "waste" memory by combining the control block with the controlled object, don't call make_shared.

If you don't want shared_ptr<X> to be passed virally into functions, don't pass it. Pass a reference or const reference to the X. You only need to mention shared_ptr in the argument list if you intend on managing the lifetime in the function. If you simply want to perform operations on what the shared_ptr is pointing at, pass *p or *p.get() and accept a [const] reference.
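As a sketch of that guideline (all names here, `Widget`, `touch`, `name_length` and `Cache`, are invented for the example): functions that only use the object take a reference, and only the one that stores the object mentions shared_ptr.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

struct Widget {
    std::string name;
    int uses;
    explicit Widget(std::string n) : name(std::move(n)), uses(0) {}
};

// These only operate on the object, so they do not mention shared_ptr.
void touch(Widget& w) { ++w.uses; }
std::size_t name_length(const Widget& w) { return w.name.size(); }

// This one takes part in lifetime management, so it accepts a shared_ptr.
class Cache {
    std::shared_ptr<Widget> kept_;
public:
    void keep(std::shared_ptr<Widget> w) { kept_ = std::move(w); }
};
```

A caller holding `std::shared_ptr<Widget> p` writes `touch(*p)` and `name_length(*p)`, and passes `p` itself only to `Cache::keep`.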

answered Dec 21 '16 at 12:41

Richard Hodges

If control block is placed intrusively into the object itself (or is allocated nearby), then it should be possible to get its address from the object pointer. Then it would be possible to construct weak_ptr from raw pointer despite the fact that object is already destroyed. – stgatilov Dec 21 '16 at 12:46
@stgatilov I see where you're going, but this would still not satisfy 5. Because if the control block is part of the object, it will cease to exist when the object ceases to exist. There will be no possibility of a weak_ptr. The existing shared_ptr/weak_ptr represents the state of the art. It has been proven through many years as part of the boost library. These arguments of intrusive/non-intrusive come up from time to time. We're still using shared_ptr because it works and no better way has been found. – Richard Hodges Dec 21 '16 at 13:39
Well, I'm not a language lawyer indeed. When you destroy the reference counters object, its fields of primitive type are not altered. So even if the reference counters are the part of the object, they can survive object destruction and stay in the still-owned memory chunk. Anyway, if C++ standard explicitly forbids such things, then it is possible to allocate additional space like `std::make_shared` does. Perhaps I should have phrased the question like "intrusive_ptr with weak_ptr" from the very beginning... – stgatilov Dec 21 '16 at 14:02
@stgatilov When you destroy an object, the subobjects are destroyed semantically, even if they have trivial type and the storage is still there. So you wouldn't be able to access the subobjects after destruction. And you can't have an object residing in the middle of another object. C++ isn't really a low level language. – curiousguy Jan 18 '19 at 20:11
Although you might think that you can get away with semantic UB with putting enough `volatile` 1) it's ugly 2) compilers sadly don't put `volatile` variables in registers so it's inefficient 3) you can't even discuss that here, the mods don't accept that use of `volatile` – curiousguy Jan 18 '19 at 20:21

score 1 · Answer 2 · answered Dec 21 '16 at 14:07

Overload operator new for the class so that it allocates a control block immediately before the instance of the object.

This is pseudo-intrusive. Conversion to and from a raw pointer is possible because of the known offset. The object can be destroyed without a problem.

The reference counting block holds a strong and weak count, and a function object to destroy the object.
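A minimal sketch of this layout, under assumptions of my own: `ControlBlock`, `kHeaderSize`, `Managed`, `control_block_of` and `Node` are hypothetical names, the counters are not thread-safe, and the lookup is only valid for a pointer to the start of the allocated object. In a real smart pointer the destructor call and the release of the memory would be two separate steps driven by the strong and weak counts; here a plain `delete` does both.

```cpp
#include <cstddef>
#include <new>

// Hypothetical control block stored in front of every managed object.
struct ControlBlock {
    std::size_t strong = 0;
    std::size_t weak = 0;
    void (*destroy)(void*) = nullptr;  // how to run the object's destructor
};

// Header size rounded up so the object behind it keeps maximal alignment.
constexpr std::size_t kHeaderSize =
    (sizeof(ControlBlock) + alignof(std::max_align_t) - 1) /
    alignof(std::max_align_t) * alignof(std::max_align_t);

// Base class whose operator new reserves room for the header.
struct Managed {
    static void* operator new(std::size_t size) {
        char* raw = static_cast<char*>(::operator new(kHeaderSize + size));
        ::new (raw) ControlBlock();    // header lives at the chunk start
        return raw + kHeaderSize;      // object lives right after it
    }
    static void operator delete(void* p) {
        char* raw = static_cast<char*>(p) - kHeaderSize;
        reinterpret_cast<ControlBlock*>(raw)->~ControlBlock();
        ::operator delete(raw);
    }
};

// Known offset: step back from the object start to reach the header.
inline ControlBlock* control_block_of(void* object_start) {
    return reinterpret_cast<ControlBlock*>(
        static_cast<char*>(object_start) - kHeaderSize);
}

struct Node : Managed {
    int value = 0;
};
```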

Downside: it doesn't work polymorphically very well.

Imagine we have:

struct A {int x;};
struct B {int y;};
struct C:B,A {int z;};

then we allocate a C this way.

C* c = new C{};

and store it in an A*:

A* a = c;

We then pass this to a smart-pointer-to-A. It expects the control block to be immediately before the address a points to, but because B exists before A in the inheritance graph of C, there is an instance of B there instead.
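The address shift is easy to observe. The sketch below repeats the three structs and adds two helper functions of my own (`offset_of_A_in` and `round_trip_ok`). The exact offset depends on the ABI, but with a non-empty `B` listed first the `A` subobject cannot start at the same address as the `C` object on the usual implementations.

```cpp
#include <cstddef>

struct A { int x; };
struct B { int y; };
struct C : B, A { int z; };

// Byte distance between the A subobject and the start of the C object.
std::ptrdiff_t offset_of_A_in(C* c) {
    A* a = c;  // implicit derived-to-base conversion adjusts the address
    return reinterpret_cast<char*>(a) - reinterpret_cast<char*>(c);
}

// static_cast knows the layout and undoes the adjustment; a smart pointer
// that only sees an A* and assumes "header is right before me" does not.
bool round_trip_ok(C* c) {
    A* a = c;
    return static_cast<C*>(a) == c;
}
```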

That seems less than ideal.

So we cheat. We again replace new, but this time it registers the pointer value and size with a registry somewhere. There we store the weak/strong pointer counts (etc).

We rely on a linear address space and class layout. When we have a pointer p, we simply look up which registered address range contains it. Then we know the strong/weak counts.
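A rough sketch of such a registry, assuming a flat address space. `Entry` and `RangeRegistry` are hypothetical names, the container is not thread-safe, and addresses are compared as `std::uintptr_t` integers, whose mapping from pointers is implementation-defined. The hook from operator new into `add` is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

struct A { int x; };
struct B { int y; };
struct C : B, A { int z; };

// Counts kept outside the object, keyed by the allocation's address range.
struct Entry {
    std::size_t size;
    std::size_t strong;
    std::size_t weak;
};

class RangeRegistry {
    std::map<std::uintptr_t, Entry> blocks_;  // key: start address
public:
    void add(const void* start, std::size_t size) {
        blocks_[reinterpret_cast<std::uintptr_t>(start)] = Entry{size, 0, 0};
    }
    void remove(const void* start) {
        blocks_.erase(reinterpret_cast<std::uintptr_t>(start));
    }
    // Finds the entry whose [start, start + size) range contains p.
    Entry* find(const void* p) {
        std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
        auto it = blocks_.upper_bound(addr);  // first start > addr
        if (it == blocks_.begin()) return nullptr;
        --it;                                 // last start <= addr
        if (addr - it->first < it->second.size) return &it->second;
        return nullptr;
    }
};
```

With this, an `A*` pointing into the middle of a registered `C` resolves to the same entry as the `C*` itself, at the price of a logarithmic lookup (and, in a real program, a lock) on every smart pointer operation.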

This one has horrible performance in general, especially when multi-threaded, and it relies on undefined or unspecified behavior (relational comparison of pointers that do not point into the same object, or the order std::less imposes in such cases).

+1 for actually giving answer to the original question. Overloading `operator new` only allows to use `new T()` syntax instead of `make_shared()`. As for the last idea of storing address range of every class, this is a complete nonsense of course =) — stgatilov, Dec 21 '16 at 16:39
The problem of inheritance is in some way inherent in the idea of intrusive smart pointer. If you make sure that any managed class T inherits once from some empty Base class, then you can most likely statically convert your pointer to `Base*` to get the address of where the object actually starts. — stgatilov, Dec 21 '16 at 16:39
@stgatilov It has to inherit from `Base` as the *first* type (top left) in its hierarchy. And the ABI cannot place anything before that pointer value (like the vtable). It is quite fragile. Storing ranges is, in a sense, less nonsense; you control allocation. If memory is linear and flat (with `char*`s), you are good. — Yakk - Adam Nevraumont, Dec 21 '16 at 19:04

score 0 · Answer 3 · edited May 23 '17 at 11:53

In theory, it is possible to implement an intrusive version of shared_ptr and weak_ptr, but it might be unsafe due to C++ language limitations.

Two reference counters (strong and weak) are stored in the base class RefCounters of the managed object. Any smart pointer (either shared or weak) contains a single pointer to the managed object. Shared pointers own the object itself, and shared + weak pointers together own the memory block of the object. So when the last shared pointer is gone, the object is destroyed, but its memory block remains alive as long as there is at least one weak pointer to it. Casting pointers works as expected, provided that all the involved types inherit from the RefCounted class.

Unfortunately, in C++ it is usually forbidden to work with members of an object after the object is destroyed, although most implementations should allow doing that without problems. More details about the legality of the approach can be found in this question.

Here is the base class required for the smart pointers to work:

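#include <cstddef>  // std::size_t

// Plain counters; they share the memory chunk of the managed object.
struct RefCounters {
    std::size_t strong_cnt = 0;  // number of shared pointers
    std::size_t weak_cnt = 0;    // number of weak pointers
};
// Base class that every managed type must inherit from.
struct RefCounted : public RefCounters {
    virtual ~RefCounted() {}
};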

Here is a part of shared pointer definition (shows how object is destroyed and memory chunk is deallocated):

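#include <type_traits>  // std::is_base_of

template<class T> class SharedPtr {
    static_assert(std::is_base_of<RefCounted, T>::value,
                  "T must inherit from RefCounted");
    T *ptr;

    // The counters are a base subobject of the managed object itself.
    RefCounters *Counter() const {
        RefCounters *base = ptr;
        return base;
    }
    // Ends the object's lifetime but keeps its memory chunk allocated.
    void DestroyObject() {
        ptr->~T();
    }
    // Releases the raw memory; assumes the RefCounted subobject sits at
    // the start of the allocation.
    void DeallocateMemory() {
        RefCounted *base = ptr;
        operator delete(base);
    }

public:
    ~SharedPtr() {
        if (ptr) {
            if (--Counter()->strong_cnt == 0) {
                DestroyObject();
                // Reading the counters after ~T() has run is the formally
                // questionable step mentioned above.
                if (Counter()->weak_cnt == 0)
                    DeallocateMemory();
            }
        }
    }
    ...
};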

Full code with sample is available here.
