
I recently hit an issue where neither unique_ptr nor shared_ptr seemed like the right solution. So, I am considering inventing another kind of smart ptr (described below), but I thought to myself "surely I am not the first to want this."

So my high-level questions are:

  • Does the below design make sense?
  • Is there some way to accomplish this with existing smart ptrs (or other std:: features)? Perhaps I am missing something.

Requirements:

  • I want single ownership, much like unique_ptr
    • That is: only when the single owning pointer dies, should the underlying object be freed (unlike shared_ptr's behavior).
  • I want some additional way to reference the object which is "aware" when the object gets deleted. So, something like weak_ptr, but to be used with a single ownership model.
  • I do not need thread safety

Motivating example:

Suppose I am iterating a list of interface pointers, calling methods on them. Some of those methods may result in items later in the list being deleted.

With plain pointers, I would get dangling references for those deleted items.

Proposed design:

Let's call the owning pointer my_ptr and the non-owning reference my_weak_ptr.

For a given object, we might have a diagram like this:

                             _______
my_ptr<Obj> owner ---------> |Obj* | -------> [Obj data ... ]
                      +----> |count|
                      | +--> |_____|
my_weak_ptr<Obj> A ---+ |
                        |
my_weak_ptr<Obj> B -----+

my_ptr would have an interface largely identical to unique_ptr. Internally, it would store a pointer to a "control block", which is really just the "real" pointer plus a refcount for the control block itself. On destruction, my_ptr would delete the object, set the Obj* inside the control block to NULL, and decrement the refcount (deleting the control block if the count reaches zero).

my_weak_ptr would be copyable, and have some get() method which would return the real Obj*. The user would be responsible for checking this for NULL before using it. On destruction, my_weak_ptr would decrement the count (and delete the control block, if appropriate).

The downside is doing two hops through memory for each access. For my_ptr, this could be mitigated by storing the true Obj* internally as well, but the my_weak_ptr references will always have to pay that double-hop cost.
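A minimal sketch of the design above, assuming C++14 and no thread safety. The names `my_ptr` and `my_weak_ptr` come from the description; `control_block`, `release`, `Widget`, and `demo_observer_sees_deletion` are hypothetical helpers introduced only for illustration. Here the owner does not cache the `Obj*`, so both handles pay the double hop.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Sketch only: control_block, release() and Widget are hypothetical helpers.
// Not thread safe by design.
template <typename T>
struct control_block {
    T* object;         // set to nullptr once the owner destroys the object
    std::size_t refs;  // owner + weak handles still pointing at this block
};

template <typename T>
void release(control_block<T>* cb) {
    if (cb && --cb->refs == 0) delete cb;
}

template <typename T> class my_weak_ptr;

template <typename T>
class my_ptr {
public:
    my_ptr() = default;
    explicit my_ptr(T* p) {
        if (!p) return;
        std::unique_ptr<T> guard(p);  // do not leak p if allocating the block throws
        cb_ = new control_block<T>{p, 1};
        guard.release();
    }
    my_ptr(const my_ptr&) = delete;             // single ownership: no copies
    my_ptr& operator=(const my_ptr&) = delete;
    my_ptr(my_ptr&& o) noexcept : cb_(o.cb_) { o.cb_ = nullptr; }
    my_ptr& operator=(my_ptr&& o) noexcept {
        if (this != &o) { reset(); cb_ = o.cb_; o.cb_ = nullptr; }
        return *this;
    }
    ~my_ptr() { reset(); }

    void reset() {
        if (!cb_) return;
        control_block<T>* cb = cb_;
        cb_ = nullptr;
        T* obj = cb->object;
        cb->object = nullptr;  // observers read null from here on
        delete obj;
        release(cb);           // frees the block if no weak handles remain
    }
    T* get() const { return cb_ ? cb_->object : nullptr; }
    T& operator*() const { assert(get()); return *get(); }
    T* operator->() const { assert(get()); return get(); }
    explicit operator bool() const { return get() != nullptr; }

private:
    friend class my_weak_ptr<T>;
    control_block<T>* cb_ = nullptr;
};

template <typename T>
class my_weak_ptr {
public:
    my_weak_ptr() = default;
    my_weak_ptr(const my_ptr<T>& owner) : cb_(owner.cb_) { acquire(); }
    my_weak_ptr(const my_weak_ptr& o) : cb_(o.cb_) { acquire(); }
    my_weak_ptr& operator=(my_weak_ptr o) { std::swap(cb_, o.cb_); return *this; }
    ~my_weak_ptr() { release(cb_); }
    T* get() const { return cb_ ? cb_->object : nullptr; }  // may be null: check it

private:
    void acquire() { if (cb_) ++cb_->refs; }
    control_block<T>* cb_ = nullptr;
};

struct Widget { int value; };

// Returns true if both observers saw the object alive, then saw null after reset().
inline bool demo_observer_sees_deletion() {
    my_ptr<Widget> owner(new Widget{42});
    my_weak_ptr<Widget> a(owner);
    my_weak_ptr<Widget> b = a;
    bool alive_before = a.get() != nullptr && b.get()->value == 42;
    owner.reset();
    return alive_before && a.get() == nullptr && b.get() == nullptr;
}
```

The control block outlives the object for as long as any `my_weak_ptr` still refers to it, which is what lets `get()` return null safely instead of dangling.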


Edit: Some related questions, from links given:

So it seems like there is demand for something like this, but no slam-dunk solutions. If thread safety is needed, shared_ptr and weak_ptr are the right choices, but if not, they add unnecessary overhead.

There is also boost::local_shared_ptr, but it is still a shared ownership model; I'd rather prevent copies of the owning pointer, like unique_ptr.

jwd
  • How would you implement `weak_ptr::lock` for your class? Why not just use `shared_ptr` and `weak_ptr`? What is the benefit of reinventing that wheel? – Werner Henze May 12 '20 at 17:26
  • @WernerHenze If you don't need thread safety, then you don't need `lock`. You just check if the object is alive, then use it before anything else can happen. – Brian Bi May 12 '20 at 17:28
  • @WernerHenze There would be no `lock` (that is: no upgrade to `shared_ptr`). There would just be `get()` (get the raw pointer, which might be null). I would like to avoid shared_ptr because I do not want shared ownership. I want exactly one owner to be able to control the lifetime. Also, I want a lot of these, and there is a little extra overhead to `shared_ptr`. – jwd May 12 '20 at 17:29
  • Could you not just use normal pointers to `my_ptr` and query it to see if it still points to anything before using it? – Galik May 12 '20 at 17:37
  • the `get()` is pretty bad because c++ is concurrent, and right after you do `get()` the object may be destroyed. You shouldn't access a pointer you don't own (even temporarily) in c++ – pqnet May 12 '20 at 17:38
  • @pqnet: re "C++ is concurrent" — I explicitly don't need thread safety, as I noted in the question. C++ doesn't magically do concurrency everywhere, you choose when to use it and when not to. – jwd May 12 '20 at 17:41
  • @Galik: `my_ptr` can be destroyed while there are still outstanding `my_weak_ptr`s. That's the case I was trying to get at in the motivating example. If I understand you correctly, a raw ptr to `my_ptr` would be a dangling reference in that case. – jwd May 12 '20 at 17:43
  • so basically you want a `strong_ptr` that can't be copied (but can be moved), and a `weak_ptr` that can be copied and when you `lock` it gives you a `fleeting_ptr` that is same as a `strong_ptr` except you can't copy nor move right? – pqnet May 12 '20 at 17:50
  • @jwd Without shared ownership, how can you ever use the pointer `get` returns? The object can disappear at any time since the code that called `get` doesn't share ownership. Only code with some kind of ownership of an object can assume it won't disappear. It seems like you *do* want shared ownership. (Anything another thread can do can, in principle, be done by a function a single thread calls.) – David Schwartz May 12 '20 at 17:51
  • @jwd: I think [this question](https://stackoverflow.com/q/59632415/734069) covers some of this. – Nicol Bolas May 12 '20 at 18:13
  • @DavidSchwartz: I'm not sure what you mean by "disappear at any time". This is synchronous code. I think you might be referring to thread-safety? That is a requirement I explicitly ruled out. – jwd May 12 '20 at 19:01
  • @pqnet: Yes, I think that is about right. Though I had imagined that `fleeting_ptr` would just be a raw ptr that the user needs to (1) check for null and (2) be careful not to keep around. Though I do like the explicit naming and enforcement of that concept that you suggest. – jwd May 12 '20 at 19:04
  • it is hard to ensure that the object is still valid unless your `fleeting_ptr` has a strong reference. The problem, even if it is not multithreaded, may come from a function you call that may destroy the `strong_ptr` as a side effect. I would suggest the implementation for both `strong_ptr` and `fleeting_ptr` to be actually the same as `shared_ptr` except that the static typing limits their ability to be copied and moved (it is problematic though to return an object which can't be copied nor moved...) – pqnet May 12 '20 at 19:30
  • @pqnet re: "it is problematic though to return an object which can't be copied nor moved...": yes, I was bitten by this recently elsewhere (: You can do it in C++17 with guaranteed RVO, but I'm on C++14 for now. Regarding your other point: actually, I *want* the owning pointer to be able to delete exactly when it wants to. I explicitly do *not* want another reference to be able to extend its lifetime, even though it means code around those other references needs to be careful about it disappearing due to a foreign func. call, etc, as you mention. – jwd May 12 '20 at 19:42
  • @jwd Anything another thread could do, a function that you call from your thread could do. If you only call functions that cannot result in the object being destroyed, then so long as your code is running, you are extending its life. That's the definition of shared ownership, which you say you don't want. But you actually *do* want shared ownership. You need a guarantee the object won't go away for as long as you are using the pointer returned by `get`. However you get that guarantee, that is shared ownership. So just use `shared_ptr`. – David Schwartz May 12 '20 at 19:56
  • @DavidSchwartz: I disagree with your conclusion, though I think your point is generally a good one. You wrote "You need a guarantee the object won't go away for as long as you are using the pointer returned by `get`" — that's true, and that guarantee is provided by me understanding my code, and not calling dangerous functions. I do not need to rely on a `shared_ptr`-style mechanism (and the cost it entails) to provide that guarantee. So I do want a conceptual shared ownership, but only in a very limited sense, where it is momentary and highly-controlled (like @pqnet's `fleeting_ptr` idea). – jwd May 12 '20 at 20:50
  • @DavidSchwartz: small follow-up: a more abstract reason I don't like `shared_ptr` for this case is that such a thing generally signifies that multiple bits of code, at various points, might potentially free the underlying object. Because the exact moment of object death is hard to pin down, `shared_ptr` usage is harder to reason about. It is true that I could use a `shared_ptr` as an implementation detail, and document that it should not be copied, but that is asking for trouble in its own way. – jwd May 12 '20 at 20:55
  • @jwd: "*and the cost it entails*" But here's the thing: you're already paying virtually all of those costs anyway. To have weak ownership, you need to have a control block distinct from the object being managed. Just like `shared_ptr`. It has to have a reference count, just like `shared_ptr`. The only costs you save that I can identify are the cost of copying a `shared_ptr` (which you can just choose not to do) and the cost of bumping the reference count upon "locking" a live weak pointer. Everything else is 100% identical to the average `shared_ptr` implementation with a limited interface. – Nicol Bolas May 12 '20 at 21:19
  • @jwd: "*that is asking for trouble in its own way*" But you already seem to have decided to rely on users not doing the wrong thing. You said that, when you reference a weak pointer, you *won't* do anything to cause it to be destroyed. That requires documentation and trust between disparate bits of code. If you can trust them in one respect, why not in another? – Nicol Bolas May 12 '20 at 21:22
  • @NicolBolas: Re: cost: I think weak+shared each have separate ref counts, whereas I only need one? Also, thread safety has a price (though I stumbled upon `boost::local_shared_ptr` which addresses that; added note in the question). Re: trust: Agreed that both approaches require trust/docs; maybe it's shades of grey, but I feel like idiomatic usage of shared_ptr is to do the "wrong" thing in this case (namely: "feel free to copy"), whereas idiomatic usage of a raw pointer is to do the "right" thing (namely: "you better watch out"). – jwd May 12 '20 at 23:38
  • @jwd Your abstract reason makes no sense. We already have accepted that you must know and ensure that the object's lifetime extends past any time you might use it. So how does that make any difference? You say those various bits of code might potentially free the underlying object but how can that possibly be right? We already agreed that you had to make sure that other code ensures the object's life extends through all those other uses regardless of what pointer mechanism you use. Either way, unless your code ensures the object stays alive through all that code, you're screwed. – David Schwartz May 13 '20 at 04:19
  • @DavidSchwartz: Sorry, I was making a more general point about `shared_ptr`, not purely when applied to this specific problem. Generally, if I see a `shared_ptr` in some code I think to myself "it is hard to know when the underlying object might be deleted; I will need to study this system in order to understand the lifetime" (or thereabouts). With `unique_ptr` (or something with similar semantics), it is simpler. So my reasoning is: why use `shared_ptr` when the extra capabilities it provides (1) I don't want and (2) makes code harder to reason about for an outsider? – jwd May 13 '20 at 05:50

2 Answers


There was some good discussion in the comments above, so I'll try to answer my own question and summarize:

First, there is an overall downside to the whole concept: any user of my_weak_ptr needs to be very careful not to call some function which could result in the underlying object being deleted. Or if they do, they need to re-check the weak ptr for nullness. This is an unenforced (and unenforceable) constraint placed on the user, the same as if they were using raw pointers.

That being said: this is not new territory. In subsequent research, I have found various incarnations of such an idea:

  • VISH StrongPtr/WeakPtr
    • A pretty good fit. Note that WeakPtr has no lock() method, and the docs say "Weak pointers become magically null if the referred object is destroyed from elsewhere."
    • It is, however, still copyable, so does not express unique ownership.
  • Loki StrongPtr
    • With appropriate use of the dizzying choice of policies (or maybe a custom one), I think what I described can be accomplished.
  • trackable_ptr
    • A different approach, since your T must be wrapped as trackable<T>, but similar problem being solved.

There are also some "pretty good but not quite ideal" solutions closer to std:

  • Use shared_ptr and weak_ptr.
    • Downsides: thread safety overhead, copyable owner ptr, multiple ref counts.
  • boost::local_shared_ptr, which is compatible with weak_ptr.
    • Downsides: copyable owner ptr, multiple ref counts.

Probably local_shared_ptr is the best out-of-the-box solution, with high quality and few downsides.
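For comparison, here is a rough sketch of the std-only option from the list above, applied to the motivating example from the question. `Handler`, `Registry`, `Killer`, `Noop`, `run_all`, and `demo_skip_deleted` are hypothetical names made up for this illustration. Note that `weak_ptr` only hands out the object through `lock()`, which briefly creates a second owner; that is exactly the behavior the question wanted to avoid, so this shows the trade-off rather than a perfect fit.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical interface standing in for the "list of interface pointers".
struct Handler {
    virtual ~Handler() = default;
    virtual void run() = 0;
};

// The registry is the intended single owner; shared_ptr is used only so that
// weak_ptr can observe the handlers.
struct Registry {
    std::vector<std::shared_ptr<Handler>> owned;

    void remove(const Handler* h) {
        owned.erase(std::remove_if(owned.begin(), owned.end(),
                                   [h](const std::shared_ptr<Handler>& p) {
                                       return p.get() == h;
                                   }),
                    owned.end());
    }
};

struct Noop : Handler {
    void run() override {}
};

// A handler whose run() deletes another handler later in the list.
struct Killer : Handler {
    Registry& reg;
    const Handler* victim;
    Killer(Registry& r, const Handler* v) : reg(r), victim(v) {}
    void run() override { reg.remove(victim); }
};

// Iterates over a snapshot of weak references, so removals made by run()
// neither invalidate the loop nor leave dangling pointers behind.
inline int run_all(Registry& reg) {
    std::vector<std::weak_ptr<Handler>> snapshot(reg.owned.begin(), reg.owned.end());
    int ran = 0;
    for (const auto& weak : snapshot) {
        if (auto h = weak.lock()) {  // empty if the handler was already deleted
            h->run();
            ++ran;
        }
    }
    return ran;
}

// Returns how many handlers actually ran.
inline int demo_skip_deleted() {
    Registry reg;
    auto victim = std::make_shared<Noop>();
    const Handler* raw = victim.get();
    reg.owned.push_back(std::make_shared<Killer>(reg, raw));
    reg.owned.push_back(std::move(victim));  // the registry is now the only owner
    reg.owned.push_back(std::make_shared<Noop>());
    assert(reg.owned.size() == 3);
    return run_all(reg);  // the Killer and the last Noop run; the victim is skipped
}
```

Swapping in `boost::local_shared_ptr` would remove the atomic reference counting but keep the same shape, including the copyable owner.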

However, to really squeeze out the last few bytes, and to disallow copying, a custom solution would be needed.

Aside, more philosophically:

I get the sense, both from discussion here and other reading, that many believe in a binary approach toward ownership: either it is shared (so use shared_ptr, which also gives you shared observation via weak_ptr) or it is unique (so use unique_ptr).

That probably covers a good 90%+ of cases. Yet I want unique ownership with shared observation (that's my phrasing; you might use different words depending on your semantics). Probably too corner-case to be covered by the standard, but I think it seems like a reasonable niche, for resource-constrained systems.

jwd

There is a design that does not allocate an extra block; instead, each pointer object is the size of three pointers. The pointer itself is a node of a doubly linked list, and each weak reference is a node of the same list.

The drawbacks are linear deletion complexity (each reference must be nullified) and the infeasibility of making this efficiently thread safe.

The advantage is fast dereferencing of both shared and weak pointers.
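One possible reading of that design, as a hedged sketch: every handle stores the object pointer plus `prev`/`next` links, and the owner is the head of the list. All names here (`linked_node`, `linked_owner`, `linked_weak`, `Widget`, `demo_reference_linking`) are hypothetical. To keep it short, the owner is neither copyable nor movable and weak handles are created only from the owner; a fuller version would relink nodes on move and copy.

```cpp
#include <cassert>

// Sketch only: each handle is three pointers wide and needs no control block.
template <typename T>
class linked_node {
public:
    T* get() const { return object_; }  // single hop; null once the owner is gone

protected:
    linked_node() = default;
    linked_node(const linked_node&) = delete;
    linked_node& operator=(const linked_node&) = delete;
    ~linked_node() = default;

    // Splice this (currently detached) node into the list right after pos.
    void link_after(linked_node& pos) {
        assert(!prev_ && !next_);
        object_ = pos.object_;
        prev_ = &pos;
        next_ = pos.next_;
        if (next_) next_->prev_ = this;
        pos.next_ = this;
    }

    // Remove this node from the list: O(1).
    void unlink() {
        if (prev_) prev_->next_ = next_;
        if (next_) next_->prev_ = prev_;
        prev_ = next_ = nullptr;
        object_ = nullptr;
    }

    // Null out and detach every node after this one: O(number of weak handles).
    void clear_followers() {
        linked_node* n = next_;
        next_ = nullptr;
        while (n) {
            linked_node* following = n->next_;
            n->object_ = nullptr;
            n->prev_ = n->next_ = nullptr;
            n = following;
        }
    }

    T* object_ = nullptr;
    linked_node* prev_ = nullptr;
    linked_node* next_ = nullptr;
};

template <typename T>
class linked_owner : public linked_node<T> {
public:
    explicit linked_owner(T* p) { this->object_ = p; }
    ~linked_owner() { reset(); }

    void reset() {
        T* obj = this->object_;
        this->object_ = nullptr;
        this->clear_followers();  // the linear-time step mentioned above
        delete obj;
    }
};

template <typename T>
class linked_weak : public linked_node<T> {
public:
    explicit linked_weak(linked_owner<T>& owner) { this->link_after(owner); }
    ~linked_weak() { this->unlink(); }
};

struct Widget { int value; };

// Returns true if the weak handle tracked the object and then became null.
inline bool demo_reference_linking() {
    linked_owner<Widget> owner(new Widget{7});
    linked_weak<Widget> a(owner);
    bool ok = a.get() != nullptr && a.get()->value == 7;
    {
        linked_weak<Widget> b(owner);  // links in, then unlinks at scope exit
        ok = ok && b.get() == a.get();
    }
    owner.reset();                     // walks the list and nulls a
    return ok && a.get() == nullptr && owner.get() == nullptr;
}
```

Because handles point at each other, they must not be relocated with a plain memory copy, which is one reason moves and copies need explicit relinking in this scheme.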

I don't recall where exactly I've seen or heard this idea...

Alex Guteniev
  • Ah, I think this is "reference linking", which I saw in my research, but didn't understand — thank you for explaining it (: ([example link](https://github.com/MIPT-ILab/mipt-mips/wiki/Smart-Pointers-overview)) – jwd May 13 '20 at 06:18