9

I have two std containers. Both of them with pointers to the same data structure. The first containing all the data and the second one containing only some of the same data. Should I use shared_ptr or weak_ptr on the second container?

Firstly, when I read the reference I thought about using unique_ptr on the first collection. My first collection contains all the data and it's the only one that "owns". Meaning that if the data is not there it should be deleted. But then when I try to create the second collection I didn't know what to do. I created a unique pointer but now I need another pointer to the same element destroying uniqueness, but in fact the real owner was not the new pointer. So I understood (I hope I'm not wrong) that the uniqueness is on the way of reaching elements and not (e.g.) the possibility of delete it. So, shared_ptr. I have them on my first collection. But now the second one comes up, I thought to use shared_ptr here too. The ways to access the same data could be both, so the owners are two. But in my case the data is always deleted from the second before. And if I use a weak_ptr , the number of owners will not increase. In both cases the element will be deleted when the first collection wants. At the end I'm using shared_ptr because with weak_ptr I need to lock() every pointer in every line of code, making it less readable. But what should I really use?

Umuril Lyerood
  • 175
  • 1
  • 1
  • 9
  • 5
    `unique_ptr` for the first, `T*` (or something like [`std::experimental::observer_ptr`](http://en.cppreference.com/w/cpp/experimental/observer_ptr)) for the second. – T.C. Aug 03 '15 at 01:39
  • @T.C. : `observer_ptr`... I'm not fundamentally against this but we are making C++ looking like java more and more. I'm not sure we should push this so far. But it's another topic. It would be nice to recenter each fundamental feature into one thing that does it well. In what way an observer is different than a weak ? is it a midway between sheer raw pointer and weak ? is it so useful ? – v.oddou Aug 03 '15 at 01:59
  • @v.oddou a `weak_ptr` requires there to also be a `shared_ptr`, and you can use it to claim shared ownership. An `observer_ptr` on the other hand does not require a `shared_ptr` and you can't use it to claim shared ownership. It is useful to make it clear you are trying to express a _non-owning_ pointer however in my mind a raw pointer should already express a non-owning pointer and `observer_ptr` shouldn't be needed. – Chris Drew Aug 03 '15 at 02:31
  • 1
    @ChrisDrew OK so `observer_ptr` is basically a raw pointer wrapper that acts like a "native documention" ? so no feature like assertion on dead meat dereferencing and stuff ? – v.oddou Aug 03 '15 at 02:35
  • 1
    @v.oddou Yes, "The world's dumbest smart pointer". – Chris Drew Aug 03 '15 at 02:41

4 Answers4

11

It doesn't sound like you need std::shared_ptr because your data is owned in one place.

I would recommend using std::unique_ptr in the owning container and then simply put the raw pointers in the second and subsequent containers.

This works because you will never delete the raw pointers but the data they point to is still managed by a smart pointer and so will be released when you no longer need it.

Despite some bad press, raw pointers are perfectly respectable when used as non owning accessors to data that is owned by some other entity that will delete it at the appropriate time.

Galik
  • 47,303
  • 4
  • 80
  • 117
  • 1
    The only caveat I would add about the use of unique_ptr<> is you can't use forward references like you can with shared_ptr<>. For unique_ptr<>, the deleter is part of the type declaration. – jbruni Aug 03 '15 at 05:15
  • 4
    @jbruni Well you can use a `std::uique_ptr` with an incomplete type as long as you avoid certain operations. The same is true of `std::shared_ptr` only for different operations. http://stackoverflow.com/a/6089065/3807729 – Galik Aug 03 '15 at 05:54
5

You haven't given a critical piece of information, which is whether you can guarantee that the two collections have the same lifetime or not. If you can guarantee that both collections have the same lifetime, then using unique_ptr for the collection that owns everything and raw pointers for the other (as @Galik suggests) is ideal.

If you cannot guarantee that the two lifetimes match, then whether you select shared_ptr's for both or shared_ptr for the first and weak for the second depends on when you want the objects to be destroyed. It sounds like only the first collection is a true owner, so you'd want weak pointers.

However, I'd heavily recommend that you stick with the first approach. It's much cleaner to avoid shared_ptr and weak_ptr. The danger is that if your two collections have different lifetimes, the first collection can be destroyed before the second (just a single misplaced brace), and then when the second collection tries to access, it has dangling pointers. You can of course simply be careful with your variables, but guaranteeing that two independent local variables consistently have the same lifetime is surprisingly easy to mess up.

If you make both collections elements of the same class, you guarantee that they get constructed and destructed at the same time. The exact details of what that class should look like and which code should go where depends on the details of your problem, of course. But even something as simple (albeit bizarre) as making them both the only members (and public) of the same struct is better than using two local variables.

Nir Friedman
  • 17,108
  • 2
  • 44
  • 72
  • 4
    weak_ptr does not result in dangling pointers. In fact it exists precisely to avoid dangling pointers when someone else owns an object you want to refer to. – Marcelo Cantos Aug 03 '15 at 04:33
  • @Nir Friedman The lifetime of the second collection is always shorter than the first one. The second can only have elements that are already present on first. And in the same way, the first can't delete an object that's used on the second because the second should have an object that "doesn't exist" to me. So, in my case, the lifetime of second is "inside" of the first. But I appreciate your answer for further people habving same problema. – Umuril Lyerood Aug 03 '15 at 10:25
  • @Marcelo Cantos I'm aware, I was referring to the first approach, that is unique + raw, the raw pointers can dangle if unique lifetime is shorter. Should I edit to clarify? – Nir Friedman Aug 03 '15 at 12:49
  • @Umuril Lyerood if that's the case then definitely unique + raw is the way to go. – Nir Friedman Aug 03 '15 at 12:50
  • A pointer that points to something that's already been destroyed. – Nir Friedman Aug 04 '15 at 03:55
  • I was referring to unique + raw solution having the potential dangling issue. – Nir Friedman Aug 04 '15 at 13:31
4

What do you mean by "every line of code"? The usual pattern is as follows:

if (auto p = wp.lock()) {
    // p is a shared_ptr; use it as often as you need within the block.
}

If you have an owning container and a referencing container that works correctly if the owner destroys the referee, then shared_ptr/weak_ptr is the way to go.

Marcelo Cantos
  • 181,030
  • 38
  • 327
  • 365
  • darn, reintroducing defining variable in conditions... you get no warning from the compiler doing this ? – v.oddou Aug 03 '15 at 02:00
  • 3
    @v.oddou What do you mean? – Marcelo Cantos Aug 03 '15 at 02:01
  • 1
    https://msdn.microsoft.com/en-us/library/7hw7c1he(v=vs.100).aspx and http://stackoverflow.com/q/17681535/893406 – v.oddou Aug 03 '15 at 02:33
  • @v.oddou I think declaring variable in condition is ok. I got no warning with level 4 and visual studio 2012. I tested `if (int x = 1)` and `if (auto x = 1)` – Marson Mao Aug 03 '15 at 02:50
  • 4
    @v.oddou Mixing declaration, initialization and test is a common idiom. It's not the same as assignment and test, e.g.: `if (a = b) { … }`. – Marcelo Cantos Aug 03 '15 at 02:53
  • ok but I guess it's quick to mind-link the 2 and just tag then both as smelly practice. that's just my book though. I will certainly not downvote for personal preference :) – v.oddou Aug 03 '15 at 03:00
  • 4
    @v.oddou: This isn't just personal preference. It's a well-established idiom in the C++ community, with the huge benefit that it's impossible to accidentally use `p` when the weak_ptr's referee has been destructed. In fact, the [cppreference entry for weak_ptr](http://en.cppreference.com/w/cpp/memory/weak_ptr) uses exactly this technique. – Marcelo Cantos Aug 03 '15 at 03:05
  • I see ! I like this guarantee. I'll even think of adopting this style next time I work with weak refs. Until now I always made it two lines "auto p = x.lock(); if (p) {...}" simply because I kind of burned in my mind as a young programmer "no assignation in conditions" as a hard rule. I'm surely not the only one.. – v.oddou Aug 03 '15 at 03:14
  • 2
    @v.oddou, that's a silly rule, especially if the initialization is the result of a function call that could go either way. With your rule, the scope of p is artificially extended past the conditional. – jbruni Aug 03 '15 at 05:20
  • 1
    @jbruni: It seems pretty obvious to me that v.oddou gets it now. – Marcelo Cantos Aug 03 '15 at 05:23
  • @MarceloCantos with if(auto p = wp.lock()) allows me to modify wp? I mean if I modify p does wp change? (I think it shouldn't) When I use the pointer p, i'm using another pointer with the same value but it's not like a "reference", right? (Maybe like iterators?, they move between pointers without owning) – Umuril Lyerood Aug 03 '15 at 10:48
  • 1
    @UmurilLyerood I don't understand the question. Why would you want to modify `p`? Its purpose is to hold (for the duration of the block) a shared_ptr to the object referenced by `wp`. In any event, there's no special relationship between `p` and `wp`, and assigning either of them a different value won't affect the other. – Marcelo Cantos Aug 03 '15 at 11:09
  • @MarceloCantos I want to modify what wp points to. wp.reset(xxx), if then I use p, does what p "reset" too? – Umuril Lyerood Aug 03 '15 at 11:15
  • 1
    @UmurilLyerood I don't know how to make it any clearer than I already have. – Marcelo Cantos Aug 03 '15 at 11:26
2

Good question,

I find by experience that using shared_ptr/weak_ptr, and even shared_ptr at all by extension, is often resulting in some kind of a big blurred design.

Some people have advocated in the past that shared_ptr could be considered harmful, period. Because it makes ownership floating. And that ownership should be clear by design. I'm not sure if I want to appropriate myself this advice and advocate it too; but for sure I'll repeat it here for the sake of answering the question.

Also, one can consider that using shared_ptr everywhere is just the same as using garbage collection. It's just reference counting, but it behaves the same in the end. Only the performance is less good, it has been demonstrated in the past, a good garbage collection engine is faster than everything ref counted. (this is surely because of CPU atomic instructions and barriers needed in shared_ptr, but I speculate.)

Maybe you should consider moving to a truely well garbage collected language/platform, such as C# on .NET 4 ?

Otherwise, how about moving away from the direct pointers scheme and use identifiers, you can make a manager pattern, with your 2 indexing data structures private in this manager. And the clients only ever see and work with identifiers (int? uint64_t? at your discretion) through the API of the manager.

The problem I find with this approach is the need to repeat the whole API of your manipulated objects in the manager. This is painful and doesn't respect DRY.

Otherwise, have you considered that maybe your data structures are not inherently necessary ?
What I mean is that, when you store pointers anyway, the indexing data structur tends to not be so large. It is only N*sizeof(ptr_t) and not N*sizeof(value_t). Which suddenly makes std::vector an excellent candidate in all circumstances. I mean the vector is already the best data structure for almost all usages, there are many advocates of this theory.

If your vector only holds pointer, do yourself a favor and alleviate the shared_ptr overhead by using a boost::ptr_vector instead.

I hope I brought some perspective.

v.oddou
  • 6,476
  • 3
  • 32
  • 63
  • 1
    I'm not sure I understand what you are advocating here. You seem to be more saying what _not_ to use than what they _should_ use. If they use `boost::ptr_vector` for the main container the question remains what to use for the secondary container. – Chris Drew Aug 03 '15 at 02:50
  • What I advocate is to not use a second collection. Usually when we use 2nd collections it is because it fits a given query type. Like bi-maps. bi-maps a great because you can query by key or value with the same efficiency. What I say is fu** this, just do some linear searches (or sort your vector) but keep one collection. If possible of course. The OP is kind of loose in his scenario so we don't know the constraints. But from a basic design POV, we should respect KISS. – v.oddou Aug 03 '15 at 02:58
  • 6
    More than half the text above is completely irrelevant to the OP's question: the performance merits of reference counting vs garbage collection, suggestion to move to a garbage collected language, merits of vector as container, and is short on explanation on manager pattern and ptr_vector. (I downvoted, and it asked me to leave a comment, so I did). If the irrelevant parts were removed and the relevant parts expanded it would be a much better answer. – Nir Friedman Aug 03 '15 at 03:49
  • I agree with Nir. On the subject of `boost::ptr_vector`, the fact it [doesn't play nicely](http://stackoverflow.com/a/9472379/3422652) with `std::algorithms` is enough to put me off. – Chris Drew Aug 03 '15 at 04:05