10

I have a directed acyclic graph implemented by Graph and Node classes. Each node has a list of pointers to childern and a list of pointers to parents. I added parents recently because some algorithms demanded fast access to parent list and the graph is small, just few connections per node, so there's no memory issue.

The Child list uses std::shared_ptr so that nodes are kept in memory at least as long as they have parents. But I don't want a node to own its parents, so I used weak_ptr for the pointers to parents.

But then there was a problem with the algorithms. An algorithm must create a new shared_ptr from the weak_ptr, so I can't directly use operator==, and using standard functions such as std::find() require writing a lambda function which called my_weak_ptr.lock() and then compares it to some shared_ptr.

If I switch to shared_ptr, any small bug in the code reponsible for node removal may lead to a memory leak. Or if I have a pointer to a node already removed, the code will be able to access a node not supposed to exist, so finding some bugs may become a lot harder. But working with shared_ptr is as safe as weak_ptr in terms of not dereferencing/deleting/etc. when not supposed to, (so it's better than raw C++ pointer) and std::find() can be used directly because shared_ptr can be dereferenced, unlike weak_ptr.

Is there a "better" design here, or it's an issue of this specific situation depending on e.g. how much does it matter if I do the extra operation of weak_ptr::lock() or take a risk of hard-to-find bugs?

  • Why aren't you removing the references from the children when the parent is destroyed? – Yochai Timmer Jan 28 '13 at 13:01
  • When a node has no parents and is a candidate for automatic deletion, by that time its parent list should be empty and of course no node points to it. But if by mistake something doesn't happen right, if there's any kind of bug there - it would be hard to find. What I'm saying is, children lists and parent lists have to be synchronized perfectly. – cfa45ca55111016ee9269f0a52e771 Jan 28 '13 at 13:15
  • But with weak_ptr you care less because the only bugs there are visible in the GUI representation of the graph, are there's no memory issues to worry about. With shared_ptr parent list, either you trust the lists to be correct, which is risky, or you have to test the lists, then e.g. throw exceptions. Which means slower algorithms.... it's a tradeoff, that's why I'm asking this question – cfa45ca55111016ee9269f0a52e771 Jan 28 '13 at 13:18
  • What's wrong with the lambda? – Andy Prowl Jan 28 '13 at 13:27
  • With std::find_if() taking a lambda, when you search through a parent list, each weak_ptr has to be lock()ed and then the resulting shared_ptr compared to the other function parameter. As long as it's not too slow, it's fine, but it's easier and faster to compare shared_ptrs directly using operator== automatically called by std::find() (for other opeations too, with lambda or withour - you have to lock() every time) – cfa45ca55111016ee9269f0a52e771 Jan 28 '13 at 13:32
  • The locking inside the lambda should not do too much harm to your performance, it's a rather lightweight operation. – Arne Mertz Jan 28 '13 at 13:47
  • The answer of mine to another question might help you a bit: http://stackoverflow.com/a/8614314/160206. – Björn Pollex Jan 30 '13 at 19:52

2 Answers2

13

As you said yourself, using shared_ptr in both directions will create circles that create mem leaks and are hard to find and to break - you will lose (nearly) all of the benefits shared_ptr provides. So weak_ptr it shall be.

You say your algorithms have to lock the weak_ptr - I beg to differ. The algorithms have to get a parent shared_ptr from the node. It's the node's task to lock the parent weak_ptr and give back the result, either set correctly to a parent node or to NULL.

It's an implementation detail wether the nodes store their parents as shared_ptr or weak_ptr. Encapsulate that detail by only providing shared_ptrs to any clients.

class Node
{
  /* ... */
  std::weak_ptr<Node> parent;
public:
  std::shared_ptr<Node> getParent()
  {
    return parent.lock();
  }
};

Edit: Of course conceptually the same applies if there's more than one parent.

Edit2: In the comments you mention algorithms iterating over your parents list, making it necessary to write lambdas for each algorithm. If you use those algorithms often, consider to write an iterator adapter that automatically locks the target weak_ptr and gives back a shared_ptr:

template <class WPIterator>
struct LockTheWeakIterator
{
  //static_assert that WPiterator's value_type is some weak_ptr
  //typedef all those iterator typedefs
  typedef typename WPIterator::value_type::element_type element_type;

  shared_ptr<element_type> operator*()
  { return iter->lock(); }

  //provide all the other operators - boost.operators might help with that...

  WPIterator iter;
};

template <class IT>
LockTheWeakIterator<It> lockTheWeak(It iter);


//somewhere...
auto theParentIter = std::find_if(lockTheWeak(parents.begin()), 
  lockTheWeak(parents.end()), 
  whatIAmLookingFor);
Arne Mertz
  • 24,171
  • 3
  • 51
  • 90
  • I agree. In the higher-level Node methods, there should be no weak_ptr. But the parent-list iterators do point to weak_ptr, so at the lower level there's direct use of lock(). (Otherwise I'll have to write my own std::list::iterator wrapper just for the convenience of not using lock()). Great answer – cfa45ca55111016ee9269f0a52e771 Jan 28 '13 at 13:48
  • Indeed looks simple enough to write once and use everywhere - it's actually useful for any STL container storing weak pointers, worth having it even as a shelf class to use every time the need comes :) – cfa45ca55111016ee9269f0a52e771 Jan 28 '13 at 14:01
  • It's even better than that, I already have begin(), end(), etc. wrappers so you don't even need to type LockTheWeak, the begin() wrapper directly returns a shared_ptr. Thanks for the answer! – cfa45ca55111016ee9269f0a52e771 Jan 28 '13 at 14:05
3

Most directed acyclic graphs should not need weak pointers to their parents at all, but work with plain pointers. In both cases it is each node's responsibility to delete themselves from each client's parent list once they get deleted. If you need to retrieve a shared pointer from some parent pointer in some special situation you can use std::shared_from_this in a similar way as you'd use lock() right now. This way you save the work to create and handle shared pointers all over the place, but only where you need them.

Stacker
  • 1,080
  • 14
  • 20
  • 1) plain pointers don't throw exceptions, they just make weird segfaults. 2) I need nodes to share ownership of the children, so I use shared_ptr in the data structure. And I need parent lists but I don't want children to own the parents, that why I use weak_ptr there in the data structure – cfa45ca55111016ee9269f0a52e771 Jan 28 '13 at 14:10
  • 1
    @1: Changes to the parent list should be done in very few central functions which you can prove to be correct. E.g. in the Node's dtor you want to call something like `for (auto& c : _children) c->unlinkFromParent(*this);` and thats it. @2: I was only talking about the parent list, the child list should still contain shared_ptrs to automatically manage lifetime of children. – Stacker Jan 28 '13 at 14:15
  • 1
    Changes to parent list happen every time there's a change to a child list, it means nearly all graph operations change parent lists. Also, raw pointers are not safe. You can easily end up **delete**ing memory. Maybe I'm a paranoid, but to me it feels like a time bomb. Using a smart pointer lowers thhe chance of unexpected results. And the only cost is calling lock(). As long as it's not too slow, it's definitely worth it. Personally raw pointers are the least wanted option for me, it's unnecessarily "more low-level" – cfa45ca55111016ee9269f0a52e771 Jan 28 '13 at 14:23
  • It is not just about calling lock. The above accepted answer does not do any checking of the value returned by lock at all, so you'll get the same access violations if you mess things up :) Also, you have to adjust the parent list in both cases, whether you use raw pointers or weak_ptrs. The different however is, that with raw pointers you don't have to temporarily lock the parent object (which is guaranteed to be valid when always calling unlinkFromParent instead of manually messing up the parent list) each time you only want to take a quick look into it (like reading some member value). – Stacker Jan 28 '13 at 14:29
  • 3
    Additionally I never said that you should expose raw pointers to the outside world. I only proposed using them internally and just in the list of parents for each node. Thus you'll never end up deleting memory which has already been deleted (unless you explicitly write delete somewhere, which would be wrong) – Stacker Jan 28 '13 at 14:35
  • I can check the value of lock() inside the method and throw an exception if it's invalid, telling me exactly where the problem is. Also, weak_ptr has the expired() method. You don't have this functionality with raw pointers. – cfa45ca55111016ee9269f0a52e771 Jan 28 '13 at 14:38
  • Also, if I don't expose a raw pointer, it means I have to make a shared_ptr from it. But that's just like calling lock(), you just transfer the call from the high-level methods to a low level one, but it's the same operation – cfa45ca55111016ee9269f0a52e771 Jan 28 '13 at 14:44
  • Well, generally you can either check consistency explicitly (by calling lock) or guarantee consistency implicitly by design (using very few methods for basic modifications of your data structure). Regarding the issue of the interface: You usually need to access data within the type's methods (aka implementation) way more often than from outside. I remember Herb Sutter briefly talking about the use of shared_ptrs [here](http://channel9.msdn.com/Events/GoingNative/GoingNative-2012/C-11-VC-11-and-Beyond) at 00:38. Its not the same but also a tree data structure and Node* parent is perfectly fine. – Stacker Jan 28 '13 at 14:57
  • Also, Herb Sutter isn't God and he works for Microsoft, so I don't care what he says. Maybe if it was someone without an economical interest... like Richard Stallman :) – cfa45ca55111016ee9269f0a52e771 Jan 28 '13 at 15:19
  • Sure he is not and I don't care for MS, but that happened to be the first video to come into my mind which I could link :P Anyway, I don't know your exact implementation and from what you described it sounds like a good candidate to still use raw pointers for the parent list. Feel free to like (or hate) this alternative approach ;o – Stacker Jan 28 '13 at 15:29
  • For now I'll be using weak_ptr, but I'll keep a link to this discussion in my directed-acyclic-graph of browser bookmarks :) – cfa45ca55111016ee9269f0a52e771 Jan 28 '13 at 15:51