2

I know that the go-to rule is to always use shared_ptrs or other smart pointers. However, I'm not sure how exactly to implement it without copy ctor'ing with std::make_shared. I'll elaborate:

The problem

I want to create a tree-like expression, like:

struct foo{
    std::vector<foo*> v;
};

This allows me to do the following:

foo add(foo& f1, foo& f2){
    return foo {&f1, &f2};
}

That's good, because then the two will become children of a new node.

I want the following to be evaluated correctly:

foo x, y;
auto z = add(x,y);
assert(z.v[0] == &x && z.v[1] == &y);

This way, there's no copying and it's all good. However, we have a resource management issue here.

Addressing the resource management issue

If these elements are allocated on the heap, and someone unknowingly might, then we'll run into a memory leak. So the best way to do this is to use smart pointers for RAII:

struct foo{
    std::vector<std::shared_ptr<foo> > v;
};

foo add(foo& f1, foo& f2){
    return foo {std::make_shared<foo>(f1), std::make_shared<foo>(f2)};
}

This is good, but we're calling the copy ctor of f1 and f2 here. Suppose copying takes a long time and we don't want to incur that cost.

We definitely, cannot do:

foo add(foo& f1, foo& f2){
    return foo {std::shared_ptr<foo>(&f1), std::shared_ptr<foo>(&f2)};
}

because whenever we remove z in z = x+y, we will call delete on each of x and y, which could be allocated on the stack. And we definitely do not want to delete things on the stack.

So what should I do in this case?

Let me know if I should supply more information on the context.

OneRaynyDay
  • 3,658
  • 2
  • 23
  • 56
  • I think it is very bad idea to store pointers to something you have no control over it. So just store the copies of foo objects as any stl::container will do. I think you should think in terms of memory management emulating memory management of std::list. If user for some reason will want to store pointer he will create your tree instead of tree - but for you it should be transparent. – Artemy Vysotsky Sep 04 '17 at 05:14
  • I don't want the user to have that ability to switch between pointers and instances though. If I have `vector`, then that means all insert operations will use either move semantics or copy ctor, and that could be slow. – OneRaynyDay Sep 04 '17 at 05:21
  • Have you tested this conjecture? It is very possible that throwing away contiguity of data, and thus easy cache loading, will cost you much more time due to the increased odds of cache misses. Sure, the CPU does less work on copying, but more work pointer-chasing. The program runs a significant risk of sitting around waiting on storage. – user4581301 Sep 04 '17 at 05:40
  • @user4581301 that is a fair point. I have not. However, with arbitrary size of copy, there will be a point where pointer-chasing is less time spent. For example, if each one of these `foo` also had a member that is a giant matrix. Perhaps it would be smarter design to then have reference/ptr on the giant matrix instead, I honestly don't know. – OneRaynyDay Sep 04 '17 at 05:44
  • Don't use `shared_ptr` unless you absolutely have to (when you do not know which object needs to delete it). The "go-to rule" is to use a `unique_ptr` to express ownership (if only one object is responsible for its life time). – Galik Sep 04 '17 at 06:12
  • @Galik, I was wondering about that. I wasn't sure if `shared_ptr` was necessary in this case, but say `auto z = add(x,y)` and then later `auto w = add(y,z)`. Who's going to be the owner here of y? Is it `z` or is it `w`? To me it's not clear, so I chose `std::shared_ptr`. It'd be nice to use `std::unique_ptr` though. – OneRaynyDay Sep 04 '17 at 06:15
  • If sharing nodes between two different trees makes sense in your application, then `shared_ptr` is what you need. But remember, in that case, whatever happens to shared nodes is shared between all trees that contain them. So it all depends on how you want it to behave. – Galik Sep 04 '17 at 06:19
  • I suspect that usually people like to **copy** branches from one tree to another and that sharing branches is less common. – Galik Sep 04 '17 at 06:21
  • @Galik for the problem statement I put the assumption that it was a tree, but in actuality I'm making a directed graph. perhaps that would make more sense to share a common node `x` that could branch out in 2 ways, and then merge back to some other variable. – OneRaynyDay Sep 04 '17 at 06:25
  • There is a potential issue if your node connections can be cyclic. Smart pointers don't work well in that situation. This question addresses that. https://stackoverflow.com/questions/27348396/smart-pointers-for-graph-representation-vertex-neighbors-in-c11 If you have cyclic connections you can put all the nodes in a different container and just put *raw pointers* in the graph to represent the connections. – Galik Sep 04 '17 at 06:36
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/153588/discussion-between-oneraynyday-and-galik). – OneRaynyDay Sep 04 '17 at 06:42
  • There is no resource management problems in the given code. Yes, `foo` takes no ownership of the items passed and makes calling code responsible for ensuring that stored pointers are valid for the lifetime of given `foo` instance and object pointed by those pointers are destroyed correctly. But that isn't a problem, it's just a limited responsibility. I guess "If these elements are allocated on the heap, and someone unknowingly might, then we'll run into a memory leak." sentence is not complete or something. The only issues seems to be the storing of raw pointers instead of reference wrappers. – user7860670 Sep 04 '17 at 09:18

1 Answers1

3

Accept by smart pointer. Applying the notion of smart pointers correctly means you have to think about the ownership semantics of your data, and not about calling new/delete automatically.

You accept a reference to a non-const object. This does not imply sharing, only modification of the objects. And as you correctly noted, just creating smart pointers to them spells disaster.

The caller of add (even if it's just you) must know the data they pass is about to be shared. And the only way for them to know is if they pass the smart pointer themselves. So you should change add into this:

foo add(std::shared_ptr<foo> f1, std::shared_ptr<foo> f2){
    return foo{f1, f2};
}

Now, the only way a user of add can blow this up is intentionally. But they most likely won't. Furthermore, now said user has much more flexibility in calling your code. They can control the allocator and deleter for each object they pass.

StoryTeller - Unslander Monica
  • 165,132
  • 21
  • 377
  • 458
  • This forces the tree to contain only heap allocated objects. So what about stack allocated objects? Is there no way to force the tree to contain only stack allocated objects? Or even better, is there a way for it to contain both safely with no memory leak? – Jerry Jeremiah Sep 04 '17 at 05:16
  • @JerryJeremiah - It forces nothing. I specifically finished by saying the caller can control the allocator and deleter. – StoryTeller - Unslander Monica Sep 04 '17 at 05:17
  • @JerryJeremiah That's exactly what I was about to ask. i.e., doesn't this force the user to allocate on the heap? By making the only interface to the user via `shared_ptr`, there's no way I can pass in something like `add(f1, f2)`. – OneRaynyDay Sep 04 '17 at 05:17
  • @OneRaynyDay - Stop worrying about the "heap" and worry about the ownership semantics. There's nothing sacred about the "stack". – StoryTeller - Unslander Monica Sep 04 '17 at 05:25
  • @OneRaynyDay - It seems you are too concerned with "optimizing" your code before its even finished being written. Remember that premature optimization like this is to be avoided. – StoryTeller - Unslander Monica Sep 04 '17 at 05:28
  • @StoryTeller I agree, which is why I am currently using a vector of shared_ptrs, and constructing them via `make_shared` inside the add like in the first example. I am just curious though. And I do agree with your point that there's nothing "sacred" about the stack, but I do think that since stack is a type of "ownership" in itself(the stack will allocate and deallocate), we shouldn't create `shared_ptr`'s that will also own it if the user were to do something like `foo f1;`, rather than `foo f1 = new foo()`. If we only accept smart ptr ownership, then we can't allocate on stack. – OneRaynyDay Sep 04 '17 at 05:30
  • 2
    @OneRaynyDay - `std::shared_ptr` can be constructed in a multitude of ways. You can pass a custom deleter functor. Which can be a no-op. So putting objects with automatic storage duration into a `foo` is also possible. – StoryTeller - Unslander Monica Sep 04 '17 at 05:38
  • @StoryTeller Okay, that's fair. I forgot about that. I'll accept your answer, thank you :) – OneRaynyDay Sep 04 '17 at 05:45