C++ how should I structure a tree

Question

I am currently working on a large data tree. I need to navigate the tree, return a subnode and change its value.

class Node {
    short value;
    std::vector<Node> children;

    Node walk(int step) {
        return children[step];
    }
}

Will the return of the Walk function create a copy of the child Node or do I have to return a pointer? How should I Link the Nodes?

also another question:

what's the difference between these two and which should I use to create new nodes:

 Node newNode;
 // or
 Node newNode = *new Node();

Edit

I tried std::vector<Node*> children; but this lead to memory leaks when deleting. Would Node& walk(int step) work too without using pointers?

First question, yes, it makes copies. Second question, the second code line is a recipe for a memory leak. — WhozCraig, Mar 26 '20 at 17:13
You link to other nodes in the tree (like the children) using *pointers*. That is, you need to create a vector of *pointers* to `Node`. — Some programmer dude, Mar 26 '20 at 17:14
I tried `std::vector children;` but this lead to memory leaks when deleting. Would `Node& walk(int step) ` work too without using pointers? — Moritz03, Mar 26 '20 at 17:19
@Moritz03 You can try to use a std::vector> to make things easier regarding memory management. — Suthiro, Mar 26 '20 at 17:25
`shared_ptr` has a lot of overhead. Only use it if you really are sharing [ownership](https://stackoverflow.com/questions/49024982/what-is-ownership-of-resources-or-pointers) of the node (and in a tree you probably aren't). I would start with your first code example and change `Node walk(int step)` to `Node & walk(int step)` and see if that met the project requirements because it's dead simple. Note: this approach can be extremely slow and wasteful if you add, move and remove nodes frequently. — user4581301, Mar 26 '20 at 17:27
@user4581301 if they is not constantly creating and deleting enormous amounts of nodes, I believe the overhead is negligible. See [answer](https://stackoverflow.com/a/22296124) — Suthiro, Mar 26 '20 at 18:09
@user4581301, I believe some of the overhead can be mitigated by using `std::make_shared` to create the shared_ptr. This reduces the cache invalidation by making the locality of reference better. — Abdus Khazi, Mar 26 '20 at 18:27
`make_shared` may help with one case, the counter with respect to the the node allocation, but can't help with locality between the nodes. Depending on the access pattern while traversing the tree that could be important. The use you, @AbdusKhazi , get out of shared and weak pointer in your answer's good for protecting against the node being invalidated while it's being inspected, but we don't know if that's a case the asker needs to worry about. Nor do we know if the tree is static or frequently rearranged. Or if the tree will be too large to be recursively destroyed by smart pointers. — user4581301, Mar 26 '20 at 18:40
Given what we little know, the general approach suggested by Some programmer dude is probably the best we should offer. — user4581301, Mar 26 '20 at 18:41

Abdus Khazi · Accepted Answer · 2020-03-26T19:40:57.277

To prevent memory leaks and keep your code clean, you can use shared pointers. Return weak pointer from walk function. We must return the weak pointer so as not to make the client the owner of a particular node.

#include<memory>
#include<vector>

class Node;
using NodePtr  = std::shared_ptr<Node>;
using NodeWPtr = std::weak_ptr<Node>;

class Node {
    short value;
    std::vector<NodePtr> children;

    NodeWPtr walk(int step) {
        //Your Algorithm.
        return children[step];
    }
};

Node newNode = *new Node(); creates a Node on the heap and then copies it to a node on the stack. This however, has memory leak and it does not do anything significantly different than just creating a node directly on the stack.

Node newNode; creates Node object on the stack. I would suggest you use this as it directly describes your intent.

Some people have suggested the use of std::unique_ptr instead of std::shared_ptr because std::shared_ptr has a large overhead. However, if we do use std::unique_ptr, we would have to return an std::observer_ptr from the walk function which is not fully implemented in the latest c++ compilers as far as I know.

EDIT

The following 2 class designs are noteworthy of consideration according to the discussions in the comments.

//Without storing any pointers in the vector
class Node {
    short value;
    std::vector<Node> children;

    Node* walk(int step) {
        //Your Algorithm.
        return &children[step];
    }
};

or

//Using unique_ptr
class Node;
using NodePtr  = std::unique_ptr<Node>;

class Node {
    short value;
    std::vector<NodePtr> children;

    Node* walk(int step) {
        //Your Algorithm.
        return children[step].get();
    }
};

My pitch would be no pointer at all and let the `vector` manage the whole thing, but I'm counting on a lot of assumptions (tree is built and then rarely changed - and certainly not while a node is in use - and short enough to recursively destroy being the big two) — user4581301, Mar 26 '20 at 18:44
`[…]because std::shared_ptr has a large overhead.` it is not about the overhead, but about semantics, you should only use `shard_ptr` if you really want/need to express shared ownership. `[…] if we do use std::unique_ptr, we would have to return an std::observer_ptr[…]` returning a raw point until then is still fine. It would be great to have `std::observer_ptr` because it is semantically a none owning pointer, and for a raw pointer you need to explain in your project that you e.g. only use it as none owning pointers. — t.niese, Mar 26 '20 at 19:02

C++ how should I structure a tree

1 Answers1