0

So I have a simple snippet of C++ code which is SUPPOSED to insert a node into a binary search tree. It returns true if the value is successfully inserted and false if the value is already in the tree.

struct Node {
  int data;
  Node* parent = nullptr;
  Node* left = nullptr;
  Node* right = nullptr;
};

bool insert(Node& root, int data) {
  if (data > (root.data)) {
    if ((root.right) == nullptr) {
      Node newNode = {data, &root};
      root.right = &newNode;
      return true;
    } else {
      return insert(*root.right, data);
    }
  }
  else if (data < (root.data)) {
    if ((root.left) == nullptr) {
      Node newNode = {data, &root};
      root.left = &newNode;
      return true;
    } else {
      return insert(*root.left, data);
    }
  }
  else {
    return false; // if both values are equal 
  }
}

When testing my function I noticed something peculiar. When I don't print the function's return value, it gives the right answer (20):

  Node root = {50};
  insert(root, 20);
  cout << (root.left->data) << endl;

However, when I do print the return value, it gives the incorrect result (0):

  Node root = {50};
  cout << insert(root, 20) << endl;
  cout << (root.left->data) << endl;

I cannot possibly fathom why this happens, but my best bet is because of some weird memory hijinks, perhaps not allocating new memory for the struct? I come from Python where memory management is handled automatically so I'm still getting used to situations like this.

  • 3
    `Node newNode = {data, &root};` creates a local variable that is destroyed when it goes out of scope. Retaining a pointer to that variable is undefined behavior. – Retired Ninja Jun 01 '21 at 19:31
  • 2
    ***Different output depending on whether or not I print the return value*** This likely means some type of undefined behavior. Did the compiler issue any warnings? Edit: @RetiredNinja found the UB – drescherjm Jun 01 '21 at 19:31
  • Strangely, no it didn't generate warnings or errors, at least not with this set-up: https://godbolt.org/z/fr857M1j9 – user4581301 Jun 01 '21 at 19:37
  • 1
    *I cannot possibly fathom why this happens* -- C++ does not work in the way you believe it works. This: `Node newNode = {data, &root}; root.right = &newNode;` does not create some sort of permanent reference to `newNode` where some garbage collector sees that it still is in use and thus leaves it alone . You need to rewrite your code so that it actually creates Nodes using dynamic memory allocation in some sense. Right now, there isn't a single issuance of `new` or `delete` anywhere in the code, or any usage of smart pointers. – PaulMcKenzie Jun 01 '21 at 19:38
  • But yes, this is a case where you use dynamic allocation. `std::unique_ptr` is well suited to this case. Here's a great presentation on using smart pointers: https://www.youtube.com/watch?v=JfmTagWcqoE And if you wish to torture yourself unnecessarily, you can use `new`. – user4581301 Jun 01 '21 at 19:43
  • Here's some good reading on how to sort out in your head when to allocate storage from what pool: [Why are the terms “automatic” and “dynamic” preferred over the terms “stack” and “heap” in C++ memory management?](https://stackoverflow.com/questions/9181782) – user4581301 Jun 01 '21 at 19:47

1 Answers1

4

You invoke undefined behaviour right there:

  Node newNode = {data, &root};
  root.right = &newNode;

This stores, in your tree, the address of a stack variable. As soon as the function returns, it's not legal anymore to dereference this Node's children. From there, anything could happen.

You probably want something like:

  Node* newNode = new Node;
  newNode.data = data;
  newNode.root = root;
  ...
  root.right = newNode;

Edit: Remember that whenever you put a new in code, you need a matching delete. To avoid this hassle, the modern approach is to use unique_ptr. You should look into that. In your case, since you keep back pointers to the root, you'll need either shared_ptr and/or weak_ptr.

Jeffrey
  • 11,063
  • 1
  • 21
  • 42
  • 1
    Those problems were already present, just at a smaller scale. I believe quite strongly that all OPs need to learn specifically the immediate issue in order to grow. If we push them all the way to unique_ptr and other advanced constructs, IMO, they don't get to learn the whys. But good point, I'll edit with directions. – Jeffrey Jun 01 '21 at 20:06
  • The back pointer to root can just be a raw pointer, since it is not owning. – François Andrieux Jun 01 '21 at 20:17