Compare space complexity of these binary tree node implementations?

Question

Implementation 1:

struct treenode{
    int data;
    struct treenode children[2];
}

Implementation 2:

struct treenode{
    int data;
    struct treenode *children;
}

Here's what I'm thinking:

Implementation 2 would be better in terms of space complexity because

Undefined children wouldn't need a malloc in imp. 2, while (I presume) the arrays in imp.1 automatically malloc space for two treenodes on declaration?

However, each node ultimately just carries an int and a pointer for both implementations, so the space complexity can't differ that much, right? I used implementation 1 to avoid the possibility of using double pointers and getting confused (lol...), and my program ended up getting "Killed" when I tried building a tree with a large amount of data. Many of my classmates didn't have this problem, and I deviated from how the prof hinted that we should build the tree (using implementation 2).

Is there a good chance that I had this issue because I used implementation 1? Why is it so much more problematic if both nodes ultimately carry the same thing? Is it because big data leads to a ton of leaf nodes that get malloced in one case (1), but not in the other (2)? Or am I just completely wrong with my interpretation of how the space complexity differs in each case?

When talking about space complexity, you should also take into account that every dynamic memory allocation (i.e. malloc) takes additional space for internal bookkeeping (i.e. overhead). My guess is that each allocation would require 8-32 bytes, however that is a wild guess of mine and depends on the implementation. — Andreas Wenzel, Dec 03 '19 at 00:16
The malloc overhead would increase the space required by some constant factor but wouldn't change the *complexity*. — Willis Blackburn, Dec 03 '19 at 00:18
But if we had more leaf treenodes, wouldn't that mean that the extra mallocs would grow with a greater input (data) size? — Stoodent, Dec 03 '19 at 00:20
Re implementation 1: When you malloc a treenode, the space for the two children is also malloc'ed at the same time (a single block of memory.) Those two children will also be garbage-filled, so you would need to be sure to zero initialize that memory (or use calloc.) The more conventional method is your implementation 2, pointers. It will be smaller, conventional, and less prone to erronous data. Also, Relinking (deleting/adding) node is easy with pointers, but a PITA with your array. — Level 42, Dec 03 '19 at 00:26
I'm still trying to grok how implementation 1 would even work given that it recursively includes two of itself and would therefore be infinitely large. But generally a simple program will malloc once for each tree node, therefore, the amount of memory required is proportional to the number of nodes. If we did something like malloc one block of memory for all the nodes all up front, then the malloc overhead per node would be basically zero, but the amount of memory required by the program would still be proportional to the number of nodes. — Willis Blackburn, Dec 03 '19 at 00:26
Edit: my bad, I didn't notice that was a reply to Andreas (and I have no idea what malloc overhead is lol) — Stoodent, Dec 03 '19 at 00:26
@Level42 But each of those children would include two more children, etc. — Willis Blackburn, Dec 03 '19 at 00:27
@WillisBlackburn It did work in my case - arrays are pointers, so I'm thinking that each node was storing a pointer rather than the x bytes that two new nodes take up (Well it worked for smaller data) — Stoodent, Dec 03 '19 at 00:29
@Level42 Yeah, I didn't think of that. Luckily our assignment just entailed building a massive tree and being able to search it. — Stoodent, Dec 03 '19 at 00:31
@Stoodent I don't see how implementation 1 could have worked at all. Could you post the complete program? — Willis Blackburn, Dec 03 '19 at 00:33
@Stoodent, arrays are *storage* when you allocate them, either with malloc or on the stack. In your case, each element of the array will be sizeof(struct treenode). So struct treenode foo[2] is (sizeof(struct treenode) * 2) — Level 42, Dec 03 '19 at 00:36
@WillisBlackburn, well that's a whole other problem. I was just trying the resolve the array vs pointer thing. His Implementation 1 would never compile in the first place. — Level 42, Dec 03 '19 at 00:42
@WillisBlackburn am I able to attach a zip here somehwere? My assignment was too spaghetti to put on my github lmao — Stoodent, Dec 03 '19 at 00:43
Wait......... I didn't even do it like I said in implementation 1, I used two treenode pointers named leftchild and rightchild. Sorry for the massive troll everyone!!!! I did this assignment a month ago... — Stoodent, Dec 03 '19 at 00:46
Okay, that sounds more like it. So implementation 1 has two pointer fields and what's the other option, a `struct treenode` pointer that points to a buffer of two `struct treenode`s? — Willis Blackburn, Dec 03 '19 at 00:50
Yep. I think I may have even followed the teacher's suggestion. My program getting killed may have had to do with how I implemented my functions, etc. then — Stoodent, Dec 03 '19 at 00:56

Willis Blackburn · Accepted Answer · 2019-12-03T00:31:13.753

1

Have you tried to compile implementation 1?

Let's think about this for a minute. What's the size of struct treenode in implementation 1?

Let's say data is 4 bytes. What is the size of children? It's a 2-element array of struct treenode so it must be 2 times the size of struct treenode. So what's the size of struct treenode?

Let's say data is 4 bytes... wait a minute. You can see the problem here.

When I wrote this answer, implementation 1 was defined as:

struct treenode{
    int data;
    struct treenode children[2];
}

I suspect this is not what was in the actual program that the OP wrote, since it would not have compiled. Perhaps he meant:

struct treenode{
    int data;
    struct treenode *children[2];
}

But in that case I'm wondering what implementation 2 was.

edited Dec 03 '19 at 00:31

answered Dec 03 '19 at 00:14

Willis Blackburn

8,068
19
36

Aren't arrays pointers though? Rather than storing the actual treenodes in each node, wouldn't we be storing the pointer to the memory location of a block of memory holding 2 treenodes? – Stoodent Dec 03 '19 at 00:24
@Stoodent: Arrays [decay](https://stackoverflow.com/questions/1461432/what-is-array-decaying) to pointers under certain conditions. However, declaring an array is generally not the same as declaring a pointer. – Andreas Wenzel Dec 03 '19 at 00:29
@AndreasWenzel so arrays can be "flattened"/"casted" to pointers but in my case each node is actually storing the bytes of its child node? – Stoodent Dec 03 '19 at 00:36
`in my case each node is actually storing the bytes of its child node` - and those child nodes store the byes of their childs, and their childs store the bytes of their child and those childs store the bytes of their great-great-grandsons and those great-great-grandsons store the bytes of their great-great-great... so the size of struct would be, like, infinite. – KamilCuk Dec 03 '19 at 00:39
@Stoodent: I'm not sure if the term "flattened" is the appropriate term, but yes, in your implementation 1, your node is actually (recursively) storing the bytes of its children nodes, whereas implementation 2 only contains a pointer. – Andreas Wenzel Dec 03 '19 at 00:41

Compare space complexity of these binary tree node implementations?

1 Answers1