4

Often we need trees in algorithms and I get out a tree with lots of pointers and recursion.
Sometimes I need more speed an I put the tree into an 2D array like so:

Example of a binary tree stored in an array
+-----------------+
|0eeeeeeeeeeeeeeee| //no pointers needed, parent/child, is y dimension,
|11       dddddddd| //sibbling is x dimension of the array.
|2222         cccc|  //The 123 tree is stored root up.
|33333333       bb|  //Notice how the abc-tree is stored upside down 
|4444444444444444a|  //The wasted space in the middle, is offset by the fact 
+-----------------+  //that you do not need up, down, and sibbling pointers.

I love this structure because it allowes me speed up options that I don't have when using pointers and recursion.

But notice that wasted space in the middle....

How do I get rid of/reuse that wasted space?

Requirements
I only use this structure if I need every last bit of speed, so a solution with lots of translations and address calculations to get to that space will not be helpful.

Johan
  • 74,508
  • 24
  • 191
  • 319

1 Answers1

13

A binary tree can be stored in an array more efficiently as explained here: http://en.wikipedia.org/wiki/Binary_tree#Arrays:

enter image description here

From Wikipedia:

In this compact arrangement, if a node has an index i, its children are found at indices (2i + 1) (for the left child) and (2i + 2) (for the right), while its parent (if any) is found at index floor((i-1)/2) (assuming the root has index zero).

This method benefits from more compact storage and better locality of reference, particularly during a preorder traversal. However, it is expensive to grow and wastes space proportional to (2H - n) for a tree of height H with n nodes.

In other words, it will waste space if your tree is not a complete binary tree, but it will still be a bit more compact than a plain 2D array.

Community
  • 1
  • 1
vgru
  • 49,838
  • 16
  • 120
  • 201
  • 1
    +1 Groo exactly the info I was looking for and it's cool to have the sibblings in sequential order. – Johan Jun 17 '11 at 11:00
  • I disagree with wikipedia that it is expensive to grow. If I preallocate an address-space with VirtualAlloc and call in the physical memory in chucks, it's not expensive at all. And I don't need to move the tree around in memory. – Johan Jun 17 '11 at 11:03
  • @Johan: And when the tree outgrows your pre-allocated address space? – Vatine Jun 17 '11 at 11:12
  • @Vatine, then I'm stuck, but I have 4GB of virtual memory in 32 bit and exabytes of virtual memory in 64 bit, so I will run out of real memory long before I run out of virtual memory. That's the beauty of virtual memory :-). – Johan Jun 17 '11 at 11:18
  • 1
    @Vatine: I believe Johan wanted to say that the cost of dynamic array allocation depends only on the size of the chunks which you allocate. So, if you want it to be compact at all times, you will have to allocate often (copying all data each time). If you allocate larger chunks, you will only need to do this once or twice, it's not a big deal (but, on the other hand, it means you are wasting space). – vgru Jun 17 '11 at 11:18
  • @Groo, true, but you have complete freedom in the amount of space you choose to waste and you can always choose to reclaim space if you've found out your tree will not grow anymore. If your tree is growing you are not wasting space, but just allocating it early :-). – Johan Jun 17 '11 at 11:21
  • @JOhan: But address space set aside for your tree is not available for other uses. Sure, you can grab a chunk that "should be enough", but can you grow that virtual allocation once it's been filled? FWIW, the amortized cost for extending an array is O(1), if you double the size each time. – Vatine Jun 17 '11 at 11:23
  • @Vatine, the virtual address space is not available for other uses **in your application** every application has it's own virtual address space though, 2GB by default on Win32 per application. So I can run 10 applications that each reserve 1GB virtual memory with no problem. – Johan Jun 17 '11 at 11:30
  • @Johan: Yes, I am well aware of that. Still, it's only a couple of exabytes, so a very finite resource. I must admit to having a bias towards thinking big (I mean, if you use disk, you use a couple of hundred terabytes at each location, with a couple of dozen locations, right?). So, once you outgrow your 1GB allocation (that's only, what, 30 levels of tree, if each element is 8 bits large), you *still* need to re-allocate. – Vatine Jun 17 '11 at 11:44