1

Let's say I have a binary search tree with N nodes. Or an quad tree, or octtree, if these make any difference.

Then let's say I run an algorithm which changes all the keys in the tree. It seems to be complicated to move stuff around. Perhaps there actually is a good algorithm for this, but let's say I take the simple route: I iterate the tree, store the keys in a list, and rebuild the tree from scratch by repeated re-insertion. That is, a complete rebuild.

What time complexity can I expect to get for doing this kind of rebuilding? There are N nodes and each insertion takes log N time, but I can't wrap my head around what will happen as the tree grows.

Bernhard Barker
  • 54,589
  • 14
  • 104
  • 138
bombax
  • 1,189
  • 8
  • 26
  • 1
    Yeah a binary search tree, like I wrote above (just didn't wanna say it all the time... maybe I should have used a typedef... hahaha) – bombax May 07 '14 at 23:04

2 Answers2

3

Since insertion into a (balanced) BST takes Θ(log n), and the tree starts off with 0 nodes, then 1 node, then 2, etc., we get a running time of:

Θ(log 1 + log 2 + ... + log n)

From here, we can just directly say the running time is Θ(n log n) since: (related post)

log 1 + log 2 + ... + log n <= log n + log n + ... + log n
                             = n log n

log 1 + ... + log n/2 + ... + log n >= log n/2 + ... + log n
                                    >= log n/2 + ... + log n/2
                                     = n/2 log (n/2)

This is also related to tree sort.


We can actually prove that we can't do this faster than Θ(n log n):

It's been proven to require at least Θ(n log n) time on average to sort (using a comparison-based algorithm).

Since we can iterate over a BST in Θ(n), we can directly infer that we need at least Θ(n log n) time for the insertion.


Note - for a non-balanced BST, the running time would actually be Θ(n²), since the worst-case insertion time for an unbalanced BST is Θ(n), so we have:

Θ(1 + 2 + ... + n) = Θ(n(n+1)/2) = Θ(n²)
Community
  • 1
  • 1
Bernhard Barker
  • 54,589
  • 14
  • 104
  • 138
  • Though to confirm, for Quad or Octtree the base of the log would be 4 and 8 respectively right? So would that make them a bit faster compared to binary trees? – mankadnandan Jun 18 '23 at 01:56
  • 1
    @mankadnandan If you have a quadtree instead of a BST, your tree would have (roughly) half the depth, but at every node you'd need to perform double as many operations (2 comparisons instead of 1). So a quadtree (or octree) should be about equally as fast (but more complex). You're otherwise correct about log base, but log base doesn't matter because this is just a constant factor and we don't include constant factors in time complexity. Differences in constant factors would be too difficult to account for, because e.g. the time differences between different operations are machine-dependent. – Bernhard Barker Jun 19 '23 at 08:24
  • I see your point there. Thanks for clarification! – mankadnandan Jun 22 '23 at 06:07
0

Assuming that you are using a self-balancing tree, then you've already answered your own question, "There are N nodes and each insertion takes log N time." It's important that the tree be self-balancing, since that is what allows you to make the statement that each insertion takes log N time. With an unbalanced tree, the insertion time could be linear worst case.

user3386109
  • 34,287
  • 7
  • 49
  • 68