6

I am reading the basics of splay trees. The amortized cost of an operation is O(log n) over n operations. Some rough basic idea is that when you access a node, you splay it i.e. you take it to root so next time this is quickly accessed and also if the node was deep, it enhances balance-ness of tree.

I don't understand how the tree can perform amortized O(log n) for this sample input:

Say a tree of n nodes is already built. My next n operations are n reads. I access a deep node say at depth n. This takes O(n). True that after this access, the tree will become balanced. But say every time I access the most current deep node. This will never be less than O(log n). then how we can ever compensate for the first costly O(n) operation and bring the amortized cost of each read as O(log n)?

Thanks.

templatetypedef
  • 362,284
  • 104
  • 897
  • 1,065
xyz
  • 8,607
  • 16
  • 66
  • 90
  • 1
    I am not sure how you arrive at O(n) for the first access... However I am not very familiar with splay trees. If you need to O(n) for the first access this would mean a tree containing n elements would be degraded (i.e. a linear list) is this correct for splay trees? – LiKao May 02 '11 at 06:52
  • @LiKao : May be you are right. A single operation may never be O(n), because the self-adjusting tree will ensure that the tree never degrades to a linear list. I myself am not sure of this. But may be it is possible to get sth like O(n/k) which is same as O(n).. – xyz May 02 '11 at 07:30
  • While interesting in theory, I would be wary of actually using this structure. Reads are cheaper than Writes, and the Splay Tree unfortunately write in most cases since it rebalances even on "read-only" operations. It may lead to performance issue. – Matthieu M. May 02 '11 at 14:57

1 Answers1

7

Assuming your analysis is correct and the operations are O(log(n)) per access and O(n) the first time...

If you always access the bottommost element (using some kind of worst-case oracle), a sequence of a accesses will take O(a*log(n) + n). And thus the amortized cost per operation is O((a*log(n) + n)/a)=O(log(n) + n/a) or just O(log(n)) as the number of accesses grows large.

This is the definition of asymptotic average-case performance/time/space, also called "amortized performance/time/space". You are accidentally thinking that a single O(n) step means all steps are at least O(n); one such step is only a constant amount of work in the long run; the O(...) is hiding what's really going on, which is taking the limit of [total amount of work]/[queries]=[average ("amortized") work per query].

This will never be less than O(log n).

It has to be in order to get O(log n) average performance. To get intuition, the following website may be good: http://users.informatik.uni-halle.de/~jopsi/dinf504/chap4.shtml specifically the image http://users.informatik.uni-halle.de/~jopsi/dinf504/splay_example.gif -- it seems that while performing the O(n) operations, you move the path you searched scrunching it towards the top of the tree. You probably only have a finite number of such O(n) operations to perform until the entire tree is balanced.

Here's another way to think about it:

Consider an unbalanced binary search tree. You can spend O(n) time balancing it. Assuming you don't add elements to it*, it takes O(log(n)) amortized time per query to fetch an element. The balancing setup cost is included in the amortized cost because it is effectively a constant which, as demonstrated in the equations in the answer, disappears (is dwarfed) by the infinite amount of work you are doing. (*if you do add elements to it, you need a self-balancing binary search tree, one of which is a splay tree)

ninjagecko
  • 88,546
  • 24
  • 137
  • 145
  • We will have to add rotation/splaying cost as well too, right? It looks like total reorganization cost of after 1st operation will be O(n) – xyz May 02 '11 at 06:34
  • That is fine. Consider an unbalanced binary search tree. You can spend `O(n)` time balancing it, and `O(log(n))` time searching it. Assuming you don't add elements to it*, it takes `O(log(n))` amortized time to fetch an element. **The balancing setup cost is included in the amortized cost** because it is effectively a constant which, as demonstrated in the equations in the answer, disappears (is dwarfed) by the infinite amount of work you are doing. (*if you do add elements to it, you need a self-balancing binary search tree, one of which is a splay tree) – ninjagecko May 02 '11 at 06:49