1

I had a lecture on Big Oh for Merge Sort and I'm confused.

What was shown is:

0 Merges [<----- n -------->] = n

1 Merge [<--n/2--][-n/2--->] = (n/2 + n/2) = n

2 Merges [n/4][n/4][n/4][n/4] = 2(n/4 + n/4) = n

....

log(n) merges = n

Total = (n + n + n + ... + n) = lg n = O(n log n)

I don't understand why (n + n + ... + n) can also be expressed as log base 2 of n and how they got for 2 merges = 2(n/4 + n/4)

5 Answers5

2

In the case of 1 merge, you have two sub arrays to be sorted where each sub-array will take time proportional to n/2 to be sorted. In that sense, to sort those two sub-arrays you need a time proportional to n.

Similarly, when you are doing 2 merges, there are 4 sub arrays to be sorted where each will be taking a time proportional to n/4 which will again sum up to n.

Similarly, if you have n merges, it will take a time proportional to n to sort all the sub-arrays. In that sense, we can write the time taken by merge sort as follows.

T(n) = 2 * T(n/2) + n

You will understand that this recursive call can go deep (say to a height of h) until n/(2^h) = 1. By taking log here, we get h=log(n). That is how log(n) came to the scene. Here log is taken from base 2.

Since you have log(n) steps where each step takes a time proportional to n, total time taken can be expressed as,

n * log(n)

In big O notations, we give this as an upper bound O(nlog(n)). Hope you got the idea.

Following image of the recursion tree will enlighten you further. enter image description here

Imesha Sudasingha
  • 3,462
  • 1
  • 23
  • 34
1

The last line of the following part written in your question,

0 Merges [<----- n -------->] = n

1 Merge [<--n/2--][-n/2--->] = (n/2 + n/2) = n

2 Merges [n/4][n/4][n/4][n/4] = 2(n/4 + n/4) = n

....

n merges = n --This line is incorrect!

is wrong. You will not have total n merges of size n, but Log n merges of size n.

At every level, you divide the problem size into 2 problems of half the size. As you continue diving, the total divisions that you can do is Log n. (How? Let's say total divisions possible is x. Then n = 2x or x = Log2n.)

Since at each level you do a total work of O(n), therefore for Log n levels, the sum total of all work done will be O(n Log n).

displayName
  • 13,888
  • 8
  • 60
  • 75
0

You've got a deep of log(n) and a width of n for your tree. :)

0

The log portion is the result of "how many times can I split my data in two before I have only one element left?" This is the depth of your recursion tree. The multiple of n comes from the fact that for each of those levels in the tree you'll look at every element in your data set once after all merge steps at that level.

recurse downwards:
    n unsorted elements
  [n/2][n/2] split until singletons...
  ...
merge n elements at each step when recursing back up
  [][][]...[][][]
  [ ] ... [ ]
  ...
  [n/2][n/2]
  n sorted elements
synchronizer
  • 1,955
  • 1
  • 14
  • 37
  • log base 2 of n is the number of times it takes to split your data set until only singletons remain. You have "n" per level because in all merge step at a particular level you end up iterating over the entirety of each subarray. Imssha's answer is more specific I admit. Hopefully I've helped a little. – synchronizer Apr 11 '17 at 02:23
0

It's very simple. Each merge takes O(n) as you demonstrated. The number of merges you need to do is log n (base 2), because each merge doubles the size of the sorted sections.

Mark Ransom
  • 299,747
  • 42
  • 398
  • 622