10

I am trying to solve this problem..

Given a list of n numbers, we would like to find the smallest and the second smallest numbers from the list. Describe a divide-and-conquer algorithm to solve this problem. Assume that n = 2^k for an integer k. The number of comparisons using your algorithm should not be more than 3n/2 − 2, even in the worst case.

My current solution is to use select algorithm to get the median and then divide the list into L1( contains element less than or equal to the median), R ( median), L2 ( contains all the elements grater than median). Is it correct? If so, what should I do next?

templatetypedef
  • 362,284
  • 104
  • 897
  • 1,065
Faisal
  • 165
  • 1
  • 10
  • The naive approach works fine: compare to the currently_lowest, (maybe) compare to the currently_2nd_lowest. < 3/2 * N compares expected. – wildplasser Oct 16 '13 at 23:47
  • @wildplasser I agree the expected time is good that way, but the statement specifically asks for worst case performance. Worst case would probably be something like [0,x,x,x,x,x,x,x,x] where you'd have to check both lowest and second_lowest for every element past 0. It's also not divide/conquer. – Geobits Oct 16 '13 at 23:55
  • @OP Your current solution is entirely correct. :) Just keep dividing the list up like that and keep operating on the sublist with the smaller elements. Basically make a recursive call to L1 until L1 ends up being a list of just 2 elements. Note that it's very similar to using the quickselect algorithm (k = 2) in combination with median of medians algorithm to choose the pivot. There is lots of information on this method and it's quite easy to implement. The only hard part is understanding how the subroutine `partition` works. – Shashank Oct 17 '13 at 01:17

2 Answers2

9

Note that the median-selection algorithm uses Θ(n) comparisons, but that doesn't mean that it uses at most 3n/2 - 2 comparisons. In fact, I think it uses a lot more, which probably rules out your solution strategy.

As a hint: think of this problem as building an elimination tournament for all 2k; the winner of each round (the smaller of the two numbers) advances to the next. How many comparisons are needed to implement that? Next, notice that the second-smallest number must have "lost" to the smallest number. The second-smallest number is also the smallest number that "lost" to the smallest number. Given this, could you efficiently find the second-smallest number?

Hope this helps!

templatetypedef
  • 362,284
  • 104
  • 897
  • 1,065
  • 1
    "It's also the largest number that "lost" to the smallest number." - I think you meant that the second smallest number is the smallest number that lost to the smallest number. – G. Bach Oct 17 '13 at 00:03
  • Thanks, I actually did not get the idea in your post.Ok, how about this outline: if n==2 compare and return the 1st, 2nd smallest. if n>2 minimum= call(i=1 to n/2), and minimum2= call(i=n/2 to n) and return the minimum values ... would that work? – Faisal Oct 17 '13 at 00:26
  • @Faisal- How would you return the minimum values from the two halves? Also, if you're not understanding my post, perhaps I should edit it. Was there anything in particular you didn't get? – templatetypedef Oct 17 '13 at 00:33
  • @Faisal- Nope, no mergesort. Do you understand how to structure a tournament for 2^n players so that you could determine who the overall winner is? – templatetypedef Oct 17 '13 at 00:43
  • Unfortunately, I did not understand how to structure a tournament for 2^n players. – Faisal Oct 17 '13 at 05:31
  • @Faisal- Pair up all the players and have each pair play. Each winner advances to the next round. Repeat until only one player remains. Does that help? – templatetypedef Oct 17 '13 at 06:38
  • For what I understand, we also need to store the numbers for which every (or at least the smaller one) element was compared to, right? This will give `(n-1) + (log(n)-1)` comparison, as noted by Evgeny also. The `n-1` comes from `n/2 + n/4 + ... + 1` – justhalf Oct 17 '13 at 10:38
  • @justhalf- Yep! Exactly. – templatetypedef Oct 17 '13 at 16:39
  • Thank you all for participating in solving this problem. Thanks @templatetypedef for your quick response and cooperation. – Faisal Oct 17 '13 at 23:44
3

Oh, I just understand it (in Python):

def two_min(arr):
    n = len(arr)
    if n==2: # Oops, we don't consider this as comparison, right?
        if arr[0]<arr[1]:                   # Line 1
            return (arr[0], arr[1])
        else:
            return (arr[1], arr[0])
    (least_left, sec_least_left) = two_min(arr[0:n/2])
    (least_right, sec_least_right) = two_min(arr[n/2:])
    if least_left < least_right:            # Line 2
        least = least_left
        if least_right < sec_least_left:    # Line 3
            return (least, least_right)
        else:
            return (least, sec_least_left)
    else:
        least = least_right
        if least_left < sec_least_right:    # Line 4
            return (least, least_left)
        else:
            return (least, sec_least_right)

In total:

Line 1: There will be exactly n/2 comparisons

Line 2: There will be n/4 + n/8 + ... + 1 comparisons here

Line 3 and Line 4: Exactly one of this will be executed per call to two_min (except when it's called with two elements). We are calling two_min in total n-1 times (since there are that many tournaments), with n/2 of them being called with two elements. So Line 3 and Line 4 contributes to n/2 - 1 comparisons

Combining all of them, we have:

total_comparisons = n/2 + (n/4 + n/8 + ... + 1) + (n/2 - 1)
                  = (n - 1) + (n/2 - 1)
                  = 3n/2 - 2
justhalf
  • 8,960
  • 3
  • 47
  • 74