12

You have a list of n integers and you want the x smallest. For example,

x_smallest([1, 2, 5, 4, 3], 3) should return [1, 2, 3].

I'll vote up unique runtimes within reason and will give the green check to the best runtime.

I'll start with O(n * x): Create an array of length x. Iterate through the list x times, each time pulling out the next smallest integer.

Edits

  • You have no idea how big or small these numbers are ahead of time.
  • You don't care about the final order, you just want the x smallest.
  • This is already being handled in some solutions, but let's say that while you aren't guaranteed a unique list, you aren't going to get a degenerate list either such as [1, 1, 1, 1, 1] either.
Dave Aaron Smith
  • 4,517
  • 32
  • 37
  • Why are you structuring the question like a competition? – Mark Peters Sep 21 '10 at 20:49
  • 5
    I don't know, seemed like a fun way to do it. – Dave Aaron Smith Sep 21 '10 at 21:00
  • 1
    Worst case is O(n * n) or O(n^2). But your algo is like a prematurely terminated selection sort – Bernie Perez Sep 21 '10 at 21:23
  • It is not obvious that the best runtime will be the best algorithmic complexity. Exemple: Aaron proposed to use a sorted skip-list for keeping the best x at all times. Looks like the best algorithmic complixity to O(n log x), but as the skip list usually involve a big constant factor, it is also probably not the fastest. – kriss Sep 21 '10 at 21:29

12 Answers12

13

You can find the k-th smallest element in O(n) time. This has been discussed on StackOverflow before. There are relatively simple randomized algorithms, such as QuickSelect, that run in O(n) expected time and more complicated algorithms that run in O(n) worst-case time.

Given the k-th smallest element you can make one pass over the list to find all elements less than the k-th smallest and you are done. (I assume that the result array does not need to be sorted.)

Overall run-time is O(n).

Community
  • 1
  • 1
George Eadon
  • 913
  • 5
  • 9
  • 1
    This assumes elements are unique. It gets a bit more complicated since if the kth-element is not unique, your selection breaks down. You would select any elements less than the kth-smallest, and then fill the rest of the array with the value of the kth-smallest. I believe the complexity remains the same. – Mark Peters Sep 21 '10 at 21:30
  • It gets more interesting if you're wanting to do some sort of order-preserving select (e.g., if you've really got compound values and are only comparing part of them, the key, and yet care about the payload). You can still do it in a single pass through the bulk of the data though, giving O(kn) (which tends to O(n) when k≪n). – Donal Fellows Sep 21 '10 at 21:38
  • @Donal, good point about order preserving. I will clarify that you don't care about the order, you just want the x smallest. – Dave Aaron Smith Sep 21 '10 at 21:41
8

Maintain the list of the x highest so far in sorted order in a skip-list. Iterate through the array. For each element, find where it would be inserted in the skip list (log x time). If in the interior of the list, it is one of the smallest x so far, so insert it and remove the element at the end of the list. Otherwise do nothing.

Time O(n*log(x))

Alternative implementation: maintain the collection of x highest so far in a max-heap, compare each new element with top element of the heap, and pop + insert new element only if the new element is less than the top element. Since comparison to top element is O(1) and pop/insert O(log x), this is also O(nlog(x))

Aaron
  • 692
  • 3
  • 8
  • I would probably use a [self-balancing binary search tree](http://en.wikipedia.org/wiki/Self-balancing_binary_search_tree) instead of a skip list, but otherwise, that's the way I would go. – svick Sep 21 '10 at 21:22
  • @svick: The point of the skip list is that removals from the head are O(1). Of course, the list would have to be oriented slightly from Aaron's descriptions so that the maximum value is at the head and the smallest is at the tail instead of the other way around. A removal of the maximum value in a BST would be O(log(x)) which would not change the overall complexity, but would certainly add a higher constant factor. Plus, the rebalancing schemes themselves are sometimes more complex than relinking a node in a list. However, I wonder if there is a clever way to do this with a splay tree? – Brian Gideon Sep 22 '10 at 03:25
3

Add all n numbers to a heap and delete x of them. Complexity is O((n + x) log n). Since x is obviously less than n, it's O(n log n).

Mark Peters
  • 80,126
  • 17
  • 159
  • 190
  • No need to keep all of the numbers in the heap, just the N smallest so far. Let me expand on that. Use a max heap. Add a number. If the count > N, then remove the first item from the heap. – Jim Mischel Sep 21 '10 at 21:27
  • Yes, that was covered well by @Aaron so I'll leave this answer independent of that. – Mark Peters Sep 21 '10 at 21:33
3

If the range of numbers (L) is known, you can do a modified counting sort.

given L, x, input[]
counts <- array[0..L]
for each number in input
    increment counts[number]
next

#populate the output
index <- 0
xIndex <- 0
while xIndex < x and index <= L
   if counts[index] > 0 then
       decrement counts[index]
       output[xIndex] = index
       increment xIndex
   else
       increment index
   end if
loop

This has a runtime of O(n + L) (with memory overhead of O(L)) which makes it pretty attractive if the range is small (L < n log n).

Mark Peters
  • 80,126
  • 17
  • 159
  • 190
1
def x_smallest(items, x):
    result = sorted(items[:x])
    for i in items[x:]:
        if i < result[-1]:
            result[-1] = i
            j = x - 1
            while j > 0 and result[j] < result[j-1]:
                result[j-1], result[j] = result[j], result[j-1]
                j -= 1
    return result

Worst case is O(x*n), but will typically be closer to O(n).

Mark Ransom
  • 299,747
  • 42
  • 398
  • 622
0

Psudocode:

def x_smallest(array<int> arr, int limit)
    array<int> ret = new array[limit]

    ret = {INT_MAX}

    for i in arr
        for j in range(0..limit)
            if (i < ret[j])
                ret[j] = i
            endif
        endfor
    endfor

    return ret
enddef
Mentalikryst
  • 761
  • 4
  • 6
0

In pseudo code:

y = length of list / 2

if (x > y)
  iterate and pop off the (length - x) largest
else
  iterate and pop off the x smallest

O(n/2 * x) ?

Zilverdistel
  • 1,571
  • 11
  • 10
  • Ah, but the list doesn't come sorted. The smallest integer could appear way at the end. – Dave Aaron Smith Sep 21 '10 at 21:00
  • and sorting a list of x elements is probably kind of fast. – kriss Sep 21 '10 at 21:22
  • @erickson, correct, I didn't. @zilverdistel's algorithm depends on the entire list coming presorted. – Dave Aaron Smith Sep 21 '10 at 21:48
  • @Dave Aaron Smith - Sorry, I misread your comment as being about the order of results. – erickson Sep 21 '10 at 22:18
  • @Dave: I don't think it does, but his complexity is wrong. He's just doing a slight optimization that improves the worst case. Whereas the worst case of your trivial solution is O(n * x), it is O(n * n) as x -> n. This would be O(n * n/2) as x ->n. But that 1/2 is just a constant so it shouldn't be factored into the complexity analysis. – Mark Peters Sep 22 '10 at 13:31
0
sort array
slice array 0 x

Choose the best sort algorithm and you're done: http://en.wikipedia.org/wiki/Sorting_algorithm#Comparison_of_algorithms

Giacomo
  • 11,087
  • 5
  • 25
  • 25
0

You can sort then take the first x values?

Java: with QuickSort O(n log n)

import java.util.Arrays;
import java.util.Random;

public class Main {

    public static void main(String[] args) {
        Random random = new Random(); // Random number generator
        int[] list = new int[1000];
        int lenght = 3;

        // Initialize array with positive random values
        for (int i = 0; i < list.length; i++) {
            list[i] = Math.abs(random.nextInt());
        }

        // Solution
        int[] output = findSmallest(list, lenght);

        // Display Results
        for(int x : output)
            System.out.println(x);
    }

    private static int[] findSmallest(int[] list, int lenght) {
        // A tuned quicksort
        Arrays.sort(list);
        // Send back correct lenght
        return Arrays.copyOf(list, lenght);     
    }

}

Its pretty fast.

Bernie Perez
  • 12,513
  • 13
  • 46
  • 55
0
    private static int[] x_smallest(int[] input, int x)
    {
        int[] output = new int[x];
        for (int i = 0; i < x; i++) { // O(x)
            output[i] = input[i];
        }

        for (int i = x; i < input.Length; i++) { // + O(n-x)
            int current = input[i];
            int temp;

            for (int j = 0; j < output.Length; j++) { // * O(x)
                if (current < output[j]) {
                    temp = output[j];
                    output[j] = current;
                    current = temp;
                } 
            }
        }

        return output;
    }

Looking at the complexity: O(x + (n-x) * x) -- assuming x is some constant, O(n)

PeterL
  • 1,397
  • 11
  • 15
0

What about using a splay tree? Because of the splay tree's unique approach to adaptive balancing it makes for a slick implementation of the algorithm with the added benefit of being able to enumerate the x items in order afterwards. Here is some psuedocode.

public SplayTree GetSmallest(int[] array, int x)
{
  var tree = new SplayTree();
  for (int i = 0; i < array.Length; i++)
  {
    int max = tree.GetLargest();
    if (array[i] < max || tree.Count < x)
    {
      if (tree.Count >= x)
      {
        tree.Remove(max);
      }
      tree.Add(array[i]);
    }
  }
  return tree;
}

The GetLargest and Remove operations have an amortized complexity of O(log(n)), but because the last accessed item bubbles to the top it would normally be O(1). So the space complexity is O(x) and the runtime complexity is O(n*log(x)). If the array happens to already be ordered then this algorithm would acheive its best case complexity of O(n) with either an ascending or descending ordered array. However, a very odd or peculiar ordering could result in a O(n^2) complexity. Can you guess how the array would have to be ordered for that to happen?

Brian Gideon
  • 47,849
  • 13
  • 107
  • 150
  • Interesting. I'd never heard of a splay tree. I believe you meant `if (array[i] < max or tree.Count < x)`, yeah? As your pseudo code stands, I believe the splay tree would never get more than one integer if you encountered the smallest integer first. – Dave Aaron Smith Sep 22 '10 at 21:12
0

In scala, and probably other functional languages, a no brainer:

scala> List (1, 3, 6, 4, 5, 1, 2, 9, 4)  sortWith ( _<_ ) take 5
res18: List[Int] = List(1, 1, 2, 3, 4)
user unknown
  • 35,537
  • 11
  • 75
  • 121