20

I have got an array containing unique elements. I need to find out the first n largest elements in the array in the least complexity possible. The solution that I could think of so far has a complexity of O(n^2).

    int A[]={1,2,3,8,7,5,3,4,6};
    int max=0;
    int i,j;
    int B[4]={0,0,0,0,};//where n=4;
     for(i=0;i<A.length();i++)
       {
         if(A[i]>max)
          max=A[i];
       }
     B[0]=max;
     for(i=1;i<n;i++){
       max=0;
       for(j=0;j<A.length();j++){
         if(A[j]>max&&A[j]<B[i-1])
            max=A[j];
       }
        B[i]=max;
     }

Please, if anyone can come up with a better solution which involves less complexity, I will be highly grateful. And I don't intend to change the original array!!

duedl0r
  • 9,289
  • 3
  • 30
  • 45
Poulami
  • 1,047
  • 3
  • 13
  • 27
  • 3
    `sort` the array using your favorite `O(n log n)` algorithm and select the N biggest elements? – GWW Sep 01 '11 at 15:26
  • 2
    what are "the first n largest elements"? The `n` largest? or the first `n`? – blubb Sep 01 '11 at 15:27
  • http://en.wikipedia.org/wiki/Selection_algorithm – nibot Sep 01 '11 at 15:27
  • 1
    Your complexity is not O(n^2).. it's O(n*k) k being the number of largest elements you want. it's only O(n^2) if n = k. So, if k is small enough your algorithm is not that bad.. – duedl0r Sep 01 '11 at 15:37
  • go to profile, select one of questions and then need to checkmark one of answers that best answers your question. do that with all of your questions. – fazo Sep 01 '11 at 15:38

8 Answers8

38

Find the kth biggest element, using selection algorithm.
Next, iterate the array and find all elements which are larger/equal it.

complexity: O(n) for selection and O(n) for iterating, so the total is also O(n)

amit
  • 175,853
  • 27
  • 231
  • 333
  • 1
    But if the array size is `n` and you also need to find the `n` largest elements, then this will run `n` times, each is O(n), so in total it will be O(n^2)... It would be better to sort it. – shinzou May 22 '16 at 14:29
  • 3
    @kuhaku No. You find the `n`th largest element in `O(n)`, let it be `x`, and then in a single more iteration find all elements with index `i` such that `arr[i] >= x`. This is also done in `O(n)`, and it yields total of `O(n + n) = O(n)`. – amit May 22 '16 at 16:04
  • How in a single iteration do you find all the others? Isn't it a loop from 1 to n where in each iteration you do `selection(n-i)`? – shinzou May 22 '16 at 21:50
  • 2
    @kuhaku You don't you found one element using selection, and you know its value. You then only iterate to find all elements bigger than it, no selection is needed (Note that the output is not sorted, you get elements in undefined order, which are the biggest elements, but their internal order is undefined) – amit May 23 '16 at 16:25
  • Sorry to dig up this question, but wouldn't it be faster to do just 1 pass through the array with a subset of k maximums? For each new value you compare with the min of the subset, if it's bigger you store it and search the new subset min, if it's not you move on. If I'm right that would be a best case of `k*k+(n-k)` iterations and a worst case of `k*n`iterations – Florent Aug 31 '17 at 00:13
  • 1
    @FlorentEcochard I am not sure I understand your approach, but `k*k + (n-k) = (k-1)*k + k + n-k = (k-1)*k + n`, and this is `O(k^2 + n)`, which is worse than `O(n)` especially when `k->n`, where you then get `O(n^2)` time, compared to `O(n)` using selection. – amit Aug 31 '17 at 07:31
  • Well my approach is to go through the array only once, and keep a subset of `k` maximums, as well as the minimum value of this subset (let's call it `m`). For each new sample of the array, you compare it with `m` and if it's bigger you replace it. You now have to find the new `m` of your subset. If we set `k=10` and `n=256`, that gives us a best case of `(k-1)*k + n = 346` and a worst of `k*n = 2560` iterations. I'm not familiar with QuickSelect but wikipedia says it has a WC of `O(n^2)`, and you also have to go through the whole array a second time in your method which adds 256 iterations – Florent Aug 31 '17 at 07:50
  • @FlorentEcochard (1) The worst case of quickselect is highly unlikely, and the average case is O(n) for it. (2) If you are afraid of it, you can use median of medians technique to get O(n) worst case performance. (3) I think you might be referring to a naive variant of the heap solution suggested [here](https://stackoverflow.com/a/10751982/572670) – amit Aug 31 '17 at 10:21
13

The usual trick to select the n largest elements is to maintain a min-priority queue.

  • Unconditionnally insert into the queue the n first elements
  • For each remaining element x, insert x if it is greater than the least element of the queue (O(log n) operation), and remove the least element (O(log n)).
  • When done, the priority queue contains n elements, which are the n largest elements of the original array.

Total complexity: O(N log n) where N is the total number of elements in the array.

I leave to you as an exercise the implementation details (first step is to learn about priority queues, and implement one).

Alexandre C.
  • 55,948
  • 11
  • 128
  • 197
  • 3
    Note that although this is slower than @amit's QuickSelect based solution, it's an online algorithm applicable to an unbounded input stream – Yibo Yang Dec 15 '15 at 16:46
  • @YiboYang Yes, and even in the bounded case, everything will ultimately depends on whether log n can be considered small or not. – Alexandre C. Dec 15 '15 at 16:52
  • 2
    Use this solution if you need the results sorted. Otherwise use Amit's solution. – Barry Fruitman Oct 17 '16 at 21:21
  • 2
    @BarryFruitman : also use my version if you are only allowed one pass through the data, or if it does not fit in memory. – Alexandre C. Oct 17 '16 at 23:38
4

You can do this in O(n) if your elements are integers (or any integral type) within a range, i to k inclusive with k >= i. With this constraint, you can apply "bucket sort" to this.

The idea is quite simple. Allocate k - i + 1 buckets. Now, iterate through your collection and increment the bucket for that integer. Then, at the end, you can "recreate" the sorted list by creating as many integers that were found (i.e. the bucket number).

For example,

int collection[] = { 10, 4, 7, 1, 9, 0, 12 }; // maximum value to expect is 12, minimum is 0
int buckets[ 13 ] = { 0 };

for( int i = 0; i < 13; i++ )
{
      int n = collection[ i ];
      buckets[ n ]++;
}


// the first n largest elements (n = 4)

for( int j = 12; j >= 12 - 4; j-- )
{
      int n = buckets[ j ];

      while( n > 0 )
      {
           printf( "%d ", j );
           n--;
      }
}
printf( "\n" ); 
Man Vs Code
  • 1,058
  • 10
  • 14
  • please give answer this question https://stackoverflow.com/questions/68721689/how-to-find-print-n-largest-element-of-array – S.G Aug 13 '21 at 10:41
1

You can use a Priority Queue using Heap (maxHeap) to solve this. Perform heap n times to get the first n largest elements. Each Heap operation takes O(log N) time, so N heap operations would result in O(N log N) time.

harun_a
  • 620
  • 1
  • 7
  • 20
  • Something wrong with the complexity analysis here (specifically: incorrectly using the same variable N for size of input and size of output) – Ben Voigt Jul 21 '23 at 14:25
1

Use a modified version of Quick Sort. You do not need to actually sort the whole array. You only need to partition N elements larger than the pivot value. For more information, please read Introduction to Algorithms.

Nam Nguyen
  • 1,765
  • 9
  • 13
  • I want to do it without using any sorting algorithms!! – Poulami Sep 01 '11 at 15:33
  • 3
    Is this a homework question? That seems like a strangely specific requirement. – Vanessa MacDougal Sep 01 '11 at 15:34
  • 2
    @Poulami: Don't you think that's a detail worth mentioning in the question? – Praetorian Sep 01 '11 at 15:46
  • Aside from @Poulami's request, I don't think this would actually work - the choice of the pivot value for quicksort is often random. To make a non-random guess, you would need some information about the order or size of elements in advance, which you don't have in this case. – Apoorv Khatreja Apr 06 '22 at 23:02
0

I don't believe on this but you could also create a heap out of it in O(n). And then just remove the root k number of times and heapify the heap for k largest numbers. In this way for each largest numbers it will cost you log(n).

public class HeapSort1{                                                          
    public static void main(String args[]){                                  
            int[] array={5,75,1,5,4,1,2,4,8,4,2,15,4,2,1,5,779,9,1};         
            int heapsize=array.length-1;                                     
            for(int i=heapsize/2;i>=0;i--){                                  
                    maxHeapify(array,i,heapsize);                            
            }                                                                
            for(int i=heapsize;i>0;i--){                                     
                    array[i]=array[0]+array[i];                              
                    array[0]=array[i]-array[0];                              
                    array[i]=array[i]-array[0];                              
                    maxHeapify(array,0,--heapsize);                          
            }                                                                
            printArray(array);                                               
    }                                                                        
    public static void maxHeapify(int[] array,int i,int heapsize){           
            int largest=i;                                                   
            int left=2*i+1;                                                  
            int right=2*i+2;                                                 
            if(left<=heapsize && array[left]>array[i]){                      
                    largest=left;                                            
            }                                                                
            if(right<=heapsize && array[right]>array[largest]){              
                    largest=right;                                           
            }                                                                
            if(largest!=i){                                                  
                    array[i]=array[largest]+array[i];                        
                    array[largest]=array[i]-array[largest];                  
                    array[i]=array[i]-array[largest];                        
                    maxHeapify(array,largest,heapsize);                      
            }                                                                
    }                                                                        
    public static void printArray(int[] array){                              
            System.out.print("\n [");                                        
            for(int i=0;i<array.length;i++){                                 
                    System.out.print(array[i]+" ");                          
            }                                                                
            System.out.print("] \n");                                        
    }  
    public static int getMax(){
            int max=array[0];
            array[0]=array[heapsize];
            maxHeapify(array,0,--heapsize);
    }

 }                                                                                                                                                             
Yashdeep Hinge
  • 382
  • 2
  • 14
0

I tried this as per @Alexandre C.

This gets the top 10 items of a unbounded input. It breaks after it processed 20 items from the input.

import random
import time
top_10_items = []
cnt = 1
while True:
    rand = random.randint(1,100)
    print(rand)

    time.sleep(1)
    if len(top_10_items) !=10:
        top_10_items.append(rand)
    else:
        m = min(top_10_items)
        if rand > m:
            top_10_items.append(rand)
            top_10_items.remove(m)

    print(top_10_items)

    cnt+=1
    if cnt==20:
        break
user1050619
  • 19,822
  • 85
  • 237
  • 413
-5
//finding the bigest number in the array//

double big = x[0];
for(t=0;t<x[t];t++)
{
    if(x[t]>big)
    {
        big=x[t];
    }
}
printf("\nThe bigest number is    %0.2lf  \n",big);
Jesus Ramos
  • 22,940
  • 10
  • 58
  • 88