23

Is there a built-in function or a very simple way of finding the index of n largest elements in a list or a numpy array?

K = [1,2,2,4,5,5,6,10]

Find the index of the largest 5 elements?

I count the duplicates more than once, and the output should be a list of the indices of those largest numbers

Logan Yang
  • 2,364
  • 6
  • 27
  • 43

5 Answers5

42

Maybe something like:

>>> K
[4, 5, 1, 6, 2, 5, 2, 10]
>>> sorted(range(len(K)), key=lambda x: K[x])
[2, 4, 6, 0, 1, 5, 3, 7]
>>> sorted(range(len(K)), key=lambda x: K[x])[-5:]
[0, 1, 5, 3, 7]

or using numpy, you can use argsort:

>>> np.argsort(K)[-5:]
array([0, 1, 5, 3, 7])

argsort is also a method:

>>> K = np.array(K)
>>> K.argsort()[-5:]
array([0, 1, 5, 3, 7])
>>> K[K.argsort()[-5:]]
array([ 4,  5,  5,  6, 10])
DSM
  • 342,061
  • 65
  • 592
  • 494
  • 3
    Alternatively `import heapq;heapq.nlargest(n, range(len(K)), key=lambda x: K[x])` – Makers_F Nov 30 '15 at 07:07
  • This is not optimal to sort it for very big list since it may take O(n^2), while you're supposed to solved this in a linear time. – L.DZ Apr 27 '22 at 07:56
4

Consider the following code,

 N=5
 K = [1,10,2,4,5,5,6,2]
 #store list in tmp to retrieve index
 tmp=list(K)
 #sort list so that largest elements are on the far right
 K.sort()
 #To get the 5 largest elements
 print K[-N:]
 #To get the 5th largest element
 print K[-N]
 #get index of the 5th largest element
 print tmp.index(K[-N])

If you wish to ignore duplicates, then use set() as follows,

 N=5
 K = [1,10,2,4,5,5,6,2]
 #store list in tmp to retrieve index
 tmp=list(K)
 #sort list so that largest elements are on the far right
 K.sort()
 #Putting the list to a set removes duplicates
 K=set(K)
 #change K back to list since set does not support indexing
 K=list(K)
 #To get the 5 largest elements
 print K[-N:]
 #To get the 5th largest element
 print K[-N]
 #get index of the 5th largest element
 print tmp.index(K[-N])

Hopefully one of them covers your question :)

IssamLaradji
  • 6,637
  • 8
  • 43
  • 68
1

This should work:

K = [1,2,2,4,5,5,6,10]
num = 5
print 'K %s.' % (sorted(K, reverse=True)[:num])
cforbish
  • 8,567
  • 3
  • 28
  • 32
1

For efficiency, numpy partition is much more efficient for large array. There is no need to sort fully.

"All elements smaller than the k-th element are moved before this element and all equal or greater are moved behind it. The ordering of the elements in the two partitions is undefined."

cosine
  • 31
  • 3
  • I was about to add this as an answer. What I think @cosine means is to use [np.argpartition](https://numpy.org/doc/stable/reference/generated/numpy.argpartition.html) rather than np.argsort (as in @DSM's answer). So `np.argpartition(K,-5)[-5:]` returns [3, 4, 5, 6, 7], the *indexes* of the five largest values. Using those indexes on K, `np.array(K)[ np.argpartition(K,-5)[-5:] ]` returns [ 4, 5, 5, 6, 10], the actual 5 largest values (which you could also get directly using `np.partition(K,-5)[-5:]`). – gnoodle Mar 08 '23 at 12:18
0

import headq

Then use function nlargest()

Dhawal Kapil
  • 2,584
  • 18
  • 31
JimmyC
  • 21
  • 1
    Dear JimmyC, You're right, this is the best way to do it in python. Yet your answer received downvotes. I suppose it's because there are a few ways in which your answer could be improved: (1) You wrote `headq` instead of `heapq`. (2) Please write full English sentences. (3) Please give an example use. (4) You could also link to the documentation: https://docs.python.org/3/library/heapq.html#heapq.nlargest – Stef Jan 31 '22 at 10:21