Get N maximum values and indices along an axis in a NumPy array

Question

I think this is an easy question for experienced numpy users.

I have a score matrix. The raw index corresponds to samples and column index corresponds to items. For example,

score_matrix = 
  [[ 1. ,  0.3,  0.4],
   [ 0.2,  0.6,  0.8],
   [ 0.1,  0.3,  0.5]]

I want to get top-M indices of items for each samples. Also I want to get top-M scores. For example,

top2_ind = 
  [[0, 2],
   [2, 1],
   [2, 1]]

top2_score = 
  [[1. , 0.4],
   [0,8, 0.6],
   [0.5, 0.3]]

What is the best way to do this using numpy?

@Kasramvd The linked dup target answer for 2D doesn't keep the order from highest to lowest as needed for this question. So, I am reopening, hope that sounds okay. — Divakar, Dec 19 '16 at 10:32
@Kasramvd Well [`np.argpartition docs`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.argpartition.html) state that to keep the order, we need to provide it a sequence of ints, which I think would be range of those indices, i.e. `range(N)` inside `np.argpartition()`. Were you using any such thing there? Sorry if I missed that! — Divakar, Dec 19 '16 at 10:40
@Divakar No, but certainly still it doesn't justify that this is not a duplicate. Also I don't think that this approach makes any significant difference with `argsort` version. — Mazdak, Dec 19 '16 at 10:51
@Kasramvd That would depend a lot on the sizes. But here's a [`runtime test`](http://stackoverflow.com/a/37036444/3293881) , where `N` is significantly smaller than the axis length. — Divakar, Dec 19 '16 at 10:54

score 4 · Answer 1 · edited May 23 '17 at 12:30

Here's an approach using np.argpartition -

idx = np.argpartition(a,range(M))[:,:-M-1:-1] # topM_ind
out = a[np.arange(a.shape[0])[:,None],idx]    # topM_score

Sample run -

In [343]: a
Out[343]: 
array([[ 1. ,  0.3,  0.4],
       [ 0.2,  0.6,  0.8],
       [ 0.1,  0.3,  0.5]])

In [344]: M = 2

In [345]: idx = np.argpartition(a,range(M))[:,:-M-1:-1]

In [346]: idx
Out[346]: 
array([[0, 2],
       [2, 1],
       [2, 1]])

In [347]: a[np.arange(a.shape[0])[:,None],idx]
Out[347]: 
array([[ 1. ,  0.4],
       [ 0.8,  0.6],
       [ 0.5,  0.3]])

Alternatively, possibly slower, but a bit shorter code to get idx would be with np.argsort -

idx = a.argsort(1)[:,:-M-1:-1]

Here's a post containing some runtime test that compares np.argsort and np.argpartition on a similar problem.

score 3 · Accepted Answer · answered Nov 13 '16 at 13:10

I'd use argsort():

top2_ind = score_matrix.argsort()[:,::-1][:,:2]

That is, produce an array which contains the indices which would sort score_matrix:

array([[1, 2, 0],
       [0, 1, 2],
       [0, 1, 2]])

Then reverse the columns with ::-1, then take the first two columns with :2:

array([[0, 2],
       [2, 1],
       [2, 1]])

Then similar but with regular np.sort() to get the values:

top2_score = np.sort(score_matrix)[:,::-1][:,:2]

Which following the same mechanics as above, gives you:

array([[ 1. ,  0.4],
       [ 0.8,  0.6],
       [ 0.5,  0.3]])

score 1 · Answer 3 · answered Sep 11 '18 at 13:27

In case someone is interested in the both the values and corresponding indices without tempering with the order, the following simple approach will be helpful. Though it could be computationally expensive if working with large data since we are using a list to store tuples of value, index.

import numpy as np
values = np.array([0.01,0.6, 0.4, 0.0, 0.1,0.7, 0.12]) # a simple array
values_indices = [] # define an empty list to store values and indices
while values.shape[0]>1:
    values_indices.append((values.max(), values.argmax()))
    # remove the maximum value from the array:
    values = np.delete(values, values.argmax())

The final output as list of tuples:

values_indices
[(0.7, 5), (0.6, 1), (0.4, 1), (0.12, 3), (0.1, 2), (0.01, 0)]

score 0 · Answer 4 · answered Feb 06 '22 at 19:05

0

Easy way would be:

To get top-2 indices

np.argsort(-score_matrix)[:, :2]

To get top-2 values

-np.sort(-score_matrix)[:, :2]

answered Feb 06 '22 at 19:05

igorkf

3,159
2
22
31

Get N maximum values and indices along an axis in a NumPy array

4 Answers4

Linked

Related