numpy matrix, setting 0 to values by sorting each row

Question

I have a matrix with many rows and 8 columns. Each cell holds the probability that the row belongs to one of the 8 classes. I would like to keep only the 2 highest values in each row and set the rest to 0.

So far, the only way I can think of is to loop over the rows and sort each one separately. For example, given:

a = np.array([[ 0.2  ,  0.1  ,  0.02 ,  0.01 ,  0.031,  0.11 ],
              [ 0.5  ,  0.1  ,  0.02 ,  0.01 ,  0.031,  0.11 ],
              [ 0.2  ,  0.1  ,  0.22 ,  0.15 ,  0.031,  0.11 ]])

I would like to get:

array([[ 0.2 ,  0.  ,  0.  ,  0.  ,  0.  ,  0.11],
       [ 0.5 ,  0.  ,  0.  ,  0.  ,  0.  ,  0.11],
       [ 0.2 ,  0.  ,  0.22,  0.  ,  0.  ,  0.  ]])

Thanks,

Divakar · Accepted Answer · 2016-02-15T18:24:38.257

Here's one vectorized approach with np.argpartition -

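m,n = a.shape
# argpartition puts the indices of the n-2 smallest entries of each row first;
# zero those positions in place, leaving the 2 largest untouched
a[np.arange(m)[:,None],np.argpartition(a,n-2,axis=1)[:,:-2]] = 0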

Sample run -

In [570]: a
Out[570]: 
array([[ 0.94791114,  0.48438182,  0.54574317,  0.45481231,  0.94013836],
       [ 0.03861196,  0.99047316,  0.7897759 ,  0.38863967,  0.93659426],
       [ 0.49436676,  0.93762758,  0.33694977,  0.45701655,  0.73078113],
       [ 0.21240062,  0.85141765,  0.00815352,  0.52517721,  0.49752736]])

In [571]: m,n = a.shape
     ...: a[np.arange(m)[:,None],np.argpartition(a,n-2,axis=1)[:,:-2]] = 0
     ...: 

In [572]: a
Out[572]: 
array([[ 0.94791114,  0.        ,  0.        ,  0.        ,  0.94013836],
       [ 0.        ,  0.99047316,  0.        ,  0.        ,  0.93659426],
       [ 0.        ,  0.93762758,  0.        ,  0.        ,  0.73078113],
       [ 0.        ,  0.85141765,  0.        ,  0.52517721,  0.        ]])
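
If you would rather not overwrite a, the same idea can be wrapped in a function that works on a copy. This is only a sketch: keep_top_k and its k parameter are names made up here for illustration, not part of NumPy, and it assumes a 2D array with 1 <= k <= number of columns. The input is the array from the question.

```python
import numpy as np

def keep_top_k(a, k=2):
    """Return a copy of 2D array `a` with all but the k largest values per row set to 0."""
    a = np.asarray(a)
    m, n = a.shape
    out = a.copy()
    # Column indices of the n-k smallest entries in each row (order within them is arbitrary).
    drop = np.argpartition(a, n - k, axis=1)[:, :n - k]
    # Pair each row index with its columns to drop, then zero those cells.
    out[np.arange(m)[:, None], drop] = 0
    return out

a = np.array([[0.2, 0.1, 0.02, 0.01, 0.031, 0.11],
              [0.5, 0.1, 0.02, 0.01, 0.031, 0.11],
              [0.2, 0.1, 0.22, 0.15, 0.031, 0.11]])
original = a.copy()
out = keep_top_k(a, k=2)
print(out)
```

Because the positions come from argpartition rather than a value comparison, exactly k entries survive per row; if values tie, which of the tied entries is kept is not specified.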

Lisa · Answer 2 · 2016-02-15T19:05:11.530

This should work; however, it alters a in place. Is this what you want? Is it essential to avoid loops?

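sorted = np.sort(a, axis=1)  # each row in ascending order (note: this name shadows the built-in sorted)

for idx, row in enumerate(a):
    # row is a view into a, so this zeroes everything below the row's second-largest value
    row[row < sorted[idx,-2]] = 0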

Or, without the loop, you could compare against the per-row threshold directly:

a[a < sorted[:,None,-2]] = 0
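
One thing to watch with the threshold comparison: when several entries in a row tie with the second-largest value, none of them is strictly below the threshold, so more than two values survive. A small made-up array shows this (row_sorted, threshold and kept are illustrative names only, and np.where is used here so the input is left untouched):

```python
import numpy as np

a = np.array([[0.3, 0.3, 0.3, 0.1],
              [0.5, 0.2, 0.2, 0.1]])

row_sorted = np.sort(a, axis=1)
# Second-largest value of each row, as a column so it broadcasts per row.
threshold = row_sorted[:, -2][:, None]
# Zero everything strictly below the threshold; ties with it are kept.
out = np.where(a < threshold, 0, a)

# Number of surviving (non-zero) entries per row.
kept = np.count_nonzero(out, axis=1)
print(kept)
```

Both rows keep three values here rather than two. If exactly two must survive, an index-based selection such as np.argpartition avoids this.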

edited Feb 15 '16 at 19:05

answered Feb 15 '16 at 18:17

Think you need `a < sorted[:,None,-2]` instead to keep the `2D` shape. Thus, it would be simply `a[a < sorted[:,None,-2]] = 0`, ignoring the tie cases. – Divakar Feb 15 '16 at 18:57
@Divakar I have to admit, I strangely didn't know about indexing with `None`, that's *very* useful knowledge - thanks! – Lisa Feb 15 '16 at 19:05
Well I am addicted to it! Hope you get to use it more often :) – Divakar Feb 15 '16 at 19:07
Certainly will do :) – Lisa Feb 15 '16 at 19:07
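
For anyone else reading along, here is a quick look at what the `None` index changes. The array is made up for illustration and the names flat, col and mask are arbitrary; only the shapes matter.

```python
import numpy as np

a = np.array([[3., 1., 2.],
              [6., 5., 4.]])
row_sorted = np.sort(a, axis=1)

flat = row_sorted[:, -2]        # second-largest per row, shape (2,)
col = row_sorted[:, None, -2]   # same values as a column, shape (2, 1)

# The (2, 1) column broadcasts against the (2, 3) array row by row.
mask = a < col
print(flat.shape, col.shape)
print(mask)
```

Comparing against flat directly would try to line its 2 entries up with the 3 columns of a, which does not broadcast for this shape.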
