Python/Numpy: get top k largest values in a 2D matrix as a mask

Question

Let's say I have a 3x3 matrix like this:

array([[8, 6, 3],
       [6, 7, 2],
       [0, 8, 9]])

Now I want to get the top k largest values in the matrix and create a mask from them: an element is 1 if its value is among the top k largest, and 0 otherwise. Let k=2. In the example above there is one 9 and two 8s; we need to take all of them, so the returned mask looks like this:

array([[1, 0, 0],
       [0, 0, 0],
       [0, 1, 1]])

I have read this and that answer, and I can use the indices as the mask. However, I wonder if there is any better solution?

Better in what terms? Performance, readability, length of code? — MrPisarik, Apr 25 '21 at 09:57
`np.argpartition` is a great solution and I doubt you can find something much faster or much shorter (at least not without more provided information). — Jérôme Richard, Apr 25 '21 at 10:18
the problem with argpartition is how to handle duplicate values. See my own [post](https://stackoverflow.com/a/67253650/758174) on the linked question. — Pierre D, Apr 25 '21 at 12:58

score 1 · Accepted Answer · answered Apr 25 '21 at 12:16

How about this? Here k counts distinct values, so every element tied with one of the k largest values is included:

import numpy as np

def is_topk(a, k=1):
    # rix gives each element's rank among the distinct values (0 = largest)
    _, rix = np.unique(-a, return_inverse=True)
    return np.where(rix < k, 1, 0).reshape(a.shape)
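To see why this works, here is a small sketch of the intermediate values for the array from the question. The names `values` and `rix` are only for illustration. The explicit reshape is there because, as far as I know, NumPy versions differ on whether the inverse comes back flat or in the input's shape.

```python
import numpy as np

a = np.array([[8, 6, 3],
              [6, 7, 2],
              [0, 8, 9]])

# Negating the array makes the largest value sort first in np.unique.
values, rix = np.unique(-a, return_inverse=True)

# Reshape so the ranks line up with the original matrix,
# whatever shape the inverse was returned in.
rix = rix.reshape(a.shape)

print(values)   # distinct values of -a in ascending order
print(rix)      # rank of each element among the distinct values, 0 = largest
print(rix < 2)  # True where the element is one of the 2 largest distinct values
```

Since the rank is computed over distinct values, both 8s get rank 1 and end up in the mask for k=2.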

Example on your array:

>>> is_topk(a, 1)
array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 1]])

>>> is_topk(a, 2)
array([[1, 0, 0],
       [0, 0, 0],
       [0, 1, 1]])
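If you would rather avoid negating the array (negation wraps around for unsigned integer dtypes), one possible variant is to look up the k-th largest distinct value and compare against it. This is only a sketch; the function name `topk_mask_by_threshold` and the handling of `k <= 0` and of `k` larger than the number of distinct values are my own assumptions, not something from the question.

```python
import numpy as np

def topk_mask_by_threshold(a, k=1):
    """Return a 0/1 mask marking entries among the k largest distinct values."""
    a = np.asarray(a)
    distinct = np.unique(a)        # sorted ascending, duplicates removed
    k = min(k, distinct.size)      # assumption: clamp k to what is available
    if k <= 0:
        # assumption: nothing is selected for k <= 0 or an empty input
        return np.zeros(a.shape, dtype=int)
    threshold = distinct[-k]       # the k-th largest distinct value
    return (a >= threshold).astype(int)

a = np.array([[8, 6, 3],
              [6, 7, 2],
              [0, 8, 9]])

print(topk_mask_by_threshold(a, 1))
print(topk_mask_by_threshold(a, 2))
```

Both versions sort the whole array inside `np.unique`, so I would not expect a big performance difference between them; the comparison form is mainly a bit easier to read.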
