Hi, I have a NumPy array, for example:

arr = np.random.rand(4,5)

array([[0.70733982, 0.1770464 , 0.55588376, 0.8810145 , 0.43711158],
       [0.22056565, 0.0193138 , 0.89995761, 0.75157581, 0.21073093],
       [0.22333035, 0.92795789, 0.3903581 , 0.41225472, 0.74992639],
       [0.92328687, 0.20438876, 0.63975818, 0.6179422 , 0.40596821]])

I need to find the three largest elements in each row of the array. I tried:

arr[[-arr.argsort(axis=-1)[:, :3]]]

I also referred to this question on Stack Overflow, but it only gives the indices, not the values.

I was able to get the indices of the three largest values, but how do I get the corresponding values?

I also tried sorting the array after converting it to a list, as described here.

But that didn't give me the required result. Any ideas?

Fasty

2 Answers


You can directly use np.sort():

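# np.sort sorts in ascending order,
# so negate arr, sort, then negate the result to get descending order
arr_sorted = -np.sort(-arr, axis=1)
# the first three columns now hold the three largest values of each row
top_three = arr_sorted[:, :3]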
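
Since the question also asks how to go from indices to values, here is a minimal sketch on a small, made-up array (the values are illustrative, not from the question). It uses the same negation trick and, as one possible way to map indices back to values, np.take_along_axis; the variable names are my own.

```python
import numpy as np

# Small fixed array so the result is easy to check by eye (illustrative values).
arr = np.array([[0.7, 0.1, 0.5, 0.8, 0.4],
                [0.2, 0.0, 0.9, 0.7, 0.3]])

# Negate, sort ascending, negate back: each row ends up in descending order.
top_three = -np.sort(-arr, axis=1)[:, :3]

# Same idea with indices: argsort of the negated array lists the column
# positions of each row from largest value to smallest.
top_idx = np.argsort(-arr, axis=1)[:, :3]

# take_along_axis picks, for each row, the values at those column positions.
top_vals = np.take_along_axis(arr, top_idx, axis=1)

print(top_three)  # rows: [0.8 0.7 0.5] and [0.9 0.7 0.3]
print(top_idx)    # rows: [3 0 2] and [2 3 4]
```

Here top_vals equals top_three, so the index route and the direct sort agree.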
Nakor
  • Sorry, I need the first 3 values from each row of the NumPy array – Fasty Jul 12 '19 at 08:44
  • Just wanted to know: what does '-' do here? And is there a way to do this if it's a list of lists? – Fasty Jul 12 '19 at 08:47
  • By default, np.sort sorts in ascending order. By negating the array before sorting, you get descending order, but with all the numbers negated, so you need to flip the sign back afterwards. If you have a list of lists, you could just convert it to a NumPy array and do the same – Nakor Jul 12 '19 at 08:49
  • You can avoid the negative sign with just `top_three = np.sort(arr, axis=1)[:, -3:]` (this returns the three largest values in ascending order) – lightalchemist Jul 12 '19 at 08:52
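
To make the two comment suggestions above concrete, here is a small sketch assuming a hypothetical list of lists called rows, with rows of equal length so that the conversion yields a regular 2-D array. It also shows that the slice-based variant returns the top three in ascending order unless the columns are reversed.

```python
import numpy as np

# Hypothetical list-of-lists input; rows must have equal length so that
# np.asarray produces a regular 2-D array.
rows = [[3, 9, 1, 7],
        [8, 2, 6, 4]]

arr = np.asarray(rows)

# Ascending sort, then the last three columns: the three largest values,
# with the smallest of the three first.
top_three_ascending = np.sort(arr, axis=1)[:, -3:]

# Reversing the columns gives the same descending order as the negation approach.
top_three_descending = top_three_ascending[:, ::-1]

# Convert back to plain Python lists if that is what the caller expects.
result = top_three_descending.tolist()
print(result)  # [[9, 7, 3], [8, 6, 4]]
```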

This question already has a valid accepted answer, but I just wanted to point out that using np.partition instead of np.sort will be much faster for a large array. np.partition only guarantees that the three largest values of each row end up in the last three columns, in no particular order, so we do still use np.sort, but only on the small subset of the array that makes up our row-wise top threes.

arr = np.random.random((10000, 10000))
top_three_fast = np.sort(np.partition(arr, -3)[:, -3:])[:, ::-1]

Timings:

In [22]: %timeit top_three_fast = np.sort(np.partition(arr, -3)[:, -3:])[:, ::-1]
1.04 s ± 8.43 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [23]: %timeit top_three_slow = -np.sort(-arr, axis=1)[:, :3]
6.22 s ± 111 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [24]: (top_three_slow == top_three_fast).all()
Out[24]: True
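
If the indices are needed as well as the values, the same partition idea can be sketched with np.argpartition. This is only an illustration under my own assumptions: the array shape, the seed, and names such as part_idx and top_idx are arbitrary and not from the timings above.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility
arr = rng.random((6, 8))
k = 3

# argpartition places the indices of the k largest values of each row in the
# last k columns, in no guaranteed order.
part_idx = np.argpartition(arr, -k, axis=1)[:, -k:]
part_vals = np.take_along_axis(arr, part_idx, axis=1)

# Order only those k candidates per row, largest first.
order = np.argsort(-part_vals, axis=1)
top_idx = np.take_along_axis(part_idx, order, axis=1)
top_vals = np.take_along_axis(part_vals, order, axis=1)

# Cross-check against a full sort of every row.
expected = -np.sort(-arr, axis=1)[:, :k]
assert np.array_equal(top_vals, expected)
```

The full sort is kept here only as a check; in real use it would be dropped, since avoiding it is the whole point of partitioning.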
sjw