4

I'm trying to find the smallest non-zero value in each row of a 2D NumPy array but haven't been able to find an elegant solution. I've looked at some other posts, but none address exactly this problem, e.g. "Minimum value in 2d array" or "Min/Max excluding zeros but in 1d array".
For example, given the array:

x = np.array([[3., 2., 0., 1., 6.], [8., 4., 5., 0., 6.], [0., 7., 2., 5., 0.]])

the answer would be:

[1., 4., 2.]
Usman Tariq
  • 1
    Does this answer your question? [Find min/max of numpy array excluding zeros along an axis](https://stackoverflow.com/questions/49389804/find-min-max-of-numpy-array-excluding-zeros-along-an-axis) – mkrieger1 Aug 01 '20 at 20:34

4 Answers

6

One way to do this is to replace every value that is not positive (here, the zeros) with np.inf, then take the minimum of each row:

np.where(x>0, x, np.inf).min(axis=1)

Output:

array([1., 4., 2.])
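
A small variant, in case the array could also hold negative numbers and only exact zeros should be skipped: the condition `x > 0` hides negatives as well, so the sketch below uses `x != 0` instead. This is an assumption about the intended behaviour rather than part of the original answer; the name `row_min_nonzero` is illustrative only. Note that a row made up entirely of zeros would yield `inf`.

```python
import numpy as np

x = np.array([[3., 2., 0., 1., 6.],
              [8., 4., 5., 0., 6.],
              [0., 7., 2., 5., 0.]])

# Replace only exact zeros with +inf in a temporary array, then reduce per row.
row_min_nonzero = np.where(x != 0, x, np.inf).min(axis=1)

# np.where returns a new array, so x itself still contains its zeros.
x_still_has_zeros = bool((x == 0).any())

print(row_min_nonzero)
print(x_still_has_zeros)
```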
Scott Boston
  • @ScottBoston, your response is so awesome!!! Loved it. Didn't know np could do this and the best part is that `x` is still intact. – Joe Ferndz Aug 04 '20 at 04:09
4

Masked arrays are designed for exactly this kind of purpose. You can mask the zeros in the array (or apply any other mask you like) and then do most of what you would do with a regular array on the masked array:

import numpy.ma as ma
mx = ma.masked_array(x, mask=x==0)
mx.min(1)

output:

[1.0 4.0 2.0]
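
One thing the masked-array route makes easy is handling a row with no non-zero values at all. The sketch below is not from the original answer: it uses `ma.masked_equal` as shorthand for building the mask, adds a hypothetical all-zero row to the example data, and converts the masked result back to a plain array with `filled`, using NaN as an assumed placeholder for "no non-zero value".

```python
import numpy as np
import numpy.ma as ma

x = np.array([[3., 2., 0., 1., 6.],
              [0., 0., 0., 0., 0.],   # hypothetical all-zero row
              [0., 7., 2., 5., 0.]])

# Mask every element equal to 0, then take the per-row minimum.
mx = ma.masked_equal(x, 0)
row_min = mx.min(axis=1)          # masked array; the all-zero row stays masked

# Convert to a regular ndarray, marking fully masked rows with NaN.
result = row_min.filled(np.nan)
print(result)
```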
Ehsan
1
import numpy as np

# example data
x = np.array([[3., 2., 0., 1., 6.], [8., 4., 5., 0., 6.], [0., 7., 2., 5., 0.]])

# Set every element of the matrix that equals 0 to np.inf (positive infinity),
# which compares greater than any finite number.
# Note: this modifies x in place.
x[x==0] = np.inf

# Take the lowest value in each row (the former zeros are now inf, so they cannot be the minimum)
np.min(x, axis=1)
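
If overwriting `x` is not acceptable, a possible alternative (not part of the original answer) is to let the reduction itself skip the zeros through the `where` and `initial` arguments of `np.min`. This sketch assumes NumPy 1.17 or newer, where those arguments are available; `initial=np.inf` is what a row with no non-zero values would return.

```python
import numpy as np

x = np.array([[3., 2., 0., 1., 6.],
              [8., 4., 5., 0., 6.],
              [0., 7., 2., 5., 0.]])

# Only elements where the condition is True take part in the reduction;
# `initial` is required with `where` and acts as the starting value.
row_min = np.min(x, axis=1, where=(x != 0), initial=np.inf)

print(row_min)
print(x[0, 2])  # x is untouched, the zero is still there
```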
Dieter
  • This would be a better answer if you explained how the code you provided answers the question. – pppery Aug 01 '20 at 21:43
  • I get the logic but not quite sure how efficient this would be. Also, I would have to create a new copy of x in case I don't want to replace zeros with inf. – Usman Tariq Aug 02 '20 at 00:47
  • 1
    @UsmanTariq - it takes 497 ms on this matrix: *x = np.random.randint(low=0, high=5, size=(10**7, 5))*, so I think it will work fine :) *np.where* is a little slower at 793 ms, and *ma.masked_array* is the fastest at 121 ms – Dieter Aug 02 '20 at 09:56
1

I solved it this way; its time complexity is O(n^2).

import numpy as np
x = np.array([[3., 2., 0., 1., 6.], [8., 4., 5., 0., 6.], [0., 7., 2., 5., 0.]])

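for i in range(len(x)):
    # start from +inf so the first non-zero value always replaces it
    # (x[i][i] could itself be 0, or be out of range when rows > columns)
    small = np.inf
    for j in x[i]:
        if j != 0 and j < small:
            small = j
    print(small)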
mo1ein
  • 2
    The effort was there, but it will be very slow. If you're working with vectors (arrays), try to avoid loops as much as possible. – Dieter Aug 01 '20 at 21:26
  • @Dieter yeah that's right. Our main logic is the same! but I wrote the easiest way that came to my mind. – mo1ein Aug 01 '20 at 22:59
  • Why do you think your solution is O(n^2)? Using two for loops does not by itself make it n^2. – Dieter Aug 01 '20 at 23:39
  • Makes sense. But this is too much boilerplate code that I want to avoid and do it in a more pythonic way. Also, what are you assuming n to be when you say the complexity is n^2? – Usman Tariq Aug 02 '20 at 00:45
  • Yes that's n^2, I made a mistake. I know this way is not fast, but I thought it might be helpful. – mo1ein Aug 02 '20 at 01:12