Let's take this np.array:

import numpy as np

array = np.array([[1, 4],
                  [0, 3],
                  [2, 3]])

I use this code to find the index of the first element in the first column that is larger than a threshold:

index = np.argmax(array[:, 0] > threshold)

With threshold = 1, I get the index as expected:

>>index = 2

But if I choose a threshold of 2 or larger, no element qualifies and the output is 0. This will mess up my program, because I would like to get the last index, not the first, when no element exceeds the threshold.

Is there an efficient way to get the last index of the array in that case?
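
A minimal reproduction of the ambiguity, assuming the array above (the thresholds 0 and 5 are only picked for illustration): argmax returns 0 both when the first element is a real hit and when nothing qualifies.

```python
import numpy as np

array = np.array([[1, 4],
                  [0, 3],
                  [2, 3]])

col = array[:, 0]  # first column: [1, 0, 2]

# The first element exceeds 0, so index 0 is a genuine hit.
hit_at_start = np.argmax(col > 0)

# Nothing exceeds 5: the mask is all False, yet argmax still returns 0.
no_hit = np.argmax(col > 5)

print(hit_at_start, no_hit)
```

The two results cannot be told apart without also checking the mask itself.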

EDIT:

I actually don't understand how this should help me here: Numpy: How to find first non-zero value in every column of a numpy array?

I am rather looking for something like making argmax return False instead of 0.

Comparison of solutions by @Divakar and @Achintha Ihalage

import numpy as np
import time

def first_nonzero(arr, axis, invalid_val=-1):
    mask = arr != 0
    return np.where(mask.any(axis=axis), mask.argmax(axis=axis), invalid_val)


# Note: a 50000 x 50000 float64 array needs about 20 GB of RAM; only column 0 is used below.
array = np.random.rand(50000, 50000) * 10
test = array[:, 0]
threshold = 11

t1 = time.time()
index1 = np.argmax(array[:, 0] > threshold) if any(array[:, 0] > threshold) else len(array[:, 0])-1
elapsed1 = time.time() - t1

t2 = time.time()
index2 = first_nonzero(array[:, 0] > threshold, axis=0, invalid_val=len(array[:, 0])-1)
elapsed2 = time.time() - t2

print(index1, "time: ", elapsed1)
print(index2, "time: ", elapsed2)
>>49999 time:  0.012960195541381836
>>49999 time:  0.0009734630584716797

So @Divakar's solution is super fast! Thanks a lot!
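
A lighter way to rerun the comparison above is sketched here, assuming a single column of 50000 values is enough to show the effect. The helper names with_builtin_any and with_first_nonzero are made up for this sketch, and timeit replaces the single time.time() measurement. The gap most likely comes from Python's built-in any() walking the array element by element, while mask.any() runs inside NumPy.

```python
import timeit

import numpy as np


def first_nonzero(arr, axis, invalid_val=-1):
    mask = arr != 0
    return np.where(mask.any(axis=axis), mask.argmax(axis=axis), invalid_val)


rng = np.random.default_rng(0)
col = rng.random(50_000) * 10  # values in [0, 10), so nothing exceeds 11
threshold = 11
last = len(col) - 1


def with_builtin_any():
    # One-liner style: Python's built-in any() iterates over the mask.
    mask = col > threshold
    return np.argmax(mask) if any(mask) else last


def with_first_nonzero():
    # Vectorised style: the any/argmax/where steps all run inside NumPy.
    return first_nonzero(col > threshold, axis=0, invalid_val=last)


for fn in (with_builtin_any, with_first_nonzero):
    seconds = timeit.timeit(fn, number=100) / 100
    print(fn.__name__, int(fn()), f"{seconds:.6f} s per call")
```

Both functions should report the same index. The absolute timings depend on the machine, so none are quoted here.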

Peer Breier
    For efficiency, you can do `index = mask.argmax()` // `if not mask[index]: index = special_value`. (or a `where` based ND version of this) Using the fact that `index` will point to a `False` if and only if there are no `True`s – Paul Panzer Aug 22 '19 at 13:42
    Question states - `But if I choose a value larger 2, the output is 0. This will mess up my program, because I would like to take the last value and not the first in case no element fulfills the threshold.`. With the linked one, it solves with `first_nonzero(array[:,0]>2, axis=0, invalid_val=-1)`. Or did I misunderstand something? – Divakar Aug 22 '19 at 20:51
  • @Divakar - the problem here is that it may output 0 when the threshold is hit for the first element. That is okay and I need to get that output. But it may also output 0, if no element fulfils the criterion eg >2. Using the first_nonzero function does not cover both cases. It does not allow for a 0 output. I may be wrong however and thanks for the link, very interesting. – Peer Breier Aug 23 '19 at 07:18
    @PeerBreier It does cover that, simply use `invalid_val` arg to whatever you want in case there's no hit. – Divakar Aug 23 '19 at 07:21
  • @Divakar You're right!!! It works! Sorry, I didn't get it when looking at it the first time. I post the solution with a runtime comparison in the question. – Peer Breier Aug 23 '19 at 07:46
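
Paul Panzer's comment above can be sketched for the 1-D case as follows. The names first_above and fallback are hypothetical, chosen for this example only, and the empty-array guard is an addition that the comment does not mention.

```python
import numpy as np


def first_above(values, threshold, fallback):
    """Return the index of the first element > threshold, else `fallback`."""
    mask = np.asarray(values) > threshold
    if mask.size == 0:
        return fallback  # argmax raises ValueError on an empty array
    index = int(mask.argmax())
    # argmax points at a False entry only when the mask contains no True.
    if not mask[index]:
        return fallback
    return index


array = np.array([[1, 4],
                  [0, 3],
                  [2, 3]])
col = array[:, 0]
last = len(col) - 1

print(first_above(col, 1, last))  # hit at the last element
print(first_above(col, 0, last))  # genuine hit at index 0
print(first_above(col, 5, last))  # no hit, falls back to the last index
```

This needs only one pass for argmax plus a single element lookup, so a separate any() scan is avoided.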

2 Answers

This is a slightly inefficient approach, but you can make use of numpy.argwhere:

# Check whether any element exceeds the threshold
mask = array[:, 0] > threshold
present = np.argwhere(mask)

if present.size == 0:
    # No hit: fall back to the last index
    index = len(array) - 1
else:
    index = np.argmax(mask)
Abercrombie

Try the following one-liner. You get the argmax index if at least one element satisfies the condition. Otherwise, you get the last index.

index = np.argmax(array[:, 0] > threshold) if any(array[:, 0] > threshold) else len(array[:, 0]) - 1
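
A possible variant of the one-liner, shown only as a sketch: it builds the mask once and calls the array's own any() method instead of Python's built-in any(), which is usually faster on large arrays. The threshold of 2 is just an example value.

```python
import numpy as np

array = np.array([[1, 4],
                  [0, 3],
                  [2, 3]])
threshold = 2

# Build the boolean mask once and reuse it for both the test and argmax.
mask = array[:, 0] > threshold

# mask.any() is evaluated inside NumPy; fall back to the last index if no hit.
index = int(mask.argmax()) if mask.any() else len(mask) - 1
print(index)
```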
Achintha Ihalage