First occurrence in numpy logics

Question

Let's say I have a numpy.ndarray:

a = np.array([0,4,10,0,11,10])

I compared this with 10.

a >= 10
# array([False, False,  True, False,  True,  True], dtype=bool)

I would like to have a single True, i.e. True only at the first occurrence.

I would like to apply this along a given axis of an n-D numpy.ndarray (say, 1000*1000*10).

a_2d = np.array([[0,4,10],[0,11,10]])
#if axis == 1: array([[False, False, True], [False, True, False]])

What I have done:

For a 1-D array, I managed to do it like this:

b=np.zeros(a.size)
b[np.argmax(a>=10)]=True
#b=array([ 0.,  0.,  1.,  0.,  0.,  0.])

However, I have no idea how to apply this to a large n-D array.

Daniel F · Accepted Answer · 2017-03-30T13:02:08.347

This one should work with no for loops, for 1D or 2D:

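def firstByRow(a, f = lambda x: x >= 10):
    # True from the first match up to (not including) the second one
    b = (np.cumsum(f(a), axis = -1) == 1).T
    # Keep only positions where the condition switches from False to True
    b[1:] = b[1:] * np.equal(b[1:], np.diff((f(a)).astype(int), axis = -1).T)
    return b.T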

I'm not sure it would be faster than slightly loopier code, though, since it computes both a cumsum and a diff.
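The cumsum idea can also be written without the diff step. The following is a minimal sketch, not the answer's own code: `first_true` is a hypothetical helper name, and it assumes the input is (or can be converted to) a boolean array. It works along any axis of an n-D array, at the cost of one integer temporary the size of the input.

```python
import numpy as np

def first_true(mask, axis=-1):
    """Keep only the first True along `axis` of a boolean array (hypothetical helper)."""
    mask = np.asarray(mask, dtype=bool)
    # cumsum counts the True values seen so far along `axis`. The count is 1
    # from the first True up to (not including) the second True, so AND-ing
    # with the mask itself leaves only the first True.
    return mask & (np.cumsum(mask, axis=axis) == 1)

a_2d = np.array([[0, 4, 10], [0, 11, 10]])
print(first_true(a_2d >= 10, axis=1))
# [[False False  True]
#  [False  True False]]
```

Slices that contain no True at all simply stay all False, because the running count never reaches 1 there.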

EDIT:

You can also do this, which is probably faster (leveraging that np.unique(return_index = True) picks the first occurrence):

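def firstByAxis(a, f = lambda x: x >= 10, axis = 0):
    # Indices of all matches, in row-major order
    c = np.where(f(a))
    # For each distinct index along `axis`, position of its first match
    # (for 2D, axis = 0 gives one hit per row, axis = 1 one per column)
    i = np.unique(c[axis], return_index = True)[1]
    b = np.zeros_like(a)
    b[tuple(np.take(c, i, axis = -1))] = 1
    return b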

edited Mar 30 '17 at 13:02

answered Mar 30 '17 at 06:22


The second seems fantastic. I just think you forgot to define `b = np.zeros(a.shape)`. Is there any way to add an axis to this method? I believe it's something to do with np.unique, but I cannot manage it. – Allosteric Mar 30 '17 at 07:54
I did forget the zeros, sorry. You can change `r` to `c` as the argument of `np.unique` to do the same thing column-wise – Daniel F Mar 30 '17 at 08:12
Added an axis to the function – Daniel F Mar 30 '17 at 13:03

ssm · Answer 2 · 2017-03-30T06:52:34.517

You can try the following:

>>> import numpy as np
>>> a_2d = np.array([[0,4,10],[0,11,10]])
>>> r, c = np.where( a_2d >= 10 )
>>> mask = r+c == (r+c).min()
>>> highMask = np.zeros(np.shape(a_2d))
>>> highMask[r[mask], c[mask]] = 1
>>> highMask
array([[ 0.,  0.,  1.],
       [ 0.,  1.,  0.]])

There is no such thing as the 'first' one in a 2D array. In a 2D array, the entries with the minimum index sum (row + column) form a line, and every entry on that line has the same minimum value. For a 3D array, this will be a surface, and so on.

An example of such a line would be:

 0 0 0 0 0 1
 0 0 0 0 1 0
 0 0 0 1 0 0
 0 0 1 0 0 0
 0 1 0 0 0 0
 1 0 0 0 0 0

All of these are equidistant from the [0,0] location.
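A small sketch of what the `r + c` criterion selects may help here. Both arrays below are hypothetical inputs made up for illustration: the first reproduces the line drawn above, the second is a case where the matches sit at different index sums.

```python
import numpy as np

# Hypothetical 6x6 mask with True on the anti-diagonal drawn above
hits = np.fliplr(np.eye(6, dtype=bool))
r, c = np.where(hits)
mask = r + c == (r + c).min()
print(mask.all())  # True: every hit has r + c == 5, so all six are kept

# Hypothetical input where the matches have different index sums
other = np.array([[0, 4, 10], [11, 0, 0]]) >= 10
r2, c2 = np.where(other)
keep = r2 + c2 == (r2 + c2).min()
print(list(zip(r2[keep].tolist(), c2[keep].tolist())))  # [(1, 0)]
```

In the second case only the match at (1, 0) survives, so row 0 ends up with no marked entry. This criterion therefore picks the matches closest to [0,0] over the whole array rather than one match per row.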

Stephen Rauch · Answer 3 · 2017-03-30T06:28:00.140

If you enumerate over the argmax, you can update your zeros array.

Code:

import numpy as np

a = np.array([[0, 4, 10], [0, 11, 10]])
print(a)

b = np.zeros(a.shape)
for i, j in enumerate(np.argmax(a >= 10, axis=1)):
    b[i, j] = 1
print(b)

Results:

[[ 0  4 10]
 [ 0 11 10]]

[[ 0.  0.  1.]
 [ 0.  1.  0.]]

Using advanced indexing:

c = np.zeros(a.shape)
c[list(range(a.shape[0])), np.argmax(a >= 10, axis=1)] = 1
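The argmax approach can be extended to an arbitrary axis of an n-D array. The sketch below is one possible way, not part of the original answer: `first_true_argmax` is a hypothetical name, and it relies on `np.put_along_axis`, which requires NumPy 1.15 or later. It also guards against a pitfall of `np.argmax`: for a slice with no True value, argmax returns 0, which would otherwise mark position 0 as a match.

```python
import numpy as np

def first_true_argmax(mask, axis=-1):
    """Mark the first True along `axis`; slices without any True stay all False."""
    mask = np.asarray(mask, dtype=bool)
    # Index of the first True along `axis` (0 if the slice has no True);
    # expand_dims restores the reduced axis so it can be used for indexing.
    idx = np.expand_dims(np.argmax(mask, axis=axis), axis)
    out = np.zeros(mask.shape, dtype=bool)
    # Write True at that index only where the slice contains at least one True
    np.put_along_axis(out, idx, mask.any(axis=axis, keepdims=True), axis=axis)
    return out

a = np.array([[0, 4, 10], [0, 11, 10], [0, 1, 2]])  # last row has no value >= 10
result = first_true_argmax(a >= 10, axis=1)
print(result)
# [[False False  True]
#  [False  True False]
#  [False False False]]
```

Only one index array the size of the reduced shape is allocated, so this may use less temporary memory on large inputs than approaches that build a full-size intermediate.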

edited Mar 30 '17 at 06:28

answered Mar 30 '17 at 05:36


Thanks for your answer. However, I would like to avoid using "for loops" (because that's the main reason I use Numpy) – Allosteric Mar 30 '17 at 05:38
This is only a loop per column. The speed up is in the argmax. – Stephen Rauch Mar 30 '17 at 05:38
I should have added that the number of columns is also rather big. – Allosteric Mar 30 '17 at 05:40
