8

I have a 4x3 boolean numpy array, and I'm trying to return a same-sized array which is all False, except for the location of the first True value on each row of the original. So if I have a starting array of

all_bools = np.array([[False, True, True],[True, True, True],[False, False, True],[False,False,False]])
all_bools
array([[False,  True,  True], # First true value = index 1
       [ True,  True,  True], # First true value = index 0
       [False, False,  True], # First true value = index 2
       [False, False, False]]) # No True Values

then I'd like to return

[[False, True, False],
 [True, False, False],
 [False, False, True],
 [False, False, False]]

so indices 1, 0 and 2 on the first three rows have been set to True and nothing else. Essentially any True value (beyond the first on each row) from the original way have been set to False.

I've been fiddling around with this with np.where and np.argmax and I haven't yet found a good solution - any help gratefully received. This needs to run many, many times so I'd like to avoid iterating.

cs95
  • 379,657
  • 97
  • 704
  • 746
Chris J Harris
  • 1,597
  • 2
  • 14
  • 26

2 Answers2

15

You can use cumsum, and find the first bool by comparing the result with 1.

all_bools.cumsum(axis=1).cumsum(axis=1) == 1 
array([[False,  True, False],
       [ True, False, False],
       [False, False,  True],
       [False, False, False]])

This also accounts for the issue @a_guest pointed out. The second cumsum call is needed to avoid matching all False values between the first and second True value.


If performance is important, use argmax and set values:

y = np.zeros_like(all_bools, dtype=bool)
idx = np.arange(len(x)), x.argmax(axis=1)
y[idx] = x[idx]

y
array([[False,  True, False],
       [ True, False, False],
       [False, False,  True],
       [False, False, False]])

Perfplot Performance Timings
I'll take this opportunity to show off perfplot, with some timings, since it is good to see how our solutions vary with different sized inputs.

import numpy as np
import perfplot

def cs1(x):
    return  x.cumsum(axis=1).cumsum(axis=1) == 1 

def cs2(x):
    y = np.zeros_like(x, dtype=bool)
    idx = np.arange(len(x)), x.argmax(axis=1)
    y[idx] = x[idx]
    return y

def a_guest(x):
    b = np.zeros_like(x, dtype=bool)
    i = np.argmax(x, axis=1)
    b[np.arange(i.size), i] = np.logical_or.reduce(x, axis=1)
    return b

perfplot.show(
    setup=lambda n: np.random.randint(0, 2, size=(n, n)).astype(bool),
    kernels=[cs1, cs2, a_guest],
    labels=['cs1', 'cs2', 'a_guest'],
    n_range=[2**k for k in range(1, 8)],
    xlabel='N'
)

enter image description here

The trend carries forward to larger N. cumsum is very expensive, while there is a constant time difference between my second solution, and @a_guest's.

cs95
  • 379,657
  • 97
  • 704
  • 746
4

You can use the following approach using np.argmax and a product with np.logical_or.reduce for dealing with rows that are all False:

b = np.zeros_like(a, dtype=bool)
i = np.argmax(a, axis=1)
b[np.arange(i.size), i] = np.logical_or.reduce(a, axis=1)

Timing results

Different versions in increasing performance, i.e. fastest approach comes last:

In [1]: import numpy as np

In [2]: def f(a):
   ...:     return a.cumsum(axis=1).cumsum(axis=1) == 1
   ...: 
   ...: 

In [3]: def g(a):
   ...:     b = np.zeros_like(a, dtype=bool)
   ...:     i = np.argmax(a, axis=1)
   ...:     b[np.arange(i.size), i] = np.logical_or.reduce(a, axis=1)
   ...:     return b
   ...: 
   ...: 

In [4]: x = np.random.randint(0, 2, size=(1000, 1000)).astype(bool)

In [5]: %timeit f(x)
10.4 ms ± 155 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [6]: %timeit g(x)
120 µs ± 184 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [7]: def h(a):
   ...:     y = np.zeros_like(x)
   ...:     idx = np.arange(len(x)), x.argmax(axis=1)
   ...:     y[idx] += x[idx]
   ...:     return y
   ...: 
   ...: 

In [8]: %timeit h(x)
92.1 µs ± 3.51 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [9]: def h2(a):
    ...:     y = np.zeros_like(x)
    ...:     idx = np.arange(len(x)), x.argmax(axis=1)
    ...:     y[idx] = x[idx]
    ...:     return y
    ...: 
    ...: 

In [10]: %timeit h2(x)
78.5 µs ± 353 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
a_guest
  • 34,165
  • 12
  • 64
  • 118