I have this array full of Boolean values:

array([[[ True,  True, False, False, False, False],
        [ True, False, False, False, False, False]],

       [[False, False, True, False, True, False],
        [ True, False, False, False, False, False]],

       [[ True, False, False, False, False, False],
        [ True, False, False, False, False, False]]], dtype=bool)

I want to get the indexes of the first occurrence of True in each column of each row, so the answer would be something like this:

array([[0, 0, 0],
       [0, 1, 0],
       [1, 0, 2],
       [1, 1, 0],
       [2, 0, 0],
       [2, 1, 0]])

Is there a simple and fast way of doing so?

Midnighter
user1938027
  • Do you want the first occurrence, or all the occurrences? Your wording suggests the former, whereas your sample output suggests the latter. – Midnighter May 26 '14 at 11:42
  • I want to get first occurrence in each column of each row - as soon as we find True in a column, move to the next column, look for True, switch to the next row, repeat. It is kinda hard for me to get the proper wording while working with 3d arrays, sorry. – user1938027 May 26 '14 at 11:54
  • According to [this answer](http://stackoverflow.com/a/7660322/677122) getting the first index of an item will come in numpy 2.0. – Midnighter May 26 '14 at 12:29
  • Something is amiss with your example input and output, the shape of your array is `(3, 2, 6)` and your output has shape `(6, 3)` but within the output you use indexes from 0 to 2 which is an `IndexError`. Maybe you could describe your question better by saying, loop over axis `x`, then loop over axis `y` and find first index in axis `z`. – Midnighter May 26 '14 at 12:45

3 Answers

I cannot test this right now, but I think it should work:

arr.argmax(axis=1).T

argmax on bools short-circuits in NumPy 1.9, so it should be preferred to where or nonzero for this use case.
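As a quick illustration of the idea, here is a minimal sketch of what argmax returns on the question's array when it is taken along the last axis. The name `first_true` is only illustrative; `arr` is the array from the question.

```python
import numpy as np

# The array from the question, shape (3, 2, 6)
arr = np.array([[[True, True, False, False, False, False],
                 [True, False, False, False, False, False]],
                [[False, False, True, False, True, False],
                 [True, False, False, False, False, False]],
                [[True, False, False, False, False, False],
                 [True, False, False, False, False, False]]], dtype=bool)

# argmax returns the position of the first maximum; for booleans
# that is the position of the first True along the chosen axis.
first_true = arr.argmax(axis=-1)
print(first_true.shape)  # (3, 2)
print(first_true)
```

This gives one index per (i, j) pair, which still has to be paired with i and j to get the rows the question asks for.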


EDIT: OK, so the above solution doesn't work, but the approach with argmax is still useful:

In [23]: mult = np.prod(arr.shape[:-1])

In [24]: np.column_stack(np.unravel_index(arr.shape[-1]*np.arange(mult) +
   ....:                                  arr.argmax(axis=-1).ravel(),
   ....:                                  arr.shape))
Out[24]:
array([[0, 0, 0],
       [0, 1, 0],
       [1, 0, 2],
       [1, 1, 0],
       [2, 0, 0],
       [2, 1, 0]])
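The flat-index arithmetic above can be hard to read. Below is a sketch of a possibly more direct way to build the same table, assuming `arr` is the array from the question: take argmax along the last axis, then pair each result with its leading indices using `np.indices`. This is a swapped-in variant, not the code from the answer itself, and the names `first`, `rows` and `cols` are only illustrative.

```python
import numpy as np

# The array from the question, shape (3, 2, 6)
arr = np.array([[[True, True, False, False, False, False],
                 [True, False, False, False, False, False]],
                [[False, False, True, False, True, False],
                 [True, False, False, False, False, False]],
                [[True, False, False, False, False, False],
                 [True, False, False, False, False, False]]], dtype=bool)

# Index of the first True along the last axis, shape (3, 2)
first = arr.argmax(axis=-1)

# Index grids for the two leading axes, each of shape (3, 2)
rows, cols = np.indices(first.shape)

# One (i, j, k) triple per leading position, in C order
result = np.column_stack((rows.ravel(), cols.ravel(), first.ravel()))
print(result)
```

Note that, like the original, this reports 0 for any position whose last axis contains no True at all, because argmax of an all-False vector is 0.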
Jaime

It seems you want np.where() combined with the solution from this answer for finding unique rows:

b = np.array(np.where(a)).T
#array([[0, 0, 0],
#       [0, 0, 1],
#       [0, 1, 0],
#       [1, 0, 2],
#       [1, 0, 4],
#       [1, 1, 0],
#       [2, 0, 0],
#       [2, 1, 0]], dtype=int64)
c = b[:,:2]
d = np.ascontiguousarray(c).view(np.dtype((np.void, c.dtype.itemsize * c.shape[1])))
_, idx = np.unique(d, return_index=True)

b[idx]
#array([[0, 0, 0],
#       [0, 1, 0],
#       [1, 0, 2],
#       [1, 1, 0],
#       [2, 0, 0],
#       [2, 1, 0]], dtype=int64)
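If NumPy 1.13 or later is available, `np.unique` accepts an `axis` argument, which may stand in for the void-view trick. The following is a sketch under that assumption, with `a` being the array from the question. `np.argwhere(a)` is used as shorthand for `np.array(np.where(a)).T`.

```python
import numpy as np

# The array from the question, shape (3, 2, 6)
a = np.array([[[True, True, False, False, False, False],
               [True, False, False, False, False, False]],
              [[False, False, True, False, True, False],
               [True, False, False, False, False, False]],
              [[True, False, False, False, False, False],
               [True, False, False, False, False, False]]], dtype=bool)

# All (i, j, k) positions that are True, in C order
b = np.argwhere(a)

# Unique (i, j) pairs; return_index gives the first row of b for each pair,
# which is the smallest k because b is ordered in C order
_, idx = np.unique(b[:, :2], axis=0, return_index=True)

result = b[idx]
print(result)
```

Unlike the argmax approach, positions with no True at all simply do not appear in the result.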
Saullo G. P. Castro

Adding onto the other answers: if you want the index of the first column that is True in each row of a 2-D array a, and n (where n is the number of columns in a) for any row that doesn't contain True:

first_occs = np.argmax(a, axis=1)
# Parenthesize the negation so that ~ acts on the boolean mask, not on integers
all_zeros = (~a.any(axis=1)).astype(int)
first_occs_modified = first_occs + all_zeros * a.shape[1]
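Here is a small sketch of the same sentinel idea on a hypothetical 2-D array, written with `np.where` instead of the arithmetic above. The array values and the name `has_true` are made up for illustration.

```python
import numpy as np

# Hypothetical 2-D example; the middle row contains no True
a = np.array([[False, True, True],
              [False, False, False],
              [True, False, False]])

# Rows that contain at least one True
has_true = a.any(axis=1)

# First True per row, or the number of columns where there is none
first_occs = np.where(has_true, a.argmax(axis=1), a.shape[1])
print(first_occs)  # [1 3 0]
```

The sentinel matters because argmax alone returns 0 for an all-False row, which is indistinguishable from a row whose first element is True.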
Nick K