I'm trying to extract sequences from an array b, using a boolean array a as the index (len(a) >= len(b), but (a==True).sum() == len(b), i.e. a contains exactly as many True values as b has elements). Each sequence should be represented in the result by the start and end index in a of a run of consecutive positions where a[i] is True.
For instance, for the following arrays a and b:
a = np.asarray([True, True, False, False, False, True, True, True, False])
b = [1, 2, 3, 4, 5]
the result should be [((0, 1), [1, 2]), ((5, 7), [3, 4, 5])], i.e. the result has as many elements as there are runs of True values. Each element should contain the start and end index of the run in a, together with the values from b that the run maps to.
So for the above:
[
((0, 1), [1, 2]), # first true sequence: starting at index=0 (in a), ending at index=1, mapping to the values [1, 2] in b
((5, 7), [3, 4, 5]) # second true sequence: starting at index=5, ending at index=7, with values in b=[3, 4, 5]
]
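To pin down the intended behaviour, here is a plain-Python sketch that produces the result above by walking a once. The function name true_runs_naive is only illustrative, and it assumes a contains exactly len(b) True values; it works, but it loops in Python rather than using vectorized numpy operations.

```python
import numpy as np


def true_runs_naive(a, b):
    """Reference (slow) version: walk a once and collect runs of True.

    Assumes a has exactly len(b) True values (hypothetical helper,
    written only to illustrate the expected output).
    """
    result = []
    start = None      # index in a where the current run began, or None
    k = 0             # number of values of b consumed so far
    run_start_k = 0   # position in b where the current run began
    for i, flag in enumerate(a):
        if flag:
            if start is None:
                # A new run of True values begins here.
                start = i
                run_start_k = k
            k += 1
        elif start is not None:
            # The run ended at the previous index.
            result.append(((start, i - 1), list(b[run_start_k:k])))
            start = None
    if start is not None:
        # a ended while a run was still open.
        result.append(((start, len(a) - 1), list(b[run_start_k:k])))
    return result


a = np.asarray([True, True, False, False, False, True, True, True, False])
b = [1, 2, 3, 4, 5]
print(true_runs_naive(a, b))
# [((0, 1), [1, 2]), ((5, 7), [3, 4, 5])]
```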
How can this be done efficiently in numpy?