Method for finding Length of intervals between 'True' values in a Boolean Array (Efficient method)

Question

Say you have a boolean/binary array in NumPy called 'a', where 'a' is a sequence of 0s and 1s (or, equivalently, True/False values). I want to find the distances between consecutive 1s in 'a', i.e. the number of 0s between them.

E.g. a = [1,0,0,1,0,1,0,0,0,0,0,1]. Output = [2,1,5]

What is the most efficient way to compute this with NumPy in Python? The actual dataset has on the order of 1,000,000 binary values.

Do you care about the distance between successive ones (i.e. which would be zero), or do you only care about the distance when there are zeros in-between? — DarrylG, Nov 25 '19 at 18:34
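To make the two readings in this comment concrete, here is a minimal sketch, assuming NumPy is available. The array `b` and the variable names are made up for illustration. Adjacent 1s show up as gaps of length 0, which can be kept or filtered out depending on which reading is intended.

```python
import numpy as np

# Hypothetical input with two adjacent 1s at positions 1 and 2.
b = np.array([0, 1, 1, 0, 0, 1, 0, 1])

ones_idx = np.flatnonzero(b)             # positions of the 1s: [1, 2, 5, 7]
all_gaps = np.diff(ones_idx) - 1         # zeros between consecutive 1s, empty gaps included
nonempty_gaps = all_gaps[all_gaps > 0]   # keep only gaps that contain at least one 0

print(all_gaps)       # [0 2 1]
print(nonempty_gaps)  # [2 1]
```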

score 3 · Answer 1 · answered Nov 25 '19 at 19:04

Here is a numpy way of getting the result:

import numpy as np
np.diff(np.where(np.array(a) > 0)[0]) - 1  # [0] selects the 1-D index array, so the result is 1-D
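Broken into steps on the example from the question, the one-liner can be read as follows. This is a sketch for illustration only; the intermediate variable names are not from the answer.

```python
import numpy as np

a = [1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1]

mask = np.array(a) > 0          # boolean mask, True where a is 1
ones_idx = np.where(mask)[0]    # np.where returns a tuple; [0] is the 1-D index array
steps = np.diff(ones_idx)       # index difference between consecutive 1s
gaps = steps - 1                # number of 0s strictly between them

print(gaps)  # [2 1 5]
```

Every step is a vectorized array operation, so there is no Python-level loop over the 1,000,000 elements.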

Han Wang

score 2 · Answer 2 · answered Nov 25 '19 at 17:59

a = [1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1]
output = []
space = 0  # number of 0s seen since the last 1
for value in a:
  if value == 1:
    # a 1 closes the current run of 0s: record its length and reset
    output.append(space)
    space = 0
  elif value == 0:
    space += 1
print(output)

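For the example this prints [0, 2, 1, 5]. The first entry is the number of zeros before the first 1 rather than a gap between two 1s; otherwise the output is exactly what you want.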

PythonNerd

This works, but is there a more efficient way using numpy functions? – Saptarshi Soham Mohanta Nov 25 '19 at 18:06
@PythonNerd With a = [0, 0, 1, 0, 1], this returns [2, 1]. Is this correct or should it be [1]? – DarrylG Nov 25 '19 at 18:40
Depends. Ask Saptarshi. – PythonNerd Nov 26 '19 at 17:51
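
The difference DarrylG points at can be reproduced with a short sketch. The function names `gaps_loop` and `gaps_numpy` are hypothetical wrappers around the two answers, written here only for comparison: the loop counts the zeros before the first 1 as a gap, while the index-difference approach only measures between two 1s.

```python
import numpy as np

def gaps_loop(a):
    """Loop approach: append the running zero count each time a 1 is seen."""
    output, space = [], 0
    for value in a:
        if value == 1:
            output.append(space)
            space = 0
        else:
            space += 1
    return output

def gaps_numpy(a):
    """Index-difference approach: zeros strictly between consecutive 1s."""
    return (np.diff(np.flatnonzero(a)) - 1).tolist()

a = [0, 0, 1, 0, 1]
print(gaps_loop(a))   # [2, 1]
print(gaps_numpy(a))  # [1]
```

On this input, dropping the first element of the loop's result (`gaps_loop(a)[1:]`) makes the two agree.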
