I have an arbitrary array with only binary values- say:
a = np.array([1,0,0,0,0,0,1,1,1,1,1,1,1,1,0,0,1,1,0,])
What would be the most efficient way to count average length of sequences of 1s array? Eg. In this example, it would be (1 + 8 + 2)/3.
I have an arbitrary array with only binary values- say:
a = np.array([1,0,0,0,0,0,1,1,1,1,1,1,1,1,0,0,1,1,0,])
What would be the most efficient way to count average length of sequences of 1s array? Eg. In this example, it would be (1 + 8 + 2)/3.
For an all numpy solution, you can use Alex Martelli's solution like so:
def runs_of_ones_array(bits):
# make sure all runs of ones are well-bounded
bounded = np.hstack(([0], bits, [0]))
# get 1 at run starts and -1 at run ends
difs = np.diff(bounded)
run_starts, = np.where(difs > 0)
run_ends, = np.where(difs < 0)
return run_ends - run_starts
>>> a=np.array([1,0,0,0,0,0,1,1,1,1,1,1,1,1,0,0,1,1,0,])
>>> b=runs_of_ones_array(a)
>>> float(sum(b))/len(b)
3.66666666667
I'm not sure easiest, but one alternative is
np.mean([len(list(v)) for k,v in itertools.groupby(a) if k])
3.6666666666666665
Explanation
groupby
groups adjacent equal values together, we filter only ones (if k
), the list comprehension [...]
creates the list of the lengths of sub sequences of ones, i.e. [1,8,2]
, and mean
computes the average value.