You need to attach a group number to each continuous set of rows. Some tricky logic will show that the following is accomplished with the following statement. The logic is based on using diff
to find each occurrence of a 1 preceded by a 0.
# create fake data
import pandas as pd
import numpy as np
np.random.seed(2)
df = pd.DataFrame({'cond':np.random.choice([True, False], 100),
'data':np.random.rand(100)})
print(df.head(15))
cond data
0 True 0.544208
1 False 0.082095
2 False 0.366342
3 True 0.850851
4 True 0.406275
5 False 0.027202
6 True 0.247177
7 False 0.067144
8 True 0.993852
9 False 0.970580
10 True 0.800258
11 False 0.601817
12 False 0.764960
13 False 0.169225
14 False 0.293023
# add a group column
df['group'] = ((np.where(df.cond.diff() > 0, 1, 0) * df.cond).cumsum() +
df.cond.iloc[0]) * df.cond
print(df.head())
cond data group
0 True 0.544208 1
1 False 0.082095 0
2 False 0.366342 0
3 True 0.850851 2
4 True 0.406275 2
5 False 0.027202 0
6 True 0.247177 3
7 False 0.067144 0
8 True 0.993852 4
9 False 0.970580 0
10 True 0.800258 5
11 False 0.601817 0
12 False 0.764960 0
13 False 0.169225 0
14 False 0.293023 0
# get answer with grouping and sorting
df_final = df.query('group != 0').groupby('group')['data']\
.agg(['count', 'mean'])\
.sort_values('count', ascending=False)
print(df_final.head())
count mean
group
18 10 0.529819
11 8 0.630232
10 4 0.301558
16 4 0.215376
6 4 0.563013
# get result with .iloc
# if you care about ties you can select those rows that all have
# the third highest count
df.iloc[[2]]
count mean
group
10 4 0.301558