0

I have a pandas data frame like:

index (datetime)     ColA (bool)    ColB
'2018-10-11 17:00'   True           0.0001
'2018-10-11 17:01'   True           0.00012
'2018-10-11 17:02'   True           0.00011
'2018-10-11 17:03'   False          0.00005
'2018-10-11 17:04'   False          0.000051
'2018-10-11 17:05'   False          0.00006
'2018-10-11 17:06'   False          0.000055
'2018-10-11 17:07'   True           0.00013
'2018-10-11 17:00'   True           0.000131

I want to go through column A, get each set of true or false values that are simultaneous, and calculate the average of the column B values respectively. I want to then set this average value as Column C in the data frame, with each row in that set given this average.

To be clear, in the example above, that would mean I need 3 averages - one for the column B values in rows 1-3, the second for rows 4-7 and the third for rows 8-9 inclusive.

Fred Johnson
  • 2,539
  • 3
  • 26
  • 52
  • `df.groupby(df.ColA.ne(df.ColA.shift()).cumsum()).ColB.mean()` – user3483203 Oct 11 '18 at 19:30
  • For the last requirement: *I want to then set this average value as Column C in the data frame, with each row in that set given this average.*, you can use transform: `df['ColC'] = df.groupby(df.ColA.ne(df.ColA.shift()).cumsum())['ColB'].transform('mean')` – sacuL Oct 11 '18 at 19:31

0 Answers0