Suppose I have a DataFrame like this:

ID-A ID-B ID-C        Time
 1     A    X   2022/01/01 09:00:00
 1     A    X   2022/01/01 09:10:00
 1     A    Y   2022/01/02 10:15:00
 2     B    Y   2022/01/01 11:45:00
 2     C    Y   2022/01/01 01:00:00
 2     C    Y   2022/01/01 12:00:00

I want to group by columns ID-A and ID-B and count the ID-C rows in each group. The result should also have a start time (the minimum Time per group) and an end time (the maximum Time per group).
Desired DataFrame:

ID-A ID-B Value       start_time             end_time  
 1    A     3     2022/01/01 09:00:00  2022/01/02 10:15:00
 2    B     1     2022/01/01 11:45:00  2022/01/01 11:45:00
 2    C     2     2022/01/01 01:00:00  2022/01/01 12:00:00
shripal mehta

1 Answer

Use groupby with named aggregation:

(df.groupby(['ID-A', 'ID-B'], as_index=False)
   .agg(Value=('ID-C', 'size'),
        start_time=('Time', 'min'),
        end_time=('Time', 'max'),
        )
)
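
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the same aggregation. The conversion of Time with pd.to_datetime is an assumption on my part (the question does not say whether the column holds strings or datetimes), and the variable name out is only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID-A": [1, 1, 1, 2, 2, 2],
    "ID-B": ["A", "A", "A", "B", "C", "C"],
    "ID-C": ["X", "X", "Y", "Y", "Y", "Y"],
    "Time": [
        "2022/01/01 09:00:00",
        "2022/01/01 09:10:00",
        "2022/01/02 10:15:00",
        "2022/01/01 11:45:00",
        "2022/01/01 01:00:00",
        "2022/01/01 12:00:00",
    ],
})

# Assumption: Time is stored as text, so parse it to real datetimes first.
# min/max then compare timestamps rather than strings.
df["Time"] = pd.to_datetime(df["Time"], format="%Y/%m/%d %H:%M:%S")

# Named aggregation: each keyword becomes an output column,
# built from a (source column, aggregation function) pair.
out = (df.groupby(["ID-A", "ID-B"], as_index=False)
         .agg(Value=("ID-C", "size"),
              start_time=("Time", "min"),
              end_time=("Time", "max"))
)

print(out)
```

Note that 'size' counts every row in the group, including rows where ID-C is missing. If missing values should be skipped, 'count' is the usual choice, and 'nunique' counts distinct values instead (which would not match the desired output here, since group 1/A has only two distinct ID-C values).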
mozway