I want to merge rows of a data frame that have similar timestamps.

 import pandas as pd

 # Each row: detected type, violation flag, confidence score, timestamp string
 df = pd.DataFrame([
     ['VEST', False, 0.6993550658226013, '2019-11-27 18:56:12.616425+05:30'],
     ['HELMET', True, 0.8506404161453247, '2019-11-27 18:56:12.616425+05:30'],
     ['HELMET', True, 0.5948962569236755, '2019-11-27 18:56:13.617801+05:30'],
     ['VEST', False, 0.6576083898544312, '2019-11-27 18:56:14.595118+05:30'],
     ['HELMET', True, 0.8451269865036011, '2019-11-27 18:56:14.595118+05:30'],
     ['VEST', True, 0.7157155275344849, '2019-11-27 18:56:15.625841+05:30'],
     ['HELMET', True, 0.80693519115448, '2019-11-27 18:56:15.625841+05:30'],
     ['HELMET', True, 0.5428823232650757, '2019-11-27 18:56:41.639505+05:30'],
     ['VEST', False, 0.6302998661994934, '2019-11-27 18:56:42.582407+05:30'],
     ['HELMET', True, 0.8790003657341003, '2019-11-27 18:56:42.582407+05:30'],
     ['VEST', False, 0.44062405824661255, '2019-11-27 18:56:44.590130+05:30'],
     ['HELMET', True, 0.9355553388595581, '2019-11-27 18:56:44.590130+05:30'],
 ], columns=['Type', 'voilation', 'score', 'timestamp'])

Is there a way to merge rows that have the same type and timestamps within 2-3 seconds of each other, and to take the violation value from the row with the highest score?

 df.groupby(['Type', 'timestamp'])

This groupby generates only 3 frames, and I can't quite figure out what to do next. Any help is appreciated.

Aravind G

1 Answer

You can use pandas.Series.dt.round to round each timestamp to the nearest three seconds, then group on the rounded value and take the maximum score per group:

df['rounded_timestamp'] = pd.to_datetime(df['timestamp']).dt.round('3s') 
df1 = df.groupby(['Type', 'rounded_timestamp']).agg({'score': 'max'}).reset_index()

>>> df1
     Type    rounded_timestamp     score
0  HELMET  2019-11-27 13:26:12  0.850640
1  HELMET  2019-11-27 13:26:15  0.845127
2  HELMET  2019-11-27 13:26:42  0.879000
3  HELMET  2019-11-27 13:26:45  0.935555
4    VEST  2019-11-27 13:26:12  0.699355
5    VEST  2019-11-27 13:26:15  0.715716
6    VEST  2019-11-27 13:26:42  0.630300
7    VEST  2019-11-27 13:26:45  0.440624
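
The aggregation above keeps only the maximum score, so the violation flag is dropped. If the flag of the highest-scoring row is also needed, one possible variation is to look up the winning row in each group with `idxmax` and select it whole. This is a minimal sketch rather than the only way to do it: it assumes the same data and 3-second rounding as above, keeps the column name `voilation` as spelled in the question, and uses `best` and `df2` as illustrative names.

```python
import pandas as pd

# Sample rows copied from the question (column name 'voilation' kept as posted).
rows = [
    ['VEST', False, 0.6993550658226013, '2019-11-27 18:56:12.616425+05:30'],
    ['HELMET', True, 0.8506404161453247, '2019-11-27 18:56:12.616425+05:30'],
    ['HELMET', True, 0.5948962569236755, '2019-11-27 18:56:13.617801+05:30'],
    ['VEST', False, 0.6576083898544312, '2019-11-27 18:56:14.595118+05:30'],
    ['HELMET', True, 0.8451269865036011, '2019-11-27 18:56:14.595118+05:30'],
    ['VEST', True, 0.7157155275344849, '2019-11-27 18:56:15.625841+05:30'],
    ['HELMET', True, 0.80693519115448, '2019-11-27 18:56:15.625841+05:30'],
    ['HELMET', True, 0.5428823232650757, '2019-11-27 18:56:41.639505+05:30'],
    ['VEST', False, 0.6302998661994934, '2019-11-27 18:56:42.582407+05:30'],
    ['HELMET', True, 0.8790003657341003, '2019-11-27 18:56:42.582407+05:30'],
    ['VEST', False, 0.44062405824661255, '2019-11-27 18:56:44.590130+05:30'],
    ['HELMET', True, 0.9355553388595581, '2019-11-27 18:56:44.590130+05:30'],
]
df = pd.DataFrame(rows, columns=['Type', 'voilation', 'score', 'timestamp'])

# Round each timestamp to the nearest 3 seconds, as in the answer above.
df['rounded_timestamp'] = pd.to_datetime(df['timestamp']).dt.round('3s')

# Index label of the highest-scoring row within each (Type, rounded bin) group.
best = df.groupby(['Type', 'rounded_timestamp'])['score'].idxmax()

# Select those full rows so 'voilation' comes from the winning detection.
df2 = (
    df.loc[best, ['Type', 'rounded_timestamp', 'voilation', 'score']]
      .sort_values(['Type', 'rounded_timestamp'])
      .reset_index(drop=True)
)
print(df2)
```

With this data the result has one row per type and 3-second bin, the same groups as `df1`, plus the flag from the row that produced each maximum score. Note that rounding puts timestamps into fixed bins, so two detections less than 3 seconds apart can still land in different groups if they fall on either side of a bin boundary.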

Shijith