I resample the dates manually, which is easy when the resampling is just rounding.
Here is an example:
from datetime import datetime, timedelta
from random import randint, seed
from tabulate import tabulate
import pandas as pd
def df_to_md(df):
    # Print a DataFrame as a Markdown (pipe) table
    print(tabulate(df, tablefmt="pipe", headers="keys"))
seed(42)
people=['tom','dick','harry']
avg_score=[90,50,10]
date_times=[n for n in pd.date_range(datetime.now()-timedelta(days=2),datetime.now(),freq='5 min').values]
scale=1+int(len(date_times)/len(people))
score =[randint(i,100)*i/10000 for i in avg_score*scale]
df=pd.DataFrame.from_records(list(zip(date_times,people*scale,score)),columns=['When','Who','Status'])
# Three people: tom should score up to about 90%, dick up to 50% and poor harry at most 10%
# Tom should score well
df_to_md(df[df.Who=='tom'].head())
The table is in Markdown format, just to ease my cut and paste.
| | When | Who | Status |
|---:|:---------------------------|:------|---------:|
| 0 | 2019-06-18 14:07:17.457124 | tom | 0.9 |
| 3 | 2019-06-18 14:22:17.457124 | tom | 0.846 |
| 6 | 2019-06-18 14:37:17.457124 | tom | 0.828 |
| 9 | 2019-06-18 14:52:17.457124 | tom | 0.9 |
| 12 | 2019-06-18 15:07:17.457124 | tom | 0.819 |
Harry scores badly
df_to_md(df[df.Who=='harry'].head())
| | When | Who | Status |
|---:|:---------------------------|:------|---------:|
| 2 | 2019-06-18 14:17:17.457124 | harry | 0.013 |
| 5 | 2019-06-18 14:32:17.457124 | harry | 0.038 |
| 8 | 2019-06-18 14:47:17.457124 | harry | 0.023 |
| 11 | 2019-06-18 15:02:17.457124 | harry | 0.079 |
| 14 | 2019-06-18 15:17:17.457124 | harry | 0.064 |
Let's get the average per hour per person. First, a helper that rounds a timestamp to the nearest hour:
def round_to_hour(t):
    # Rounds to the nearest hour: truncate to the hour, then add one hour if minute >= 30
    return (t.replace(second=0, microsecond=0, minute=0)
            + timedelta(hours=t.minute // 30))
Then generate a new column using this function.
df['WhenRounded'] = df.When.apply(round_to_hour)
df_to_md(df[df.Who=='tom'].head())
This is tom's data, showing the original and the rounded timestamps.
| | When | Who | Status | WhenRounded |
|---:|:---------------------------|:------|---------:|:--------------------|
| 0 | 2019-06-18 14:07:17.457124 | tom | 0.9 | 2019-06-18 14:00:00 |
| 3 | 2019-06-18 14:22:17.457124 | tom | 0.846 | 2019-06-18 14:00:00 |
| 6 | 2019-06-18 14:37:17.457124 | tom | 0.828 | 2019-06-18 15:00:00 |
| 9 | 2019-06-18 14:52:17.457124 | tom | 0.9 | 2019-06-18 15:00:00 |
| 12 | 2019-06-18 15:07:17.457124 | tom | 0.819 | 2019-06-18 15:00:00 |
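The rounding rule can be checked on a few boundary cases. This is a minimal sketch using only the standard library; the sample timestamps are made up for illustration and `round_to_hour` is repeated here so the snippet runs on its own.

```python
from datetime import datetime, timedelta

def round_to_hour(t):
    # Truncate to the hour, then add one hour when the minute is 30 or more.
    return (t.replace(minute=0, second=0, microsecond=0)
            + timedelta(hours=t.minute // 30))

# Illustrative timestamps (not taken from the generated data).
samples = [
    datetime(2019, 6, 18, 14, 29, 59),  # just before the half hour: rounds down
    datetime(2019, 6, 18, 14, 30, 0),   # exactly on the half hour: rounds up
    datetime(2019, 6, 18, 23, 45, 0),   # rounds up across midnight
]
rounded = [round_to_hour(t) for t in samples]
for before, after in zip(samples, rounded):
    print(before, "->", after)
```

Because the extra hour is added as a `timedelta` rather than by setting `hour` directly, a timestamp late in the day rolls over to the next date without special handling.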
We can resample by grouping and applying an aggregation function.
Group by the rounded date and the person (a datetime and a str column). In this case we want the mean value, but other aggregations are available as well.
df_resampled = df.groupby(by=['WhenRounded', 'Who']).agg({'Status': 'mean'}).reset_index()
# Output in Markdown format
df_to_md(df_resampled[df_resampled.Who=='tom'].head())
| | WhenRounded | Who | Status |
|---:|:--------------------|:------|---------:|
| 2 | 2019-06-18 14:00:00 | tom | 0.873 |
| 5 | 2019-06-18 15:00:00 | tom | 0.83925 |
| 8 | 2019-06-18 16:00:00 | tom | 0.86175 |
| 11 | 2019-06-18 17:00:00 | tom | 0.84375 |
| 14 | 2019-06-18 18:00:00 | tom | 0.8505 |
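As an alternative sketch (not the approach used above), pandas can do the rounding itself through the `Series.dt.round` accessor, which avoids the Python-level `apply`. The small frame `df_small` below is an assumption for illustration: it holds a few rows copied from the tables above with the microseconds dropped. Timestamps that fall exactly on the half hour may be treated differently from `round_to_hour`, so compare the two on your own data before switching.

```python
import pandas as pd

# A handful of rows copied from the tables above (microseconds dropped).
df_small = pd.DataFrame({
    "When": pd.to_datetime([
        "2019-06-18 14:07:17", "2019-06-18 14:22:17",
        "2019-06-18 14:37:17", "2019-06-18 14:52:17",
        "2019-06-18 14:17:17", "2019-06-18 14:32:17",
    ]),
    "Who": ["tom", "tom", "tom", "tom", "harry", "harry"],
    "Status": [0.9, 0.846, 0.828, 0.9, 0.013, 0.038],
})

# Round each timestamp to the nearest hour with the built-in accessor.
df_small["WhenRounded"] = df_small["When"].dt.round("h")

# Same grouping as before: mean Status per rounded hour and person.
df_small_resampled = (
    df_small.groupby(["WhenRounded", "Who"])["Status"]
    .mean()
    .reset_index()
)
print(df_small_resampled)
```

With these rows, tom's 14:07 and 14:22 readings land in the 14:00 bucket and his 14:37 and 14:52 readings in the 15:00 bucket, matching the manual rounding.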
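Let's check the mean for tom @ 14:00. Only the 14:07 and 14:22 readings round to 14:00 (14:37 and 14:52 round up to 15:00), so the mean is taken over those two values:
print("Check tom 14:00 .87300 ... {:6.5f}".format((.900+.846)/2))
Check tom 14:00 .87300 ... 0.87300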
Hope this assists