tl; dr
Is it possible to convert all float values in this DataFrame pivot table from seconds to intervals in the HH:MM:SS
format?
Name Student A Tutor
Date
2021-04-12 52.0 86.0
2021-04-13 13.0 113.0
2021-04-14 NaN 34.0
Average 32.5 77.6
Desired end result:
Name Student A Tutor
Date
2021-04-12 00:00:52 00:01:26
2021-04-13 00:00:13 00:01:53
2021-04-14 00:00:34
Average 00:00:32 00:01:17
Caveats:
- Tutor and Student A columns are dynamic so I cannot reference them by names. In fact, there might be multiple students in the table.
- I have had no luck with applying
pd.to_timedelta
as it prefixes everything with "0 days". - The table also has the total column and row but I've removed it for clarity.
Long read
This is how the pivot table has been built:
import pandas as pd
df = pd.DataFrame({
'Date': ['2021-04-12', '2021-04-12', '2021-04-13', '2021-04-13', '2021-04-14'],
'Name': ['Tutor', 'Student A', 'Student A', 'Tutor', 'Tutor'],
'duration_seconds': [86, 52, 13, 113, 34]
})
df
looks like this:
Date Name duration_seconds
0 2021-04-12 Tutor 86
1 2021-04-12 Student A 52
2 2021-04-13 Student A 13
3 2021-04-13 Tutor 113
4 2021-04-14 Tutor 34
Grouping it
grouped = df.groupby(['Name', 'Date']).sum()
yields:
duration_seconds
Name Date
Student A 2021-04-12 52
2021-04-13 13
Tutor 2021-04-12 86
2021-04-13 113
2021-04-14 34
Converting data to the desired format
pivoted = grouped.pivot_table(
values='duration_seconds',
index='Date',
columns='Name'
)
yields:
Name Student A Tutor
Date
2021-04-12 52.0 86.0
2021-04-13 13.0 113.0
2021-04-14 NaN 34.0
Adding the averages row
pivoted.loc['Average'] = pivoted.mean()
yields:
Name Student A Tutor
Date
2021-04-12 52.0 86.000000
2021-04-13 13.0 113.000000
2021-04-14 NaN 34.000000
Average 32.5 77.666667
This is how far I've got with this. The last step would be to convert all values under the student and tutor columns to intervals in the HH:MM:SS
format.
This would be a good way to achieve it had it not been prefixing everything with 0 days
:
pivoted.iloc[:].apply(pd.to_timedelta, unit='s')