My goal is to aggregate a pandas DataFrame, grouping rows by an identity field. Notably, rather than just gathering summary statistics of the group, I want to retain all the information in the DataFrame in addition to summary statistics like mean, std, etc. I have performed this transformation via a lot of iteration, but I am looking for a cleaner/more pythonic approach. Notably, there may be more or less than 2 replicates per group, but all groups will always have the same number of replicates.
Example: I would llke to translate the below format
df = pd.DataFrame([
["group1", 4, 10],
["group1", 8, 20],
["group2", 6, 30],
["group2", 12, 40],
["group3", 1, 50],
["group3", 3, 60]],
columns=['group','timeA', 'timeB'])
print(df)
group timeA timeB
0 group1 4 10
1 group1 8 20
2 group2 6 30
3 group2 12 40
4 group3 1 50
5 group3 3 60
into a df of the following format:
target = pd.DataFrame([
["group1", 4, 8, 6, 10, 20, 15],
["group2", 6, 12, 9, 30, 45, 35],
["group3", 1, 3, 2, 50, 60, 55]
], columns = ["group", "timeA.1", "timeA.2", "timeA.mean", "timeB.1", "timeB.2", "timeB.mean"])
print(target)
group timeA.1 timeA.2 timeA.mean timeB.1 timeB.2 timeB.mean
0 group1 4 8 6 10 20 15
1 group2 6 12 9 30 45 35
2 group3 1 3 2 50 60 55
Finally, it doesn't really matter what the column names are, these ones are just to make the example more clear. Thanks!
EDIT: As suggested by a user in the comments, I tried the solution from the linked Q/A without success:
df.insert(0, 'count', df.groupby('group').cumcount())
df.pivot(*df)
TypeError: pivot() takes from 1 to 4 positional arguments but 5 were given