I have this dataframe
x = pd.DataFrame.from_dict({'cat1':['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C'], 'cat2':['X', 'X', 'Y', 'Y', 'Y', 'Y', 'Z', 'Z']})
cat1 cat2
0 A X
1 A X
2 A Y
3 B Y
4 B Y
5 C Y
6 C Z
7 C Z
I want to group by cat1
, and then aggregate cat2
as sets of different values, such as
cat1 cat2
0 A (X, Y)
1 B (Y,)
2 C (Y, Z)
This is part of a bigger dataframe with more columns, each of which has its own aggregation function, so how do I pass this functionality to the aggregation dictionary?