I have a dataframe that looks like this:
index | period | category |
---|---|---|
1 | 20181231 | 1 |
2 | 20181231 | 2 |
3 | 20181231 | 3 |
4 | 20190131 | 1 |
5 | 20190131 | 2 |
6 | 20190131 | 2 |
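
For reproducibility, the table above can be rebuilt roughly as follows. This is a minimal sketch assuming pandas; the variable name `df` and the choice to keep `index` as the DataFrame index are assumptions made for illustration.

```python
import pandas as pd

# Rebuild the example input; the "index" column is kept as the DataFrame index (an assumption).
df = pd.DataFrame(
    {
        "period": [20181231, 20181231, 20181231, 20190131, 20190131, 20190131],
        "category": [1, 2, 3, 1, 2, 2],
    },
    index=pd.Index([1, 2, 3, 4, 5, 6], name="index"),
)

print(df)
```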
I want to get the following dataframe:
index | period | category | category_count | period_count |
---|---|---|---|---|
1 | 20181231 | 1 | 1 | 3 |
2 | 20181231 | 2 | 1 | 3 |
3 | 20181231 | 3 | 1 | 3 |
4 | 20190131 | 1 | 1 | 3 |
5 | 20190131 | 2 | 2 | 3 |
I tried various group-by and aggregate approaches, but period_count always ends up equal to category_count, because the aggregation is computed over both grouping keys (period and category in this case).
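
For illustration, here is a minimal pandas sketch of that behavior. The column names come from the tables above; the specific calls and the names `df` and `counts` are assumptions, not necessarily the code that was actually tried. Any count taken from a grouping on both keys is a per-(period, category) count, so the two columns come out identical.

```python
import pandas as pd

# Example input from the question (variable names are illustrative).
df = pd.DataFrame(
    {
        "period": [20181231, 20181231, 20181231, 20190131, 20190131, 20190131],
        "category": [1, 2, 3, 1, 2, 2],
    }
)

# Grouping by both keys counts rows per (period, category) pair.
counts = (
    df.groupby(["period", "category"])
    .size()
    .reset_index(name="category_count")
)

# A second count taken from the same grouping repeats the same numbers,
# so it cannot give the per-period total.
counts["period_count"] = df.groupby(["period", "category"]).size().to_numpy()

print(counts)
```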
Is there a way to do a "nested" group by, where one aggregation is computed over both groups (period and category) and the other over only the first one (period)?