I'm trying to take subcategories and place them into columns so I can compute values for each column.
For example, what I have now is:
c1 c2 c3
0 123 Orange 12
1 123 Car 15
2 123 Blue 14
3 123 Bike 13
4 234 Red 09
5 234 Bus 04
6 234 Train 19
7 234 Purple 17
What I want instead is:
c1 c3 (colour) c3 (transport)
0 123 26 28
1 234 26 23
Obviously with about ~20,000 rows and 50 columns.
I've tried using groupby and some other functions but my brain's a bit frazzled and I'm struggling to figure out how to get it to work in the way I want to.
Right now I'm using something to the effect of:
for x in df.c1.unique:
df['c3 (colour)'] = df[(df[c1] == x) & (df[c2] == colour)].sum()
I'm aware that's a horrendously inefficient way to do it, but right now it's the only way I'm getting the table format that I want. I'm still trying to work out how to use pandas correctly so I'm a little unfamiliar with certain commands that could make it work quicker.