I'm trying to get the distribution of grades for each rank, for each name in my data. However, I can't figure out how to get the proportion/percentage of each grade count over its rank group. Here's an example:
df.head()
name rank grade
Bob 1 A
Bob 1 A
Bob 1 B
Bob 1 C
Bob 2 B
Bob 3 C
Joe 1 C
Joe 2 B
Joe 2 B
Joe 3 A
Joe 3 B
Joe 3 B
I use grade_count = df.groupby(['name', 'rank', 'grade'])['grade'].size() to give me the count of each grade within its (name, rank) group:
name  rank  grade
Bob   1     A        2
            B        1
            C        1
      2     B        1
      3     C        1
Joe   1     C        1
      2     B        2
      3     A        1
            B        2
Now, for each count, I'd like to get its proportion of the (name, rank) group total (i.e. what proportion of the grades at a given rank, for a given name, each grade makes up). This is the output I'd like:
name  rank  grade
Bob   1     A        2    0.5   (Bob @ rank 1 had 4 grades, and 50% of them are A's)
            B        1    0.25
            C        1    0.25
      2     B        1    1
      3     C        1    1
Joe   1     C        1    1
      2     B        2    1
      3     A        1    0.33
            B        2    0.66
I've managed to get the totals of each rank group by using rank_totals = grade_count.groupby(level=[0, 1]).sum() which results in:
name  rank
Bob   1       4
      2       1
      3       1
Joe   1       1
      2       2
      3       3
How can I divide the numbers from grade_count by their corresponding rank totals in rank_totals?
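To make the question reproducible, here is a minimal sketch of the kind of division being asked about. It rebuilds the sample data from the example rows and uses `transform("sum")` to put each (name, rank) total on every grade row, so the two Series share one index. The names `group_totals`, `proportion` and `result` are introduced here for illustration only, and this is one possible approach rather than necessarily the best one.

```python
import pandas as pd

# Sample data reconstructed from the rows shown in the question.
df = pd.DataFrame({
    "name": ["Bob"] * 6 + ["Joe"] * 6,
    "rank": [1, 1, 1, 1, 2, 3, 1, 2, 2, 3, 3, 3],
    "grade": ["A", "A", "B", "C", "B", "C", "C", "B", "B", "A", "B", "B"],
})

# Count of each grade within its (name, rank) group.
grade_count = df.groupby(["name", "rank", "grade"])["grade"].size()

# Total per (name, rank), repeated on every grade row of that group,
# so it carries the same three-level index as grade_count.
group_totals = grade_count.groupby(level=["name", "rank"]).transform("sum")

# Element-wise division now lines up row by row.
proportion = grade_count / group_totals

# Counts and proportions side by side.
result = pd.DataFrame({"count": grade_count, "proportion": proportion})
print(result)
```

Because `transform` returns a Series with the same index as `grade_count`, no manual alignment with the two-level `rank_totals` is needed in this sketch.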