I need to get sum of values from selected group of rows in DF, and after that calculate percentage for each row in a group.
df = pd.read_csv("file.csv")
names = df.groupby(["Names", "Action"]).size().reset_index(name='counts')
names.rename(columns={"counts":"Count"}, inplace=True)
Get DF like this:
Name Action Count
0 Name1 Action 1 7
1 Name1 Action 2 3
2 Name1 Action 3 720
3 Name1 Action 4 95
4 Name1 Action 5 301
5 Name1 Action 6 1
6 Name2 Action 1 17
7 Name2 Action 2 1
8 Name3 Action 1 1
9 Name3 Action 2 23
10 Name3 Action 3 170
For each Name I have different action with corresponding number. I need to count total number of all types of Actions for each Name (for Name1 it will we 1127).
After I need to add column "% of Total" to DF that will display % of each row value from total by each of given Names. Each "Name group" (Name1, Name2...) of rows inside "% of Total" column will have their own 100 % percentage.
It should look like this:
Name Action Count % of Total
0 Name1 Action 1 7 0.62
1 Name1 Action 2 3 0.27
2 Name1 Action 3 720 63.89
3 Name1 Action 4 95 8.43
4 Name1 Action 5 301 26.71
5 Name1 Action 6 1 0.09
6 Name2 Action 1 17 94.44
7 Name2 Action 2 1 5.56
8 Name3 Action 1 1 0.52
9 Name3 Action 2 23 11.86
10 Name3 Action 3 170 87.63
To get the total sum of all "Action" values for each Name I created a new DF:
df_total = df.groupby(["Name"]).size().reset_index(name='counts_new')
Df looks like this:
Name counts_new
0 Name1 1127
1 Name2 18
2 Name3 194
3 Name4 1377
4 Name5 93
5 Name6 1627
6 Name7 1355
7 etc...
Now my plan is to add column to Df with some calculations. BUT I can't understand how to write a code where: each row with Name1 will take for calculation corresponding row (with Name1) from df_total.
When I get this (I understand it's not right), it just take first row from DF and perform calculation with first row from df_total, and than second, third. Which get all wrong results.
names["% of Total"] = df["Count"] * 100 / df_total["counts_new"]
Name Action Count % of Total
0 Name1 Action 1 7 0.621118
1 Name1 Action 2 3 16.666667
2 Name1 Action 3 720 371.134021
3 Name1 Action 4 95 6.899056
4 Name1 Action 5 301 323.655914
5 Name1 Action 6 1 0.061463
6 Name2 Action 1 17 1.254613
7 Name2 Action 2 1 0.125945
Will appreciate any help. Thank you.