I have asked earlier this question and got some feedback however I am still stuck in some mystery where I am not able to calculate the percentage of 2 columns based on conditions. 2 columns are ‘tested population’ and ‘total population’ based on grouping ‘Year’ & ‘Gender’ and show it in new column as ‘percentage’…
Year Race Gender Tested population Total population
2017 Asian Male 345 567
2017 Hispanic Female 666 67899
2018 Native Male 333 35543
2018 Asian Female 665 78955
2019 Hispanic Female 4444 44356
2020 Native Male 3642 6799
2017 Asian Male 5467 7998
2018 Asian Female 5467 7998
2019 Hispanic Male 456 4567
Table code
df = pd.DataFrame(alldata, columns=['Year', 'Gender', 'Tested population', 'Total population'])
df2 = df.groupby(['Year', 'Gender']).agg({'Tested population': 'sum'})
pop_pcts = df2.groupby(level=0).apply(lambda x:
100 * x / float(x.sum()))
print(pop_pcts)
Output:
Tested population
Year Gender
2017 Female 10.280951
Male 89.719049
2018 Female 94.849188
Male 5.150812
2019 Female 90.693878
Male 9.306122
2020 Male 100.000000
Whereas i want data as in this format to show along with other columns as a new column 'Percentage' .
Year Race Gender Tested population Total population Percentage
2017 Asian Male 345 567 60.8466
2017 Hispanic Female 666 67899 0.98087
2018 Native Male 333 35543 0.93689
2018 Asian Female 665 78955 0.84225
2019 Hispanic Female 4444 44356 10.0189
2020 Native Male 3642 6799 53.5667
2019 Hispanic Male 456 4567 9.98467
I have gone through Pandas percentage of total with groupby and not able to fix my issues, can someone help on this