How to use 'df.groupby' and 'df.mean()' method to calculate the means of specific groups and conditions (Python, Pandas)?

Question

You lovely, wise snake stats lovers!

Recently, I have been trying to learn Python statistics on my own, and I often get this unstoppable urge to mimic the example code and make some little tweaks.

The example code showed me how to work out the share of U.S.-born Nobel Prize winners by decade. Following that trail, I wanted to see how things look for other countries, at least the other top ten winners, and somehow I couldn't do that. (I say somehow, as if Python normally agrees with me, which is absolutely not the case.)

The example code:

prop_usa_winners = nobel.groupby('decade', as_index=False)['usa_born_winner'].mean()

Results:

    decade  usa_born_winner
0     1900         0.017544
1     1910         0.075000
2     1920         0.074074
3     1930         0.250000
4     1940         0.302326
5     1950         0.291667
6     1960         0.265823
7     1970         0.317308
8     1980         0.319588
9     1990         0.403846
10    2000         0.422764
11    2010         0.292683
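For context, here is a minimal sketch of why the example produces proportions: if `usa_born_winner` is a True/False column, its mean within each decade is the fraction of True values. The toy rows below are invented for illustration, and building the flag by comparing `birth_country` to a country name is an assumption about how the example created it.

```python
import pandas as pd

# Toy data, invented for illustration only; this is not the real Nobel dataset.
nobel = pd.DataFrame({
    "decade": [1900, 1900, 1900, 1900, 1910, 1910],
    "birth_country": [
        "Germany", "France", "United States of America", "Germany",
        "United States of America", "Sweden",
    ],
})

# Assumed construction of the flag: True where the winner was born in the USA.
nobel["usa_born_winner"] = nobel["birth_country"] == "United States of America"

# The mean of a boolean column is the share of True values in each group.
prop_usa_winners = nobel.groupby("decade", as_index=False)["usa_born_winner"].mean()
print(prop_usa_winners)
```

The same call cannot be pointed directly at a text column such as the birth country, because a mean of country names is not defined; the column has to be turned into numbers or counts first.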

My code:

nobel.groupby('decade', as_index=False)['birth_conutry'].mean()

or:

nobel.groupby('decade', as_index=False).agg(sum(['birthday_country']).mean())

Neither works.

'Nobel' file structure:

[image not available]

What I wish to achieve:

Index  decade  country 1  country 2  country...

or

Index(country)  1910  1920  1920
country1        .xxx  .xxx  .xxx
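One possible way to get a table of that shape is a row-normalised cross-tabulation, sketched below. The toy rows are invented, `top_n` and the variable names are hypothetical, and the column name `birth_country` is assumed from the attempts above, so treat this as a sketch rather than a definitive answer.

```python
import pandas as pd

# Toy data, invented for illustration only; this is not the real Nobel dataset.
nobel = pd.DataFrame({
    "decade": [1900, 1900, 1900, 1900, 1910, 1910],
    "birth_country": [
        "Germany", "France", "United States of America", "Germany",
        "United States of America", "Sweden",
    ],
})

# Pick the most frequent birth countries (use 10 on real data; 2 for this toy frame).
top_n = 2
top_countries = nobel["birth_country"].value_counts().nlargest(top_n).index

# normalize="index" divides each row by its total, so every cell is the share
# of that decade's winners who were born in that country.
shares = pd.crosstab(nobel["decade"], nobel["birth_country"], normalize="index")

# First layout: one row per decade, one column per top country.
by_decade = shares[top_countries]

# Second layout: one row per country, one column per decade.
by_country = by_decade.T

print(by_decade)
print(by_country)
```

The shares are taken over all winners counted in a decade, not only those from the selected countries, so each column is comparable to the `usa_born_winner` column in the example. One difference to be aware of: rows with a missing birth country are left out of the cross-tabulation, so on real data the numbers may differ slightly from a boolean-flag mean.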

Please help!

Thank you!
