highest_medals_countries = olympics_merged.groupby(['Sport'])['Team'].value_counts()
highest_medals_countries.sort_values(ascending = False)[:10]

Output:

Sport       Team
Athletics   United States    3202
            Great Britain    2240
Gymnastics  United States    1939
Swimming    United States    1622
Gymnastics  France           1576
Athletics   France           1494
Gymnastics  Italy            1345
Swimming    Great Britain    1291
Athletics   Germany          1254
Gymnastics  Hungary          1242

In the output above, I am trying to group the teams with the most medals by sport, but the rows are ordered by value count, so the sports end up interleaved. How can I avoid this and keep the countries together under Athletics, Gymnastics, Swimming, etc.?

Expected output is:

Sport       Team
Athletics   United States    3202
            Great Britain    2240
            France           1494
            Germany          1254
Gymnastics  United States    1939
            France           1576
            Italy            1345
            Hungary          1242
Swimming    United States    1622
            Great Britain    1291
Jammy

1 Answer

Calling sort_values on the grouped result sorts the entire Series by count, which breaks up the per-sport grouping. The output of groupby(...).value_counts() is already sorted in descending order within each sport, so if you leave out highest_medals_countries.sort_values(ascending = False)[:10], the teams stay grouped by sport.
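If you still want only the overall top N rows but displayed sport by sport, one option is to take the top N first and then re-sort by sport and count. The following is a minimal sketch, not part of the original answer: the small DataFrame is a made-up stand-in for olympics_merged, and the names medals, top and top_grouped are illustrative.

```python
import pandas as pd

# Hypothetical stand-in for olympics_merged: one row per medal.
olympics_merged = pd.DataFrame({
    "Sport": ["Athletics"] * 5 + ["Swimming"] * 4 + ["Gymnastics"] * 3,
    "Team": [
        "United States", "United States", "United States", "France", "France",
        "United States", "United States", "Great Britain", "France",
        "Italy", "Italy", "France",
    ],
})

counts = olympics_merged.groupby("Sport")["Team"].value_counts()

# Overall top 4 rows by count (top 10 in the question).
top = counts.sort_values(ascending=False).head(4)

# Re-sort those rows by sport, then by count within each sport.
top_grouped = (
    top.rename("medals")                  # avoid a name clash on reset_index
       .rename_axis(["Sport", "Team"])    # make the index level names explicit
       .reset_index()
       .sort_values(["Sport", "medals"], ascending=[True, False])
       .set_index(["Sport", "Team"])["medals"]
)
print(top_grouped)
```

The rows are the same ones the sort_values(...)[:N] call selects; only their order changes, so each sport's teams sit next to each other.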

RJ Adriaansen
  • But if I remove that, I get values for all sports, and I only need values for specific sports. How can I do that? – Jammy Mar 02 '20 at 14:31
  • The easiest solution is to drop the rows with the sports you don't need prior to running the groupby. Alternatively you could use one of [these](https://stackoverflow.com/questions/25224545/filtering-multiple-items-in-a-multi-index-python-panda-dataframe) solutions after the groupby. – RJ Adriaansen Mar 03 '20 at 15:15
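
The first suggestion in the comment above (dropping unwanted sports before the groupby) could look roughly like this. It is a sketch under assumptions: the DataFrame is invented sample data standing in for olympics_merged, and wanted_sports, per_sport and top_per_sport are hypothetical names.

```python
import pandas as pd

# Hypothetical stand-in for olympics_merged: one row per medal.
olympics_merged = pd.DataFrame({
    "Sport": ["Athletics"] * 6 + ["Swimming"] * 6 + ["Gymnastics"] * 2,
    "Team": [
        "United States", "United States", "United States",
        "France", "France", "Germany",
        "United States", "United States", "United States",
        "Great Britain", "Great Britain", "France",
        "Italy", "Italy",
    ],
})

# Keep only the sports of interest before grouping (illustrative list).
wanted_sports = ["Athletics", "Swimming"]
subset = olympics_merged[olympics_merged["Sport"].isin(wanted_sports)]

# Counts come back grouped by sport, descending within each sport.
per_sport = subset.groupby("Sport")["Team"].value_counts()

# Optionally keep only the top 2 teams within each sport.
top_per_sport = per_sport.groupby(level=0).head(2)
print(top_per_sport)
```

Because nothing is sorted across sports afterwards, the result keeps each sport's teams together.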