Sorting a multi-column grouped DataFrame

Question

I'm working with the drinks-by-country data set and want to find the mean beer servings of each country within each continent, sorted from highest to lowest.

So my result should look something like below:

South America: Venezuela 333, Brazil 245, Paraguay 213

and likewise for the other continents (I don't want to mix countries from different continents).

Creating the grouped data without the sorting is easy:

import pandas as pd

ddf = pd.read_csv('drinks.csv')
grouped_continent_and_country = ddf.groupby(['continent', 'country'])
print(grouped_continent_and_country['beer_servings'].mean())

But how do I do the sorting?

Thanks a lot.

score 0 · Answer 1 · answered Mar 13 '20 at 15:03

In this case you can just sort values by 'continent' and 'beer_servings' without applying .mean():

import pandas as pd

ddf = pd.read_csv('drinks.csv')

# sort by the continent and beer_servings columns (both ascending)
ddf = ddf.sort_values(by=['continent', 'beer_servings'], ascending=True)

# keep only the needed columns
ddf = ddf[['continent', 'country', 'beer_servings']].copy()

# export to CSV; index=False keeps the row index out of the file
ddf.to_csv("drinks1.csv", index=False)

Output fragment:

continent,country,beer_servings
...
Africa,Botswana,173
Africa,Angola,217
Africa,South Africa,225
Africa,Gabon,347
Africa,Namibia,376
Asia,Afghanistan,0
Asia,Bangladesh,0
Asia,North Korea,0
Asia,Iran,0
Asia,Kuwait,0
Asia,Maldives,0
...
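
The question asks for highest to lowest within each continent, while the fragment above is ascending. One way to get that order is to pass a list to ascending, one flag per sort column. This is only a sketch: the small DataFrame is a hypothetical stand-in for drinks.csv, built from a few values quoted in this thread, and sorted_ddf is an illustrative name.

```python
import pandas as pd

# Hypothetical stand-in for drinks.csv, using values quoted in this thread
ddf = pd.DataFrame({
    "continent": ["Africa", "Africa", "Africa",
                  "South America", "South America", "South America"],
    "country": ["Botswana", "Namibia", "Gabon",
                "Brazil", "Venezuela", "Paraguay"],
    "beer_servings": [173, 376, 347, 245, 333, 213],
})

# continent ascending (A to Z), beer_servings descending within each continent
sorted_ddf = ddf.sort_values(
    by=["continent", "beer_servings"], ascending=[True, False]
)[["continent", "country", "beer_servings"]]

print(sorted_ddf.to_string(index=False))
```

Because continent is the first sort key, countries from different continents never get interleaved.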

stanna

Thank you for your answer, but I want to know how it's done when you have grouped data. Besides, the output for the grouped data is much cleaner. – Neek Mar 13 '20 at 18:12
Maybe this solution would be appropriate for your case: https://stackoverflow.com/questions/27842613/pandas-groupby-sort-within-groups – stanna Mar 13 '20 at 20:23
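
For the grouped case, a possible sketch along the lines of the linked question: compute the grouped mean first, then regroup the resulting Series by its continent index level and sort each group's values in descending order. The sample DataFrame is a hypothetical stand-in for drinks.csv built from values quoted in this thread, and means and result are illustrative names. With one row per country, the mean is just that country's value.

```python
import pandas as pd

# Hypothetical stand-in for drinks.csv, using values quoted in this thread
ddf = pd.DataFrame({
    "continent": ["Africa", "Africa", "Africa",
                  "South America", "South America", "South America"],
    "country": ["Botswana", "Namibia", "Gabon",
                "Brazil", "Venezuela", "Paraguay"],
    "beer_servings": [173, 376, 347, 245, 333, 213],
})

# Grouped mean: a Series indexed by (continent, country)
means = ddf.groupby(["continent", "country"])["beer_servings"].mean()

# Regroup by the continent level and sort each continent's values descending;
# group_keys=False avoids adding a second copy of the continent to the index
result = means.groupby(level="continent", group_keys=False).apply(
    lambda s: s.sort_values(ascending=False)
)

print(result)
```

The result keeps the two-level (continent, country) index, so it prints in the same grouped layout as the unsorted mean.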
