How do I extract the top 3 results from each key in a Python DataFrame?

Asked Apr 18 '19 at 12:02

Active Apr 21 '19 at 10:22

Viewed 149 times

I'm currently analysing a set of data and have columns named "Genre", "Actor" and "Rating".

With Genre as the main key, how should I go about acquiring the top 3 rated actors from each genre?

I tried sorting the ratings in descending order and then taking the top 3 highest-rated actors. However, this produces a jumbled mess when I run it through the following code:


data1 = data.groupby(["genre0", "Actor0"])[["Rating"]].mean().sort_values("Rating", ascending=False)
data2 = data.groupby(["genre0", "Actor1"])[["Rating"]].mean().sort_values("Rating", ascending=False)
data3 = data.groupby(["genre0", "Actor2"])[["Rating"]].mean().sort_values("Rating", ascending=False)


allgenres = pd.concat([data1, data2, data3])

allgenres.groupby("genre0").head(3)

The output should reflect:

(Genre) - (Actors Names) - (Top 3 Rated Actors)

E.g.

Action - Actor A - 10
         Actor B - 9
         Actor C - 8

Animation - Actor D - 10
            Actor E - 9
            Actor F - 8

I do not want to reset the index. I want to keep the ordering as it is, but with the results grouped together.

My results: (screenshot not included)

edited Apr 21 '19 at 10:22

asked Apr 18 '19 at 12:02

Transit

You can try to sort, then drop duplicates (setting keep='first') by key and then take the first three rows. – Raf Apr 18 '19 at 12:06
For my current work, I do not mind duplicates though! However, how do I combine the genres from my picture? – Transit Apr 18 '19 at 12:09
@tabbakhh read that but it isn’t, sadly :( – Transit Apr 18 '19 at 13:41
Possible duplicate of [Pandas get topmost n records within each group](https://stackoverflow.com/questions/20069009/pandas-get-topmost-n-records-within-each-group) – Raidri Apr 18 '19 at 14:48
Please show your input data in a reproducible way. – Franco Piccolo Apr 21 '19 at 11:56
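
Since the input data was never posted, here is only a minimal sketch of the "sort, then take the first three rows per group" idea from the comments above. It assumes the frame has the columns used in the question's code (genre0, Actor0, Actor1, Actor2, Rating). The sample values and the names long_df, mean_rating and top3 are made up for illustration. It first melts the three actor columns into one, so each actor gets a single mean rating per genre instead of one per column.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
data = pd.DataFrame({
    "genre0": ["Action", "Action", "Action",
               "Animation", "Animation", "Animation"],
    "Actor0": ["A", "A", "A", "D", "D", "D"],
    "Actor1": ["B", "B", "C", "E", "E", "F"],
    "Actor2": ["C", "G", "G", "F", "H", "H"],
    "Rating": [10, 8, 3, 9, 7, 2],
})

# Stack the three actor columns into a single "Actor" column,
# giving one (genre, rating, actor) row per appearance.
long_df = data.melt(
    id_vars=["genre0", "Rating"],
    value_vars=["Actor0", "Actor1", "Actor2"],
    value_name="Actor",
)

# Mean rating per (genre, actor), kept as ordinary columns for sorting.
mean_rating = long_df.groupby(["genre0", "Actor"], as_index=False)["Rating"].mean()

# Sort by genre, then by rating (highest first), and keep the
# first three rows of every genre. head(3) preserves the sorted order.
top3 = (
    mean_rating
    .sort_values(["genre0", "Rating"], ascending=[True, False])
    .groupby("genre0")
    .head(3)
    .set_index(["genre0", "Actor"])
)

print(top3)
```

With this sample, each genre appears as one block with its three highest mean ratings in descending order. Ties at the cut-off are broken arbitrarily by the sort, so data with many equal ratings may need an extra sort key.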


0 Answers