Given the following code:

# Import pandas library
import pandas as pd

# Data as a list of dicts, one per student.
data = [
    {'Student': 'Eric', 'Grade': 96, 'Class': 'A'},
    {'Student': 'Caden', 'Grade': 92, 'Class': 'A'},
    {'Student': 'Sam', 'Grade': 90, 'Class': 'A'},
    {'Student': 'Leon', 'Grade': 88, 'Class': 'A'},
    {'Student': 'Laura', 'Grade': 80, 'Class': 'B'},
    {'Student': 'Leann', 'Grade': 22, 'Class': 'B'},
    {'Student': 'Glen', 'Grade': 9, 'Class': 'C'},
    {'Student': 'Jack', 'Grade': 90, 'Class': 'C'},
    {'Student': 'Jill', 'Grade': 87, 'Class': 'C'},
    {'Student': 'Joe', 'Grade': 58, 'Class': 'C'},
    {'Student': 'Andrew', 'Grade': 48, 'Class': 'D'},
    {'Student': 'Travis', 'Grade': 39, 'Class': 'E'},
    {'Student': 'Henry', 'Grade': 23, 'Class': 'E'},
    {'Student': 'Chris', 'Grade': 19, 'Class': 'E'},
    {'Student': 'Jim', 'Grade': 1, 'Class': 'E'},
    {'Student': 'Sarah', 'Grade': 93, 'Class': 'E'},
    {'Student': 'Brit', 'Grade': 92, 'Class': 'E'},
]

# Create the DataFrame.
df = pd.DataFrame(data)

# Top 2 grades per class; this returns only the grades, not the names.
print(df.groupby('Class')['Grade'].nlargest(2))

From the DataFrame, I would like to return the names of the students with the top 2 grades in each class. I would like to return two different versions of the results.

Version 1 would have all of the original columns:

[Image not preserved: desired Version 1 output]

And Version 2 would return only the names:

[Image not preserved: desired Version 2 output]

Current output (I would prefer to have the two versions described above):

[Image not preserved: current output]

2 Answers

If I understand correctly, you can sort with sort_values and then call head(2) on the groupby object:

df_new = df.sort_values(['Class', 'Grade'], ascending=[True, False]).groupby('Class').head(2)

[out]

  Class  Grade Student
0      A     96    Eric
1      A     92   Caden
4      B     80   Laura
5      B     22   Leann
7      C     90    Jack
8      C     87    Jill
10     D     48  Andrew
15     E     93   Sarah
16     E     92    Brit

If you need to filter for your version 2 output, just use:

df_new[['Student']]

   Student
0     Eric
1    Caden
4    Laura
5    Leann
7     Jack
8     Jill
10  Andrew
15   Sarah
16    Brit
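
For reference, here is a minimal self-contained sketch of the same sort-then-head idea, producing both versions in one go. The helper name `top_n_per_class` and the shortened data set are assumptions made for illustration; they are not part of the question.

```python
import pandas as pd

# Shortened, illustrative subset of the question's data.
data = [
    {'Student': 'Eric', 'Grade': 96, 'Class': 'A'},
    {'Student': 'Caden', 'Grade': 92, 'Class': 'A'},
    {'Student': 'Sam', 'Grade': 90, 'Class': 'A'},
    {'Student': 'Laura', 'Grade': 80, 'Class': 'B'},
    {'Student': 'Leann', 'Grade': 22, 'Class': 'B'},
    {'Student': 'Andrew', 'Grade': 48, 'Class': 'D'},
]
df = pd.DataFrame(data)


def top_n_per_class(frame, n=2):
    """Hypothetical helper: rows with the n highest grades in each class."""
    # Sort classes ascending and grades descending, so the best grades
    # come first within each class.
    ordered = frame.sort_values(['Class', 'Grade'], ascending=[True, False])
    # head(n) on the groupby keeps the first n rows of every group.
    return ordered.groupby('Class').head(n)


version_1 = top_n_per_class(df)      # all original columns
version_2 = version_1[['Student']]   # names only, still a DataFrame

print(version_1)
print(version_2)
```

Because head simply takes the first n rows of each group, students tied on a grade are cut off in sort order; classes with fewer than n students (class D here) return all of their rows.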
Chris Adams

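Another option, building on your nlargest approach, is to take the original row labels from the second level of the resulting index and select those rows with loc: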

df.loc[df.groupby('Class')['Grade'].nlargest(2).index.get_level_values(1)]

   Class  Grade Student
0      A     96    Eric
1      A     92   Caden
4      B     80   Laura
5      B     22   Leann
7      C     90    Jack
8      C     87    Jill
10     D     48  Andrew
15     E     93   Sarah
16     E     92    Brit
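
To make the one-liner easier to follow, here is a rough step-by-step sketch on a shortened data set (the variable names `top`, `row_labels` and `result` are just illustrative). It assumes the DataFrame's row labels are unique; with duplicate labels, loc would return more rows than intended.

```python
import pandas as pd

# Shortened, illustrative subset of the question's data.
df = pd.DataFrame({
    'Student': ['Eric', 'Caden', 'Sam', 'Glen', 'Jack', 'Jill'],
    'Grade': [96, 92, 90, 9, 90, 87],
    'Class': ['A', 'A', 'A', 'C', 'C', 'C'],
})

# Step 1: top 2 grades per class. The result is a Series whose index has
# two levels: the class, and the row label from the original DataFrame.
top = df.groupby('Class')['Grade'].nlargest(2)

# Step 2: pull out the second index level, i.e. the original row labels.
row_labels = top.index.get_level_values(1)

# Step 3: use those labels to select the full rows.
result = df.loc[row_labels]

print(result)               # Version 1: all columns
print(result[['Student']])  # Version 2: names only
```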
anky