
I have the following data frame:

Input:

  first_name last_name  age  preTestScore  postTestScore
0      Jason    Miller   42             4             25
1      Molly  Jacobson   52            24             94
2       Tina       Ali   36            31             57
3       Jake    Milner   24             2             62
4        Amy     Cooze   73             3             70

I want the output to be:

Amy 73

So basically, I want to find the highest value in the age column, together with the name of the person who has that age.

I tried this in pandas using groupby:

df2=df.groupby(['first_name'])['age'].max()

But this returns the maximum age for every first_name group instead:

first_name
Amy      73
Jake     24
Jason    42
Molly    52
Tina     36
Name: age, dtype: int64

whereas I only want

Amy 73

How should I go about this in pandas?

moys
Tin Mah

2 Answers


You can get your result with the code below:

df.loc[df.age.idxmax(),['first_name','age']]

Here, df.age.idxmax() returns the index label of the row that holds the maximum age value.

Then df.loc[df.age.idxmax(),['first_name','age']] selects the columns 'first_name' and 'age' at that index label.
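
As a minimal sketch of the two steps above, the following rebuilds the data frame from the question and splits the lookup into separate statements. The variable names oldest_idx and oldest are only illustrative and do not appear in the original answer.

```python
import pandas as pd

# Data frame rebuilt from the table in the question.
df = pd.DataFrame({
    "first_name": ["Jason", "Molly", "Tina", "Jake", "Amy"],
    "last_name": ["Miller", "Jacobson", "Ali", "Milner", "Cooze"],
    "age": [42, 52, 36, 24, 73],
    "preTestScore": [4, 24, 31, 2, 3],
    "postTestScore": [25, 94, 57, 62, 70],
})

# Step 1: index label of the row with the largest age.
# (oldest_idx is an illustrative name, not from the answer.)
oldest_idx = df["age"].idxmax()

# Step 2: pick the two columns of interest at that label.
# The result is a Series indexed by the column names.
oldest = df.loc[oldest_idx, ["first_name", "age"]]

print(oldest["first_name"], oldest["age"])
```

Note that idxmax() returns only the first matching label, so if several people shared the maximum age, only the first of them would be reported.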

moys
  • Is it possible to combine both solutions, i.e. keep the data frame and show the maximum value below it? – Tin Mah Mar 06 '20 at 05:19

This line of code should do the job:

df[df['age']==df['age'].max()][['first_name','age']]

The [['first_name','age']] part lists the columns you want in the result; change it as needed. In this case the output will be

  first_name  age
4        Amy   73
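
A short sketch of the boolean-mask approach, written with .loc so that the row filter and column selection happen in one step. The extra row ("Sam", 73) is hypothetical and is added only to illustrate what happens when two people share the maximum age; it is not part of the question's data.

```python
import pandas as pd

# Only the two relevant columns from the question's data frame.
df = pd.DataFrame({
    "first_name": ["Jason", "Molly", "Tina", "Jake", "Amy"],
    "age": [42, 52, 36, 24, 73],
})

# Boolean mask: True for every row whose age equals the maximum.
mask = df["age"] == df["age"].max()
result = df.loc[mask, ["first_name", "age"]]

# Hypothetical tie: append a made-up person with the same maximum age.
extra = pd.DataFrame({"first_name": ["Sam"], "age": [73]})
tied = pd.concat([df, extra], ignore_index=True)
tied_result = tied.loc[tied["age"] == tied["age"].max(), ["first_name", "age"]]

print(result)
print(tied_result)
```

With the mask, every row that reaches the maximum is kept, whereas idxmax() would report only the first one. Which behaviour is preferable depends on whether ties matter for your data.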
Shahir Ansari