I have a pandas DataFrame, e.g.:

df =
   paper id  year
0         3  1997
1         3  1999
2         3  1999
3         3  1999
4         6  1997
...

I want the maximum year corresponding to a paper id given as input. For example, if the paper id given is 3, I want 1999 as the answer.

How can I do this?

humble

1 Answer

There are two general solutions. One is to filter first and then take the maximum:

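# Boolean mask: keep the rows where 'paper id' is 3, select 'year', take the max
s = df.loc[df['paper id'] == 3, 'year'].max()
print(s)
1999

# Same idea via the index: make 'paper id' the index, then select label 3
s = df.set_index('paper id').loc[3, 'year'].max()
print(s)
1999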
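
For anyone who wants to run the filter approach end to end, here is a minimal sketch. The sample frame is rebuilt from the five rows shown in the question, and `max_year_for` is a hypothetical helper name, not part of pandas. It returns `None` when the id is absent, because `.max()` on an empty selection would otherwise give NaN.

```python
import pandas as pd

# Sample data rebuilt from the rows shown in the question
# (assumption: only these five rows, for illustration).
df = pd.DataFrame({
    "paper id": [3, 3, 3, 3, 6],
    "year": [1997, 1999, 1999, 1999, 1997],
})


def max_year_for(frame, paper_id):
    """Hypothetical helper: latest year for paper_id, or None if the id is absent."""
    years = frame.loc[frame["paper id"] == paper_id, "year"]
    if years.empty:
        return None
    return int(years.max())


print(max_year_for(df, 3))
print(max_year_for(df, 6))
print(max_year_for(df, 42))
```

The filter is re-evaluated on every call, which is fine for a single lookup.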

Alternatively, aggregate the maximum year per paper id into a Series and then select by index value:

s = df.groupby('paper id')['year'].max()
print(s)
paper id
3    1999
6    1997
Name: year, dtype: int64

print(s[3])
1999
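
If several ids need to be looked up, the grouped Series can be computed once and reused. The following is a sketch under the same assumption about the sample data. The names `max_year` and `wanted` are illustrative only, and 42 stands in for an id that is not in the frame.

```python
import pandas as pd

# Sample data rebuilt from the rows shown in the question
# (assumption: only these five rows, for illustration).
df = pd.DataFrame({
    "paper id": [3, 3, 3, 3, 6],
    "year": [1997, 1999, 1999, 1999, 1997],
})

# Compute the per-paper maximum once; the result is a Series indexed by paper id.
max_year = df.groupby("paper id")["year"].max()

# .loc makes the label-based lookup explicit.
print(max_year.loc[3])

# .get returns a default instead of raising KeyError for a missing id.
print(max_year.get(42, default=None))

# Look up several ids at once; ids not in the data come back as NaN.
wanted = [3, 6, 42]
print(max_year.reindex(wanted))
```

Note that `reindex` converts the result to floats when a missing id introduces NaN.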
jezrael