0

This is my code:

maxData = all_data.groupby(['Id'])[features].agg('max')
all_data = pd.merge(all_data, maxData.reset_index(), suffixes=["", "_max"], how='left', on=['Id'])

Now, instead of the max value, how can I fetch the second-highest value per group (grouped by Id) in the code above?

john doe
  • 2
    Check nlargest: https://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.DataFrame.nlargest.html – Mohit Motwani Jan 16 '19 at 10:39
  • 1
    Maybe you can find the solution here: https://stackoverflow.com/questions/39066260/get-first-and-second-highest-values-in-pandas-columns – Marcos Pires Jan 16 '19 at 10:47

2 Answers

1

You can sort the values in descending order and then take the second row of each group with the nth method (it is zero-based, so nth(1) is the second row):

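# sort by the `features` variable from the question, not a column literally named "features";
# row 1 of each group is the second-highest value only when sorting by a single column
maxData = all_data.sort_values(features, ascending=False).groupby(['Id']).nth(1)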

Avoid the apply method where you can, as it slows the code down.
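If features holds several columns, one way to stay vectorized and still get the second-highest value of each column separately is to rank within the groups. This is only a rough sketch: the sample frame and the column names a and b are made up for illustration, and ties are broken by order of appearance (method='first'), so a duplicated maximum counts as the second-highest value too.

```python
import pandas as pd

# Hypothetical sample data; `a` and `b` stand in for the real feature columns.
all_data = pd.DataFrame({
    "Id": [1, 1, 1, 2, 2, 3],
    "a": [5, 9, 7, 4, 8, 6],
    "b": [1.0, 3.0, 2.0, 10.0, 20.0, 30.0],
})
features = ["a", "b"]

# Rank every value inside its Id group, per column; the largest value gets rank 1.
ranks = all_data.groupby("Id")[features].rank(method="first", ascending=False)

# Keep only the values ranked 2 (everything else becomes NaN), then collapse
# each group to that single surviving value per column.
second = all_data[features].where(ranks == 2).groupby(all_data["Id"]).max()
```

Groups with fewer than two rows (Id 3 here) end up as NaN, and integer columns come back as floats because of the NaN masking.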

Sunil Goyal
0

Try using nlargest:

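# .iloc[1] takes the second value by position ([1] would look up the index label 1);
# agg runs the lambda per column, and reset_index() keeps Id for the later merge.
# Assumes every Id has at least two rows.
maxData = all_data.groupby(['Id'])[features].agg(lambda x: x.nlargest(2).iloc[1]).reset_index()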
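
For completeness, here is how the nlargest idea might slot into the merge from the question. Treat it as a sketch under assumptions: the sample data, the helper name second_largest, and the "_max2" suffix are all made up, and groups with a single row are given NaN instead of raising an error.

```python
import pandas as pd

# Hypothetical sample data; `a` and `b` stand in for the real feature columns.
all_data = pd.DataFrame({
    "Id": [1, 1, 1, 2, 2, 3],
    "a": [5, 9, 7, 4, 8, 6],
    "b": [1.0, 3.0, 2.0, 10.0, 20.0, 30.0],
})
features = ["a", "b"]

def second_largest(s):
    """Return the second-largest value of a Series, or NaN if it has fewer than two."""
    top = s.nlargest(2)
    return top.iloc[1] if len(top) == 2 else float("nan")

# agg calls the helper once per column and per group.
second = all_data.groupby("Id")[features].agg(second_largest)

# Same merge pattern as in the question, with a different suffix for the new columns.
merged = all_data.merge(
    second.reset_index(), on="Id", how="left", suffixes=("", "_max2")
)
```

merged then carries a_max2 and b_max2 next to the original columns, one value per Id repeated on every row of that group.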
specbug