How to get the second largest value in Pandas Python

Question

This is my code:

maxData = all_data.groupby(['Id'])[features].agg('max')
all_data = pd.merge(all_data, maxData.reset_index(), suffixes=["", "_max"], how='left', on=['Id'])

Now, instead of getting the max value, how can I fetch the second max value in the above code (grouped by Id)?

Check nlargest: https://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.DataFrame.nlargest.html — Mohit Motwani, Jan 16 '19 at 10:39
Maybe find the solution here: [https://stackoverflow.com/questions/39066260/get-first-and-second-highest-values-in-pandas-columns](https://stackoverflow.com/questions/39066260/get-first-and-second-highest-values-in-pandas-columns) — Marcos Pires, Jan 16 '19 at 10:47

score 1 · Answer 1 · answered Jan 16 '19 at 10:59

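You can use the nth method right after sorting the values. nth is zero-based, so nth(1) returns the second row of each group; replace "features" with the name of the column you want to rank by: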

maxData = all_data.sort_values("features", ascending=False).groupby(['Id']).nth(1)

Avoid the apply method here, as it slows the code down.
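
As a minimal sketch of the sort-then-nth idea for a single column, assuming a small toy `all_data` frame with one hypothetical feature column `f1` (both made up for illustration). `as_index=False` is used here so that `Id` stays available as a regular column for the merge; groups with fewer than two rows have no second row and end up as NaN after the left merge.

```python
import pandas as pd

# Toy data (hypothetical): Id is not unique, f1 is one float feature.
all_data = pd.DataFrame({
    "Id": [1, 1, 1, 2, 2, 3],
    "f1": [5.0, 9.0, 7.0, 2.0, 4.0, 8.0],
})

# Sort descending, then take the row at position 1 (the second row) per Id.
second = (
    all_data.sort_values("f1", ascending=False)
    .groupby("Id", as_index=False)
    .nth(1)[["Id", "f1"]]
)

# Merge back; the overlapping column from the right side gets the suffix.
merged = all_data.merge(second, on="Id", how="left", suffixes=("", "_secondMax"))
print(merged)
```

Note that this ranks rows by one column only. Sorting by a list of columns and taking nth(1) would return the second row in that combined sort order, not the second largest value of each column separately.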

Sunil Goyal

How do I merge it with all_data? `all_data = pd.merge(all_data, maxData.reset_index(), suffixes=["", "_secondMax"], how='left', on=['Id'])` gave me this error: `AttributeError: 'DataFrame' object has no attribute 'dtype'` – john doe Jan 16 '19 at 13:51
It may be a different error. Could you please send me the code? – Sunil Goyal Jan 16 '19 at 13:51
Could you please share the original all_data dataframe definition? – Sunil Goyal Jan 16 '19 at 13:56
It just contains 310 float columns and an `Id` column (`Id` is not unique). – john doe Jan 16 '19 at 14:11

score 0 · Answer 2 · answered Jan 16 '19 at 10:46

Try using nlargest:

maxData = all_data.groupby(['Id'])[features].apply(lambda x:x.nlargest(2)[1]).reset_index(drop=True)

specbug

I got this error `TypeError: nlargest() missing 1 required positional argument: 'columns'` – john doe Jan 16 '19 at 13:52
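
The TypeError above arises because `features` is a list of columns, so each group passed to the lambda is a DataFrame, and `DataFrame.nlargest` requires a `columns` argument. One possible workaround, sketched here under assumptions, is to aggregate column by column so that each call receives a Series. The toy frame, the column names `f1` and `f2`, and the helper `second_largest` are hypothetical and only for illustration.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical): Id is not unique, f1 and f2 are float features.
all_data = pd.DataFrame({
    "Id": [1, 1, 1, 2, 2, 3],
    "f1": [5.0, 9.0, 7.0, 2.0, 4.0, 8.0],
    "f2": [1.0, 3.0, 3.0, 6.0, 0.0, 2.0],
})
features = ["f1", "f2"]

def second_largest(s):
    """Return the second largest value of a Series, or NaN if it has fewer than two."""
    top = s.nlargest(2)
    # Positional access; ties count separately, so [3, 3, 1] gives 3.
    return top.iloc[1] if len(top) > 1 else np.nan

# agg calls the helper once per column and per group, each time with a Series.
second_max = all_data.groupby("Id")[features].agg(second_largest)

# Keep Id as a column (no drop=True) so the result can be merged back.
merged = all_data.merge(
    second_max.reset_index(), on="Id", how="left", suffixes=("", "_secondMax")
)
print(merged)
```

This still runs a Python function per group and column, so it may be slow on large data; it is meant to show the shape of the result rather than the fastest route.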
