2

I have the following df, which is adapted from here

import pandas as pd    
df = pd.DataFrame({'group':[1,1,1,2,2,2,3,3,3],
               'value':[1,2,3,1,2,3,4,3,2]})

I would like to have the following result:

df1 = pd.DataFrame({'group':[1,1,1,2,2,2,3,3,3],
               'value':[0,2,3,0,2,3,4,3,0]})

The logic is (1) to select top 2 numbers for each group and (2) to set other values to 0.

Any suggestions? Thanks.

kevin
  • 1,914
  • 4
  • 25
  • 30

4 Answers4

2

Use nlargest

df.loc[~df.index.isin(df.groupby('group')['value']\
                        .apply(lambda s: s.nlargest(2))\
                        .index\
                        .get_level_values(1)), 'value'] = 0

   group  value
0      1      0
1      1      2
2      1      3
3      2      0
4      2      2
5      2      3
6      3      4
7      3      3
8      3      0
rafaelc
  • 57,686
  • 15
  • 58
  • 82
2

Using nlargest per group for boolean indexing:

m = df.groupby('group')['value'].apply(lambda g: g.isin(g.nlargest(2)))

df['value'] = df['value'].where(m, 0)

Output (as new column "value2" for clarity)

   group  value  value2
0      1      1       0
1      1      2       2
2      1      3       3
3      2      1       0
4      2      2       2
5      2      3       3
6      3      4       4
7      3      3       3
8      3      2       0
mozway
  • 194,879
  • 13
  • 39
  • 75
1

You can use a combination of where() and cumcount() to get the results you are expecting

df = pd.DataFrame({'group':[1,1,1,2,2,2,3,3,3],
               'value':[1,2,3,1,2,3,4,3,2]})
df.where(df.sort_values('value', ascending = False).groupby('group')['value'].cumcount() + 1 <= 2, 0)
ArchAngelPwn
  • 2,891
  • 1
  • 4
  • 17
0
def function1(dd:pd.DataFrame):
    return dd.assign(value=np.where(dd.index.isin(dd.nlargest(2,'value').index),dd.value,0))

df.groupby('group').apply(function1)

out

   group  value
0      1      0
1      1      2
2      1      3
3      2      0
4      2      2
5      2      3
6      3      4
7      3      3
8      3      0
G.G
  • 639
  • 1
  • 5