0

I would like to get all the column names of the n th max values. If there is a tie I want to get all the columns. I have seen a lot of solutions on how to get all the columns of the max values or get only one of the n th, but none of how to get ALL n th.

   c1   c2  c3   c4
0  10  100  10  100
1  11  110  11  110
2  12  120  12  120 

In this example, if n = 1, in the first row I want to get c2 and c4. If n = 2, get c1 and c3. The column names could go as a new column or a list it does not matter.

To get all columns with max values I used the following:

df['idxmax'] = (df
                .eq(df.max(axis=1), axis=0)
                .stack()
                .groupby(level=0)
                .apply(lambda x: ', '.join(x.index.get_level_values(1)[x])))

source

1 Answers1

0

You can stack and perform a to double groupby using rank to identify your "nth max":

n = 2
df2 = df.join(df
 .stack()
 .groupby(level=0)
 .rank(method='dense', ascending=False)
 .loc[lambda x: x==n]
 .rename_axis((None, f'{n}_max'))
 .reset_index(1)
 .groupby(level=0)[f'{n}_max'].agg(list)
 )

Output:

   c1   c2  c3   c4     2_max
0  10  100  10  100  [c1, c3]
1  11  110  11  110  [c1, c3]
2  12  120  12  120  [c1, c3]
mozway
  • 194,879
  • 13
  • 39
  • 75