Retrieving the value of the column with the maximum values

Question

I have a pandas DataFrame.

It looks like this:

   level_0  level_1      from        to
0        0        0  0.927273  0.300000
1        1        1  0.946667  0.727273
2        1        2  0.565657  0.200000
3        1        3  0.946667  0.083333
4        2        4  0.831818  1.000000
5        3        5  0.831818  0.818182
6        4        6  0.872727  0.666667
7        5        7  1.000000  0.700000
8        6        8  1.000000  1.000000
9        7        9  1.000000  0.666667

For each level_0, I want to output the (level_0, level_1) pair that has the highest combined from + to score. This is obvious for most of them, but for level_0 = 1 there are three candidates. I want the algorithm to output (1, 1) because that pair has the highest combined from + to score.

How do I achieve this?

Thanks in advance, and apologies for the careless initial version of the question.

Please [do not](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) paste the text of a multi-indexed dataframe. Instead, use `reset_index()` and specify which columns form the index. (Quang Hoang, Jun 12 '19 at 13:44)

score 0 · Accepted Answer · edited Jun 12 '19 at 14:22

Is this what you want?

    # Runs on the original dataframe, indexed by (level_0, level_1):
    # sum from + to per row, then take the index of the max within each level_0
    df[['from', 'to']].sum(axis=1).groupby(level=0).idxmax()

Output:

level_0
0    (0, 0)
1    (1, 1)
2    (2, 4)
3    (3, 5)
4    (4, 6)
5    (5, 7)
6    (6, 8)
7    (7, 9)
dtype: object
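
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the frame from the table in the question and applies the same sum, groupby, idxmax chain. It assumes the frame is indexed by (level_0, level_1), as the code comment above says. The names `indexed`, `total`, `best`, `best_pairs` and `best_rows` are illustrative and not from the original post.

```python
import pandas as pd

# Data copied from the table in the question (flat layout)
df = pd.DataFrame({
    "level_0": [0, 1, 1, 1, 2, 3, 4, 5, 6, 7],
    "level_1": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    "from": [0.927273, 0.946667, 0.565657, 0.946667, 0.831818,
             0.831818, 0.872727, 1.0, 1.0, 1.0],
    "to": [0.3, 0.727273, 0.2, 0.083333, 1.0,
           0.818182, 0.666667, 0.7, 1.0, 0.666667],
})

# Assumption: the original frame was indexed by (level_0, level_1)
indexed = df.set_index(["level_0", "level_1"])

# Combined score per row, then the index label of the max within each level_0
total = indexed[["from", "to"]].sum(axis=1)
best = total.groupby(level=0).idxmax()

# The labels are (level_0, level_1) tuples; they can also select the full rows
best_pairs = best.tolist()
best_rows = indexed.loc[best_pairs]
print(best_rows)
```

Passing the list of tuples to `.loc` returns the winning rows with their from and to values, which may be handy if the scores are needed as well as the pairs.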

answered Jun 12 '19 at 14:14 by Quang Hoang · edited Jun 12 '19 at 14:22 by jtremoureux

score 0 · Answer 2 · answered Jun 12 '19 at 14:26

You can use this:

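    # df is the flat frame; .to_numpy() drops the old row labels so the
    # scores are assigned by position instead of being aligned on the new MultiIndex
    (df.set_index(['level_0', 'level_1'])
       .assign(total_score=(df['from'] + df['to']).to_numpy())['total_score']
       .groupby(level=0)
       .idxmax())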

Output:

level_0
0    (0, 0)
1    (1, 1)
2    (2, 4)
3    (3, 5)
4    (4, 6)
5    (5, 7)
6    (6, 8)
7    (7, 9)
Name: total_score, dtype: object
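
If the frame is still flat (level_0 and level_1 as ordinary columns), the `set_index` step may not be needed. The following is a sketch of an alternative, not this answer's own code: it groups the summed score by the level_0 column, uses `idxmax` to get the winning row labels, and looks up the two key columns. It assumes the row labels are unique, as with the default index shown in the question. The names `total`, `winners` and `pairs` are illustrative.

```python
import pandas as pd

# Data copied from the table in the question (flat layout)
df = pd.DataFrame({
    "level_0": [0, 1, 1, 1, 2, 3, 4, 5, 6, 7],
    "level_1": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    "from": [0.927273, 0.946667, 0.565657, 0.946667, 0.831818,
             0.831818, 0.872727, 1.0, 1.0, 1.0],
    "to": [0.3, 0.727273, 0.2, 0.083333, 1.0,
           0.818182, 0.666667, 0.7, 1.0, 0.666667],
})

# Combined score per row, keeping the original row labels
total = df["from"] + df["to"]

# Row label of the highest score within each level_0 group
winning_labels = total.groupby(df["level_0"]).idxmax()

# Look up the key columns for those rows and turn them into plain tuples
winners = df.loc[winning_labels, ["level_0", "level_1"]]
pairs = list(winners.itertuples(index=False, name=None))
print(pairs)
```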

score 0 · Answer 3 · answered Jun 12 '19 at 14:29

The pandas way is to compute the sum of the two columns and select the rows where that sum equals its maximum value.

I would use:

    score = df['to'] + df['from']
    print(df[score == score.max()])

With the current example, it returns the single row with the highest overall score:

   level_0  level_1      from        to
8        6        8  1.000000  1.000000

If the dataframe were multi-indexed, e.g. dfi = df.set_index(['level_0', 'level_1']), the approach is exactly the same:

    scorei = dfi['from'] + dfi['to']
    print(dfi[scorei == scorei.max()])

which gives:

                 from   to
level_0 level_1           
6       8         1.0  1.0
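
The mask above compares against the overall maximum, so it yields one row for the whole frame. If one row per level_0 is wanted, as the question describes, the same masking idea can probably be applied per group with `transform('max')`. This is only a sketch under that reading of the question; `group_max` and `per_group_best` are illustrative names.

```python
import pandas as pd

# Data copied from the table in the question (flat layout)
df = pd.DataFrame({
    "level_0": [0, 1, 1, 1, 2, 3, 4, 5, 6, 7],
    "level_1": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    "from": [0.927273, 0.946667, 0.565657, 0.946667, 0.831818,
             0.831818, 0.872727, 1.0, 1.0, 1.0],
    "to": [0.3, 0.727273, 0.2, 0.083333, 1.0,
           0.818182, 0.666667, 0.7, 1.0, 0.666667],
})

score = df["from"] + df["to"]

# Broadcast each group's maximum score back onto every row of that group
group_max = score.groupby(df["level_0"]).transform("max")

# Keep the rows whose score equals the maximum of their own level_0 group
per_group_best = df[score == group_max]
print(per_group_best[["level_0", "level_1"]])
```

Unlike `idxmax`, which returns only the first label when scores tie, this mask keeps every tied row within a group.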
