I have a pandas DataFrame.

It looks like this:

   level_0  level_1      from        to
0        0        0  0.927273  0.300000
1        1        1  0.946667  0.727273
2        1        2  0.565657  0.200000
3        1        3  0.946667  0.083333
4        2        4  0.831818  1.000000
5        3        5  0.831818  0.818182
6        4        6  0.872727  0.666667
7        5        7  1.000000  0.700000
8        6        8  1.000000  1.000000
9        7        9  1.000000  0.666667

I want to output, for each level_0, the (level_0, level_1) pair with the highest combined from + to score. For most values of level_0 the choice is obvious because there is only one candidate, but for level_0 = 1 there are three possibilities. I want the algorithm to output (1, 1) because it has the highest combined from + to score of the three.

How do I achieve this?

Thanks in advance, and my apologies for the careless initial question.

Scott Boston

3 Answers

Do you want:

    # this runs on the original double-indexed dataframe
    df[['from','to']].sum(1).groupby(level=0).idxmax()

Output:

level_0
0    (0, 0)
1    (1, 1)
2    (2, 4)
3    (3, 5)
4    (4, 6)
5    (5, 7)
6    (6, 8)
7    (7, 9)
dtype: object
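
For readers who want to reproduce this, here is a minimal self-contained sketch of the same idea. It assumes the flat frame shown in the question is rebuilt by hand and then indexed by (level_0, level_1); the names `dfi`, `total`, and `best` are illustrative and not from the original post.

```python
import pandas as pd

# Rebuild the frame from the question (values copied from the printed table).
df = pd.DataFrame({
    "level_0": [0, 1, 1, 1, 2, 3, 4, 5, 6, 7],
    "level_1": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    "from": [0.927273, 0.946667, 0.565657, 0.946667, 0.831818,
             0.831818, 0.872727, 1.0, 1.0, 1.0],
    "to": [0.3, 0.727273, 0.2, 0.083333, 1.0,
           0.818182, 0.666667, 0.7, 1.0, 0.666667],
})

# Assumption: the "double-indexed" frame is the flat one indexed by both levels.
dfi = df.set_index(["level_0", "level_1"])

# Row-wise sum of the two score columns, then the index label of the
# largest sum within each level_0 group.
total = dfi[["from", "to"]].sum(axis=1)
best = total.groupby(level=0).idxmax()
print(best)
```

Because the index is a MultiIndex, each value returned by `idxmax` is a full (level_0, level_1) tuple, which is the pair the question asks for.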
Quang Hoang

You can use this:

df.set_index(['level_0','level_1'])\
  .assign(total_score = (df['from']+df['to']).to_numpy())['total_score']\
  .groupby(level=0).idxmax()

Output:

level_0
0    (0, 0)
1    (1, 1)
2    (2, 4)
3    (3, 5)
4    (4, 6)
5    (5, 7)
6    (6, 8)
7    (7, 9)
Name: total_score, dtype: object
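
If setting the index is not wanted, a variant that stays on the flat frame might look like the sketch below. It assumes the default integer row labels shown in the question; `total`, `best_rows`, and `pairs` are illustrative names only.

```python
import pandas as pd

# Rebuild the frame from the question (values copied from the printed table).
df = pd.DataFrame({
    "level_0": [0, 1, 1, 1, 2, 3, 4, 5, 6, 7],
    "level_1": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    "from": [0.927273, 0.946667, 0.565657, 0.946667, 0.831818,
             0.831818, 0.872727, 1.0, 1.0, 1.0],
    "to": [0.3, 0.727273, 0.2, 0.083333, 1.0,
           0.818182, 0.666667, 0.7, 1.0, 0.666667],
})

# Combined score per row, kept as a separate Series aligned with df.
total = df["from"] + df["to"]

# idxmax per level_0 group returns row labels of the flat frame,
# which are then used to pull out the two level columns.
best_rows = df.loc[total.groupby(df["level_0"]).idxmax(), ["level_0", "level_1"]]

# Plain tuples, one (level_0, level_1) pair per group.
pairs = list(best_rows.itertuples(index=False, name=None))
print(pairs)
```

This avoids the intermediate `total_score` column and the `to_numpy()` alignment step, at the cost of a second lookup with `df.loc`.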
Scott Boston

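One pandas approach is to compute the row-wise sum of the two columns and select the rows where that sum equals its maximum. Note that this finds the best row over the whole frame, not the best row for each level_0.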

I would use:

score = df['to'] + df['from']
print(df[score == score.max()])

With the current example, it gives:

   level_0  level_1      from        to
8        6        8  1.000000  1.000000

If the dataframe were multi-indexed, for example dfi = df.set_index(['level_0', 'level_1']), the approach would be exactly the same:

scorei = dfi['from'] + dfi['to']
print(dfi[scorei == scorei.max()])

which gives:

                 from   to
level_0 level_1           
6       8         1.0  1.0
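
The same "equal to the maximum" comparison can be applied per group rather than over the whole frame. The sketch below is one possible way to do that, assuming the flat frame from the question; it uses `groupby(...).transform("max")`, and `score`, `group_max`, and `winners` are illustrative names.

```python
import pandas as pd

# Rebuild the frame from the question (values copied from the printed table).
df = pd.DataFrame({
    "level_0": [0, 1, 1, 1, 2, 3, 4, 5, 6, 7],
    "level_1": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    "from": [0.927273, 0.946667, 0.565657, 0.946667, 0.831818,
             0.831818, 0.872727, 1.0, 1.0, 1.0],
    "to": [0.3, 0.727273, 0.2, 0.083333, 1.0,
           0.818182, 0.666667, 0.7, 1.0, 0.666667],
})

score = df["from"] + df["to"]

# Broadcast each group's maximum score back onto every row of that group.
group_max = score.groupby(df["level_0"]).transform("max")

# Keep the rows whose score equals the maximum of their own level_0 group.
winners = df[score == group_max]
print(winners[["level_0", "level_1"]])
```

Unlike `idxmax`, which returns a single label per group, this comparison keeps every row that ties for the group maximum.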
Serge Ballesta