5

I am looking for a way to get both the index and the column of the maximum element in a Pandas DataFrame. Thus far, this is my code:

idx = range(0, 50, 5)
col = range(0, 50, 5)
scores = pd.DataFrame(np.zeros((len(idx), len(col))), index=idx, columns=col, dtype=float)
scores.loc[11, 16] = 5 #Assign a random element

This gives me the following DataFrame:

  | 1   6   11  16  21  26  31  36  41  46
------------------------------------------
1 | 0   0   0   0   0   0   0   0   0   0
6 | 0   0   0   0   0   0   0   0   0   0
11| 0   0   0   5   0   0   0   0   0   0
16| 0   0   0   0   0   0   0   0   0   0
21| 0   0   0   0   0   0   0   0   0   0
26| 0   0   0   0   0   0   0   0   0   0
31| 0   0   0   0   0   0   0   0   0   0
36| 0   0   0   0   0   0   0   0   0   0
41| 0   0   0   0   0   0   0   0   0   0
46| 0   0   0   0   0   0   0   0   0   0

After that, I use the unstack method:

unstacked = scores.unstack().copy()
unstacked.sort(ascending=False)

This gives me:

16  11    5
46  46    0
16  31    0
11  31    0
    36    0
...

How can I get the index and column of the maximum value? I would like to get something along the lines of an array or tuple containing (16, 11).

JNevens
  • 11,202
  • 9
  • 46
  • 72
  • Possible duplicate - http://stackoverflow.com/questions/14941261/index-and-column-for-the-max-value-in-pandas-dataframe – fixxxer Apr 30 '15 at 17:00

2 Answers2

9

You are looking for idxmax :

In [1332]: x
Out[1332]: 
   1  6  11  16  21  26  31  36  41  46
0  0  0   0   0   0   0   0   0   0   0
1  0  0   0   0   0   0   0   0   0   0
2  0  0   5   0   0   0   0   0   0   0
3  0  0   0   0   0   0   0   0   0   0
4  0  0   0   0   0   0   0   0   0   0
5  0  0   0   0   0   0   0   0   0   0
6  0  0   0   0   0   0   0   0   0   0
7  0  0   0   0   0   0   0   0   0   0
8  0  0   0   0   0   0   0   0   0   0
9  0  0   0   0   0   0   0   0   0   0

Row of the max value:

In [1337]: max(x.idxmax())
Out[1337]: 2

Column of the max value (too many maxs):

In [1359]: x.max()[x.max() == x.max(axis=1).max()].index
Out[1359]: Index([u'11'], dtype='object')
fixxxer
  • 15,568
  • 15
  • 58
  • 76
2
x.max()[x.max() == x.max(axis=1).max()].index 

This works to get the column but max(x.idxmax()) only returns the numerical maximum of the indices themselves, not the index of the maximum value in the table (just got lucky in this example because everything else is 0's). An alternative is:

s = x.max()[x.max() == x.max(index=1).max()].index
s = str(s[0])
max_index = x.idxmax()[s]
Tunaki
  • 132,869
  • 46
  • 340
  • 423
mik
  • 110
  • 8
  • 1
    This is exactly what I was looking for but I'm a little surprised there isn't a more efficient answer... like a way to just have idxmax return a number in the column case. You can potentially end up storing millions of strings when you just need integers, I don't see why anyone want this to be the default behavior... @_@ – Joseph Garvin Feb 25 '17 at 23:23
  • yes i agree. I wrote this code in a time crunch and didn't look further into this issue, but i searched a lot and found nothing. Would be ideal for idxmax() to take care of this case as well. – mik Feb 27 '17 at 15:55