Pandas new column returning lookup of max from groupby of several columns

Question

I have a dataframe like the one below, and I'm trying to generate the RESULT column using a groupby on the Set, Subset and Subsubset columns. I tried returning idxmax on perc.

| Set | Subset | Subsubset | Class | perc | RESULT |
|-----|--------|-----------|-------|------|--------|
|   1 | A      |         1 | good  |  100 | good   |
|   1 | A      |           | ok    |    0 | good   |
|   1 | A      |           | poor  |    0 | good   |
|   1 | A      |           | bad   |    0 | good   |
|   1 | A      |         2 | good  |   20 | bad    |
|   1 | A      |           | ok    |   10 | bad    |
|   1 | A      |           | poor  |   20 | bad    |
|   1 | A      |           | bad   |   50 | bad    |
|   1 | A      |         3 | good  |    0 | poor   |
|   1 | A      |           | ok    |   10 | poor   |
|   1 | A      |           | poor  |   80 | poor   |
|   1 | A      |           | bad   |   10 | poor   |
|   1 | B      |         1 | good  |   50 | good   |
|   1 | B      |           | ok    |    0 | good   |
|   1 | B      |           | poor  |    1 | good   |
|   1 | B      |           | bad   |   49 | good   |
|   1 | B      |         2 | good  |   60 | good   |
|   1 | B      |           | ok    |   10 | good   |
|   1 | B      |           | poor  |   20 | good   |
|   1 | B      |           | bad   |   10 | good   |

To clarify, the result will always be a single value (there will never be a 50/50 split, for example).

Sets number in the hundreds and subsets run up to ZZ (it is a very long table).

This differs from the similar question "Python : Getting the Row which has the max value in groups using groupby" because here I am interested in grouping on MULTIPLE columns.

Possible duplicate of [Python : Getting the Row which has the max value in groups using groupby](https://stackoverflow.com/questions/15705630/python-getting-the-row-which-has-the-max-value-in-groups-using-groupby) — jose_bacoy, May 01 '19 at 14:28

score 2 · Accepted Answer · answered May 01 '19 at 14:23

Since you mentioned idxmax, we can use it here together with transform:

idx = df.groupby(['Set', 'Subset', 'Subsubset'])['perc'].transform('idxmax')

df['RESULT'] = df.loc[idx, 'Class'].values  # or: df.Class.reindex(idx).values
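For anyone who wants to try this end to end, here is a minimal sketch built on the rows from the question's table. It assumes the blank Subsubset cells simply repeat the value above them (the question does not say how they are stored; if they are NaN in your data, they would need filling first, since groupby drops NaN keys by default). The name `keys` is just a helper introduced for this example.

```python
import pandas as pd

# Rows copied from the question's table. Assumption: blank Subsubset cells
# carry the same value as the row above them.
df = pd.DataFrame({
    "Set": [1] * 20,
    "Subset": ["A"] * 12 + ["B"] * 8,
    "Subsubset": [1] * 4 + [2] * 4 + [3] * 4 + [1] * 4 + [2] * 4,
    "Class": ["good", "ok", "poor", "bad"] * 5,
    "perc": [100, 0, 0, 0,
             20, 10, 20, 50,
             0, 10, 80, 10,
             50, 0, 1, 49,
             60, 10, 20, 10],
})

keys = ["Set", "Subset", "Subsubset"]  # hypothetical helper name

# For every row, the index label of the row with the largest perc in its group.
idx = df.groupby(keys)["perc"].transform("idxmax")

# Look up Class at those labels. Converting to a plain array drops the
# (repeated) index so the values are assigned by position.
df["RESULT"] = df.loc[idx, "Class"].to_numpy()

print(df[keys + ["RESULT"]].drop_duplicates().to_string(index=False))
```

Passing a list of column names to groupby is all that is needed to group on several columns at once; the idxmax lookup itself is the same as in the single-column case.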

BENY