How do I get the first non-'Other' value across multiple columns?

Question

Here's my dataset, call it df:

Id  Name   Math    Physics   Biology   Chemistry
1   Andy   A       B         A         B
2   Bert   Other   C         Other     A
3   Candy  Other   Other     A         B
4   Dony   B       A         C         B

The expected result skips 'Other': the first remaining value in each row goes into a new column called 'Grade':

Id  Name   Math    Physics   Biology   Chemistry  Grade
1   Andy   A       B         A         B          A   
2   Bert   Other   C         Other     A          C
3   Candy  Other   Other     A         B          A
4   Dony   B       A         C         B          B

What have you tried so far that doesn't work? Can you list at least one of your approaches? – meW, Jan 08 '19 at 11:16
If you replace 'Other' with np.nan, then `df['Grade']=df.iloc[:,2:].bfill(axis=1).iloc[:,0]` – anky, Jan 08 '19 at 11:24

score 2 · Accepted Answer · answered Jan 08 '19 at 11:39

`mask` + `bfill`

You can mask by a Boolean dataframe, then backfill and take the first column:

df['Grade'] = df.iloc[:, 2:].mask(df.iloc[:, 2:].eq('Other')).bfill(axis=1).iloc[:, 0]
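
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the frame from the question and applies the same mask-then-backfill idea. The intermediate names `grades` and `masked` are only illustrative, and `axis=1` is passed by keyword because recent pandas releases no longer accept it positionally.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "Id": [1, 2, 3, 4],
    "Name": ["Andy", "Bert", "Candy", "Dony"],
    "Math": ["A", "Other", "Other", "B"],
    "Physics": ["B", "C", "Other", "A"],
    "Biology": ["A", "Other", "A", "C"],
    "Chemistry": ["B", "A", "B", "B"],
})

grades = df.iloc[:, 2:]                   # subject columns only
masked = grades.mask(grades.eq("Other"))  # 'Other' becomes NaN
# Backfill along each row so the first column holds the first real grade.
df["Grade"] = masked.bfill(axis=1).iloc[:, 0]

print(df["Grade"].tolist())  # ['A', 'C', 'A', 'B']
```

A row made up entirely of 'Other' would end up with NaN in Grade, since there is nothing to backfill from.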

jpp

Nice one with mask and eq; I would have used that had I known about it. :) +1 from me – anky Jan 08 '19 at 11:41

yatu · Answer 2 · 2019-01-08T11:31:44.590

1

Here's a solution using justify, a helper function that is not part of pandas or NumPy and has to be defined separately:

df['Grade'] = justify(df.iloc[:,2:].values, invalid_val='Other')[:,0]

    Id   Name   Math Physics Biology Chemistry Grade
0   1   Andy      A       B       A         B     A
1   2   Bert  Other       C   Other         A     C
2   3  Candy  Other   Other       A         B     A
3   4   Dony      B       A       C         B     B
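
The answer does not show the helper itself, so here is a rough sketch of what a left-justifying function could look like. The name `justify_left` and its body are my own assumptions for illustration, not the `justify` the answer refers to; it only handles the row-wise, left-aligned case needed here.

```python
import numpy as np
import pandas as pd

def justify_left(a, invalid_val):
    """Hypothetical helper: push the valid entries of each row to the left
    and pad the remaining cells with invalid_val."""
    mask = a != invalid_val
    # Sorting booleans puts False first; flipping moves the True cells left.
    justified_mask = np.flip(np.sort(mask, axis=1), axis=1)
    out = np.full(a.shape, invalid_val, dtype=object)
    # Each row has the same number of True cells in both masks,
    # so values are copied row by row in their original order.
    out[justified_mask] = a[mask]
    return out

df = pd.DataFrame({
    "Id": [1, 2, 3, 4],
    "Name": ["Andy", "Bert", "Candy", "Dony"],
    "Math": ["A", "Other", "Other", "B"],
    "Physics": ["B", "C", "Other", "A"],
    "Biology": ["A", "Other", "A", "C"],
    "Chemistry": ["B", "A", "B", "B"],
})

values = df.iloc[:, 2:].to_numpy(dtype=object)
df["Grade"] = justify_left(values, invalid_val="Other")[:, 0]
```

After justification the first column of each row holds the first value that is not 'Other', which is why taking `[:, 0]` is enough.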

edited Jan 08 '19 at 11:31

answered Jan 08 '19 at 11:18

yatu

Dani Mesejo · Answer 3 · 2019-01-08T11:35:58.237

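Use idxmax + lookup (note that DataFrame.lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so this needs an older pandas):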

df['Grade'] = df.lookup(df.index, (df.iloc[:, 2:] != 'Other').idxmax(axis=1))
print(df)

Output

   Id   Name   Math Physics Biology Chemistry Grade
0   1   Andy      A       B       A         B     A
1   2   Bert  Other       C   Other         A     C
2   3  Candy  Other   Other       A         B     A
3   4   Dony      B       A       C         B     B

idxmax returns, for each row, the label of the first column whose value is not 'Other'. lookup then fetches the value at each (row label, column label) pair.
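
On pandas versions where DataFrame.lookup is not available, the same idea can be expressed with plain NumPy indexing. This is a sketch of one possible substitute rather than the answer's original code; `grades`, `first_valid` and `col_pos` are illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Id": [1, 2, 3, 4],
    "Name": ["Andy", "Bert", "Candy", "Dony"],
    "Math": ["A", "Other", "Other", "B"],
    "Physics": ["B", "C", "Other", "A"],
    "Biology": ["A", "Other", "A", "C"],
    "Chemistry": ["B", "A", "B", "B"],
})

grades = df.iloc[:, 2:]
# Label of the first column per row that is not 'Other'.
first_valid = (grades != "Other").idxmax(axis=1)
# Turn column labels into integer positions within `grades`.
col_pos = grades.columns.get_indexer(first_valid)
# Pick one cell per row: (row 0, col_pos[0]), (row 1, col_pos[1]), ...
df["Grade"] = grades.to_numpy()[np.arange(len(df)), col_pos]

print(first_valid.tolist())  # ['Math', 'Physics', 'Biology', 'Math']
```

One caveat: for a row that contains only 'Other', idxmax falls back to the first column, so Grade would be 'Other' for that row.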

score 1 · Answer 4 · answered Jan 08 '19 at 11:28

Replace 'Other' with np.nan:

df.replace('Other', np.nan, inplace=True)

Then backfill along the rows and take the first subject column:

df['Grade'] = df.iloc[:, 2:].bfill(axis=1).iloc[:, 0]

Finally, restore 'Other' in place of np.nan:

df.replace(np.nan, 'Other', inplace=True)
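
If modifying df in place and restoring it afterwards feels fragile, the replacement can be done on a temporary copy of the subject columns instead. A minimal sketch, assuming the same frame as in the question (the name `grades` is illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Id": [1, 2, 3, 4],
    "Name": ["Andy", "Bert", "Candy", "Dony"],
    "Math": ["A", "Other", "Other", "B"],
    "Physics": ["B", "C", "Other", "A"],
    "Biology": ["A", "Other", "A", "C"],
    "Chemistry": ["B", "A", "B", "B"],
})

# replace() returns a new frame here, so df itself keeps its 'Other' cells.
grades = df.iloc[:, 2:].replace("Other", np.nan)
df["Grade"] = grades.bfill(axis=1).iloc[:, 0]

print(df["Grade"].tolist())  # ['A', 'C', 'A', 'B']
```

Because the original columns are never touched, no restore step is needed.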

anky
