If statement between two or more columns in a dataframe

Question

I am trying to write a simple conditional: for each row, if a column is not NaN, use that column's value in a new column of the dataframe. Given this data:

ID1    ID2
Apple  nan
Orange nan
nan    Pear
nan    Grape

Ideally it would then look like so:

ID1    ID2    MasterID
Apple  nan    Apple
Orange nan    Orange
nan    Pear   Pear
nan    Grape  Grape

I've tried using the following:

df['MasterID'] = ''
df.loc[df['ID1'] != 'nan','MasterID'] = df['ID1']
df.loc[df['ID2'] != 'nan','MasterID'] = df['ID2']

But the last statement takes priority and overwrites what the previous line set. The same thing happens when I use np.where like this:

df['MasterID'] = np.where(df['ID1'] != 'nan',
                          df['ID1'],
                          df['ID2'])

I would also like an approach that I could extend to 3+ columns in the future. I appreciate any guidance.
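A likely reason both attempts misbehave, assuming the missing entries are real NaN values rather than the literal string 'nan': NaN never compares equal to a string, so `!= 'nan'` is True on every row. The sketch below rebuilds the sample frame (the variable name `always_true` is only for illustration) and tests for missing values with notna() instead.

```python
import numpy as np
import pandas as pd

# Sample data from the question, assuming the gaps are real NaN values
df = pd.DataFrame({
    "ID1": ["Apple", "Orange", np.nan, np.nan],
    "ID2": [np.nan, np.nan, "Pear", "Grape"],
})

# NaN is not equal to the string 'nan', so this mask is True on every row
always_true = df["ID1"] != "nan"

# notna() checks for actual missing values, so np.where can pick per row
df["MasterID"] = np.where(df["ID1"].notna(), df["ID1"], df["ID2"])
print(df)
```

With a mask that is always True, the second .loc assignment overwrites every row, which matches the behaviour described above.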

Possible duplicate of [Pandas - FillNa with another column](https://stackoverflow.com/questions/30357276/pandas-fillna-with-another-column) — nutic, May 02 '18 at 13:58

jezrael · Accepted Answer · 2018-05-02T13:55:33.933

3

First replace the NaNs with empty strings, then sum across each row (summing strings concatenates them):

df['MasterID'] = df.fillna('').sum(axis=1)

If you need to restrict this to specific columns, select them by name first:

df['MasterID'] = df[['ID1','ID2']].fillna('').sum(axis=1)

print (df)
      ID1    ID2 MasterID
0   Apple    NaN    Apple
1  Orange    NaN   Orange
2     NaN   Pear     Pear
3     NaN  Grape    Grape
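One thing worth knowing about this approach is that summing strings concatenates them, so it only gives a clean result when at most one ID column is filled per row. Here is a minimal sketch of that behaviour; the ID3 and Qty columns, the `id_cols` name, and the sample values are made up for illustration, and the astype(object) call is a defensive assumption so that the columns are summed as plain Python strings regardless of the pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: three ID columns plus an unrelated numeric column
df = pd.DataFrame({
    "ID1": ["Apple", np.nan, np.nan, "Plum"],
    "ID2": [np.nan, "Pear", np.nan, "Fig"],
    "ID3": [np.nan, np.nan, "Kiwi", np.nan],
    "Qty": [1, 2, 3, 4],
})
id_cols = ["ID1", "ID2", "ID3"]

# Restrict to the ID columns, blank out NaN, then add the strings row by row
df["MasterID"] = df[id_cols].astype(object).fillna("").sum(axis=1)
print(df)
```

The last row has values in both ID1 and ID2, so its MasterID comes out as the two strings joined together rather than a single ID.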

edited May 02 '18 at 13:55

answered May 02 '18 at 13:50

Ah, that is cleaner than mine. – James May 02 '18 at 13:51
Apologies, I did not include this in the original question, but if there are other columns in the dataframe, how do I restrict the sum to just those 2? – tender May 02 '18 at 13:54

score 2 · Answer 2 · answered May 02 '18 at 13:57

You can use bfill along the column axis and then take the first column:

df['New'] = df.bfill(axis=1).iloc[:, 0]
df
Out[165]: 
      ID1    ID2     New
0   Apple    NaN   Apple
1  Orange    NaN  Orange
2     NaN   Pear    Pear
3     NaN  Grape   Grape

Or, with just two columns, fill the gaps in ID1 from ID2:

df['New'] = df['ID1'].fillna(df['ID2'])
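Both ideas extend to 3+ columns, as the question asks. The following is a sketch under assumptions: the ID3 and Qty columns, the `id_cols` list, and the names `via_bfill` and `via_combine` are hypothetical. It shows bfill restricted to chosen columns, and a chained Series.combine_first as the multi-column counterpart of the fillna one-liner. In both cases the leftmost non-missing value wins.

```python
from functools import reduce

import numpy as np
import pandas as pd

# Hypothetical frame: three ID columns plus an unrelated numeric column
df = pd.DataFrame({
    "ID1": ["Apple", np.nan, np.nan, "Plum"],
    "ID2": [np.nan, "Pear", np.nan, "Fig"],
    "ID3": [np.nan, np.nan, "Kiwi", np.nan],
    "Qty": [1, 2, 3, 4],
})
id_cols = ["ID1", "ID2", "ID3"]

# Option 1: back-fill across the selected columns, then keep the first column
via_bfill = df[id_cols].bfill(axis=1).iloc[:, 0]

# Option 2: start from the first column and fill its gaps from each later one
via_combine = reduce(
    lambda acc, col: acc.combine_first(df[col]),
    id_cols[1:],
    df[id_cols[0]],
)

df["MasterID"] = via_bfill
print(df)
```

Unlike the string-sum approach, a row with several IDs filled keeps only the first one instead of concatenating them.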

answered May 02 '18 at 13:57

BENY
