
I am trying to write a simple statement that creates a new column in the dataframe and, for each row, fills it with the value from whichever column is not 'nan'.

ID1    ID2
Apple  nan
Orange nan
nan    Pear
nan    Grape

Ideally it would then look like so:

ID1    ID2    MasterID
Apple  nan    Apple
Orange nan    Orange
nan    Pear   Pear
nan    Grape  Grape

I've tried using the following:

df['MasterID'] = ''
df.loc[df['ID1'] != 'nan','MasterID'] = df['ID1']
df.loc[df['ID2'] != 'nan','MasterID'] = df['ID2']

But the last statement takes priority and overwrites what the second line set. The same thing happens when I use NumPy's where like this:

df['MasterID'] = np.where(df['ID1'] != 'nan',
                          df['ID1'],
                          df['ID2'])

I would also like an approach that I could reuse in the future with 3+ columns. I appreciate any guidance.
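A likely cause, assuming the missing entries are real NaN values rather than the literal string 'nan' (the NaN shown in the answers' output suggests this): a real NaN is never equal to a string, so `!= 'nan'` is True on every row and each assignment touches the whole column. The sketch below rebuilds the sample data under that assumption and uses `notna()` for the test instead.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; real NaN for missing entries is an assumption.
df = pd.DataFrame({
    "ID1": ["Apple", "Orange", np.nan, np.nan],
    "ID2": [np.nan, np.nan, "Pear", "Grape"],
})

# Comparing with the string 'nan' is True for every row,
# because a real NaN is not equal to any string.
always_true = df["ID1"] != "nan"

# notna() tests for actual missing values.
has_id1 = df["ID1"].notna()

# Take ID1 where it is present, otherwise fall back to ID2.
df["MasterID"] = np.where(has_id1, df["ID1"], df["ID2"])
print(df)
```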

tender
  • Possible duplicate of [Pandas - FillNa with another column](https://stackoverflow.com/questions/30357276/pandas-fillna-with-another-column) – nutic May 02 '18 at 13:58

2 Answers


First replace the NaNs with empty strings, then sum across each row (for strings, the sum is a concatenation):

df['MasterID'] = df.fillna('').sum(axis=1)

If you need to specify the column names:

df['MasterID'] = df[['ID1','ID2']].fillna('').sum(axis=1)

print (df)
      ID1    ID2 MasterID
0   Apple    NaN    Apple
1  Orange    NaN   Orange
2     NaN   Pear     Pear
3     NaN  Grape    Grape
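One caveat worth checking before relying on the sum: because it concatenates strings, a row with more than one populated column yields the joined text rather than a single ID. A minimal sketch, assuming a hypothetical third column `ID3` and object-dtype columns (neither is from the question), that counts populated cells per row first:

```python
import numpy as np
import pandas as pd

# Hypothetical data: ID3 is an illustrative column, not part of the question.
df = pd.DataFrame({
    "ID1": ["Apple", np.nan, np.nan],
    "ID2": [np.nan, "Pear", np.nan],
    "ID3": ["Kiwi", np.nan, "Plum"],
}, dtype=object)
cols = ["ID1", "ID2", "ID3"]

# Number of non-missing values in each row; anything above 1 will be concatenated.
populated = df[cols].notna().sum(axis=1)

# The fillna/sum idiom joins every populated cell in the row.
concatenated = df[cols].fillna("").sum(axis=1)

print(populated.tolist())
print(concatenated.tolist())
```

If every row has at most one populated column, as in the question's data, the idiom returns exactly that value.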
jezrael

You can use bfill along the columns and take the first column of the result:

df['New'] = df.bfill(axis=1).iloc[:, 0]
df
Out[165]: 
      ID1    ID2     New
0   Apple    NaN   Apple
1  Orange    NaN  Orange
2     NaN   Pear    Pear
3     NaN  Grape   Grape

Or df['New']=df.ID1.fillna(df.ID2)
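The two-column fillna extends to 3+ columns by chaining it in priority order. A minimal sketch, assuming a hypothetical third column `ID3` and a `cols` list that names the columns from highest to lowest priority (both are illustrative, not from the question):

```python
from functools import reduce

import numpy as np
import pandas as pd

# Hypothetical data: ID3 is an illustrative column, not part of the question.
df = pd.DataFrame({
    "ID1": ["Apple", np.nan, np.nan],
    "ID2": [np.nan, "Pear", np.nan],
    "ID3": ["Kiwi", np.nan, "Plum"],
})
cols = ["ID1", "ID2", "ID3"]  # highest priority first

# Start from the first column and fill its gaps from each later column in turn,
# so the leftmost non-missing value in each row wins.
df["MasterID"] = reduce(
    lambda acc, col: acc.fillna(df[col]),
    cols[1:],
    df[cols[0]],
)
print(df)
```

Unlike the string sum, this keeps only one value per row when several columns are populated.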

BENY