Pandas: replace NaN with values from one of two columns

Question

Given the following dataframe df, where df['B']=df['M1']+df['M2']:

   A    M1   M2   B
   1    1    2    3
   1    2    NaN  NaN
   1    3    6    9
   1    4    8    12
   1    NaN  10   NaN
   1    6    12   18

I want each NaN in column B to be replaced by the corresponding value in M1 or M2, whichever of the two is not NaN:

   A    M1   M2   B
   1    1    2    3
   1    2    NaN  2
   1    3    6    9
   1    4    8    12
   1    NaN  10   10
   1    6    12   18

This answer suggested using:

df.loc[df['B'].isnull(), 'B'] = df['M1']

However, this line can only fill from a single column (either M1 or M2), not from both at the same time.

Ideas on how I should change it to consider both columns?

EDIT

Not a duplicate question! For ease of understanding, I claimed that df['B']=df['M1']+df['M2'], but in my real case, df['B'] is not a sum and comes from a rather complicated computation. So I cannot apply a simple formula to df['B']: all I can do is change the NaN values to match the corresponding value in either M1 or M2.
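For anyone reproducing the problem, the frame above can be built roughly as follows. This is only a sketch: the values are copied from the table, B is written out literally instead of being computed as a sum, and the name `missing` is introduced here for illustration.

```python
import numpy as np
import pandas as pd

# B is written out literally rather than computed as M1 + M2,
# to mirror the real case where B comes from a separate computation.
df = pd.DataFrame({
    "A": [1, 1, 1, 1, 1, 1],
    "M1": [1, 2, 3, 4, np.nan, 6],
    "M2": [2, np.nan, 6, 8, 10, 12],
    "B": [3, np.nan, 9, 12, np.nan, 18],
})

# Rows where B is missing and needs a value from M1 or M2.
missing = df[df["B"].isna()]
print(missing)
```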

You need `df['B']=df['M1'].add(df['M2'], fill_value=0)`, it seems it is dupe... — jezrael, Jan 09 '18 at 15:16
You can check [this](https://stackoverflow.com/q/11106823/2901002) — jezrael, Jan 09 '18 at 15:17
@FaCoffee it's a dup of a different question then, you want to use `pandas.combine_first` — Paul H, Jan 09 '18 at 15:25
Not at all! `pandas.combine_first` allows some rows to disappear, while I don't want this. — FaCoffee, Jan 09 '18 at 15:27
@FaCoffee fillna by using `df.B.fillna(df[['M2','M1']].max(1))` — BENY, Jan 09 '18 at 15:30
@FaCoffee - I think `Wen` think `df['B']= (df['M1']+ df['M2']).fillna(df[['M2','M1']].max(1))`, another solution is `df['B']= (df['M1']+ df['M2']).fillna(df[['M2','M1']].sum(1))` — jezrael, Jan 09 '18 at 15:39
@FaCoffee yeah , jez is right , you just need to assign it back `df.B=df.B.fillna(df[['M2','M1']].max(1))` — BENY, Jan 09 '18 at 15:39

score 7 · Accepted Answer · answered Jan 09 '18 at 15:45

Based on our discussion in the comments above:

# Fill NaN in B with the row-wise max of M1 and M2 (max skips NaN by default)
df['B'] = df['B'].fillna(df[['M1', 'M2']].max(axis=1))
df
Out[52]: 
   A   M1    M2     B
0  1  1.0   2.0   3.0
1  1  2.0   NaN   2.0
2  1  3.0   6.0   9.0
3  1  4.0   8.0  12.0
4  1  NaN  10.0  10.0
5  1  6.0  12.0  18.0
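A self-contained version of the same idea, as a minimal sketch: the frame is rebuilt from the table in the question, and `fallback` is a name introduced here for illustration. It relies on the row-wise `max` skipping NaN by default, so a row with exactly one value present yields that value.

```python
import numpy as np
import pandas as pd

# Frame rebuilt from the question's table; B already contains the NaNs to fill.
df = pd.DataFrame({
    "A": [1, 1, 1, 1, 1, 1],
    "M1": [1, 2, 3, 4, np.nan, 6],
    "M2": [2, np.nan, 6, 8, 10, 12],
    "B": [3, np.nan, 9, 12, np.nan, 18],
})

# Row-wise max ignores NaN, so with one value missing it returns the other.
fallback = df[["M1", "M2"]].max(axis=1)

# Only the NaN entries of B are touched; existing values are kept as they are.
df["B"] = df["B"].fillna(fallback)
print(df)
```

Because `fillna` only touches missing entries, this works even when B is not a sum of M1 and M2.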

From jezrael, if B can be recomputed as M1 + M2:

df['B'] = (df['M1'] + df['M2']).fillna(df[['M2', 'M1']].sum(axis=1))
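The `max` and `sum` fallbacks agree on the question's data but can differ on rows the question does not show. The sketch below uses a hypothetical frame `edge` (not from the question) and applies each fallback through `fillna` on B; the third variant, which prefers M1 and falls back to M2, is an extra option added here for comparison.

```python
import numpy as np
import pandas as pd

# Hypothetical edge cases, not taken from the question:
# row 0: only M1 present; row 1: both missing; row 2: both present.
edge = pd.DataFrame({
    "M1": [2.0, np.nan, 3.0],
    "M2": [np.nan, np.nan, 5.0],
    "B": [np.nan, np.nan, np.nan],
})

# Row-wise max: stays NaN when both are missing, picks the larger when both exist.
by_max = edge["B"].fillna(edge[["M1", "M2"]].max(axis=1))

# Row-wise sum: an all-NaN row sums to 0.0 by default, and two values are added.
by_sum = edge["B"].fillna(edge[["M1", "M2"]].sum(axis=1))

# Chained fillna: take M1 where available, otherwise M2.
by_m1_first = edge["B"].fillna(edge["M1"]).fillna(edge["M2"])

print(pd.DataFrame({"max": by_max, "sum": by_sum, "m1_first": by_m1_first}))
```

If a 0 for rows with both values missing is not wanted, passing `min_count=1` to `sum` keeps those rows as NaN.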

BENY


@jezrael no worry , I think op also help himself :-) – BENY Jan 09 '18 at 15:50
@Wen - ya, it is up to you, but I think the best is change from wiki to normal answer ;) Good luck! – jezrael Jan 09 '18 at 15:51
