Creating a new column with value dependent on other columns' values

Question

Assume I have a dataframe like the one below:

import pandas as pd
import numpy as np
d = {'Column 1': [10, 12, 13, 43, np.nan],
     'Column2': [np.nan, 7, np.nan, 49, 8]}
df = pd.DataFrame(d)

I would like to create a third column that takes its values from Column2, falling back to Column 1 wherever Column2 is NaN.

I have found multiple topics/solutions where the condition was dependent on values in one column but could not find one where it had to provide data from more than one column.

Not sure which "multiple topics/solutions" you found, but this is a duplicate of https://stackoverflow.com/questions/19913659/pandas-conditional-creation-of-a-series-dataframe-column. – , Jan 26 '22 at 15:18

score 0 · Accepted Answer · answered Jan 26 '22 at 15:13

You could use mask:

df['Column3'] = df['Column2'].mask(df['Column2'].isna(), df['Column 1'])
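For comparison, here is a minimal sketch of the same fallback written three ways, using the question's frame. `mask` replaces values where the condition is True, `where` keeps values where the condition is True, and `np.where` does the selection in NumPy and returns a plain array. The variable names (`via_mask`, `via_where`, `via_np`) are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Column 1': [10, 12, 13, 43, np.nan],
                   'Column2': [np.nan, 7, np.nan, 49, 8]})

# mask: replace Column2 values where the condition (is NaN) is True
via_mask = df['Column2'].mask(df['Column2'].isna(), df['Column 1'])

# where: keep Column2 values where the condition (not NaN) is True
via_where = df['Column2'].where(df['Column2'].notna(), df['Column 1'])

# np.where: same selection, but the result is a NumPy array, not a Series
via_np = np.where(df['Column2'].isna(), df['Column 1'], df['Column2'])

print(via_mask.tolist())  # [10.0, 7.0, 13.0, 49.0, 8.0]
```

All three give the same values here; the pandas methods keep the index, which matters if the result is assigned back to a frame with a non-default index.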

A more generic version (uses any number of columns) would be to take the last valid value per row:

df['Column3'] = df.ffill(axis=1).iloc[:, -1]

output:

   Column 1  Column2  Column3
0      10.0      NaN     10.0
1      12.0      7.0      7.0
2      13.0      NaN     13.0
3      43.0     49.0     49.0
4       NaN      8.0      8.0
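To illustrate the row-wise forward-fill idea with more than two columns, here is a small sketch on a hypothetical frame (`wide`, with made-up columns `a`, `b`, `c`). Filling along `axis=1` carries each row's last non-NaN value to the right, so the final column holds the last valid value per row. The mirror image, `bfill` plus the first column, gives the first valid value instead.

```python
import numpy as np
import pandas as pd

# Hypothetical three-column frame, not from the question
wide = pd.DataFrame({'a': [1.0, 2.0, np.nan],
                     'b': [np.nan, 5.0, np.nan],
                     'c': [np.nan, 6.0, 9.0]})

# Forward-fill across each row, then read the last column:
# rightmost non-NaN value per row
last_valid = wide.ffill(axis=1).iloc[:, -1]

# Backward-fill across each row, then read the first column:
# leftmost non-NaN value per row
first_valid = wide.bfill(axis=1).iloc[:, 0]

print(last_valid.tolist())   # [1.0, 6.0, 9.0]
print(first_valid.tolist())  # [1.0, 2.0, 9.0]
```

Note that this uses every column in the frame. If the result column already exists (for example when re-running a cell), it is safer to select the source columns explicitly first, e.g. `wide[['a', 'b', 'c']].ffill(axis=1)`.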

mozway


I think checking for NaN is not necessary – ansev Jan 26 '22 at 15:45

@ansev there are many ways, I mostly provided an answer for the second option that I found more interesting (but which unfortunately did not seem to be used) – mozway Jan 26 '22 at 15:49

score 0 · Answer 2 · answered Jan 26 '22 at 15:40

You only need:

df['Column3'] = df['Column2'].fillna(df['Column 1'])

Or:

df['Column3'] = df['Column2'].combine_first(df['Column 1'])
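As a quick check, the sketch below runs both calls on the question's frame, using its column names as defined there ('Column 1' with a space, 'Column2' without). It also shows the one place the two methods differ: when the two Series do not share an index, `fillna` keeps the caller's index while `combine_first` returns the union of both. The `primary` and `fallback` Series are hypothetical and only serve to show that difference.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Column 1': [10, 12, 13, 43, np.nan],
                   'Column2': [np.nan, 7, np.nan, 49, 8]})

# Columns of one frame share an index, so both calls agree
filled = df['Column2'].fillna(df['Column 1'])
combined = df['Column2'].combine_first(df['Column 1'])
print(filled.tolist())    # [10.0, 7.0, 13.0, 49.0, 8.0]
print(combined.tolist())  # [10.0, 7.0, 13.0, 49.0, 8.0]

# Hypothetical Series with different indexes
primary = pd.Series([np.nan, 2.0], index=[0, 1])
fallback = pd.Series([10.0, 20.0, 30.0], index=[0, 1, 2])

print(primary.fillna(fallback).index.tolist())         # [0, 1]
print(primary.combine_first(fallback).index.tolist())  # [0, 1, 2]
```

For two columns of the same dataframe the choice is mostly a matter of taste.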

ansev
