Fill values in pandas column with condition involving 2 other columns

Question

I am trying to fill this 'C' column in such a way that when the value in 'A' is not NaN, 'C' takes value from 'B', else the value in 'C' remains unchanged.

Here's the code:

import pandas as pd
import numpy as np

df = pd.DataFrame({'A': ['greek', 'indian', np.nan, np.nan, 'australian'],
                   'B': np.random.random(5)})
df['C'] = np.nan

df

I tried df.C = df.B.where(df.A != np.nan, np.nan), but it isn't working, I think because the condition involves another column. A for loop isn't yielding the desired result either. How can I get there in as few lines of code as possible?

`~df["A"].isna()` can be replaced by `df["A"].notna()` @not_speshal — ThePyGuy, Aug 06 '21 at 18:02
@ThePyGuy, thanks, is it not possible to get there using np.where? — Raj Nair, Aug 06 '21 at 18:11
`nan` does not equal `nan` by definition. `print(np.nan != np.nan) # True` Which is why comparing to NaN does not work. [Why in numpy `nan == nan` is False while nan in nan is True?](https://stackoverflow.com/q/20320022/15497888) — Henry Ecker, Aug 06 '21 at 18:15

score 0 · Accepted Answer · answered Aug 06 '21 at 18:14

The problem is not with np.where; the problem is that you are comparing the values directly against np.nan using !=, and NaN never compares equal to anything, including itself:

>>> np.nan == np.nan
False
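
Applied to column 'A' from the question, this means the `!=` comparison comes out True for every row, including the NaN rows, so `where` never masks anything. A minimal sketch of the difference (the names `naive` and `proper` are illustrative and not from the question):

```python
import numpy as np
import pandas as pd

# Column 'A' from the question, as a standalone Series
A = pd.Series(['greek', 'indian', np.nan, np.nan, 'australian'])

naive = A != np.nan   # element-wise comparison against NaN
proper = A.notna()    # explicit missing-value check

print(naive.tolist())   # [True, True, True, True, True]
print(proper.tolist())  # [True, True, False, False, True]
```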

So, use a function/method that allows you to check if the value is nan or not:

>>> df.C = df.B.where(df.A.notna(), np.nan)
>>> df
            A         B         C
0       greek  0.030809  0.030809
1      indian  0.545261  0.545261
2         NaN  0.470802       NaN
3         NaN  0.716640       NaN
4  australian  0.148297  0.148297
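
Since the comments ask about np.where, and the question asks for 'C' to stay unchanged where 'A' is NaN, here is a hedged sketch of two alternatives. It assumes 'C' already holds values worth keeping; the fixed numbers in 'B' and the placeholder 9.0 in 'C' are made up for illustration and replace the random data from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': ['greek', 'indian', np.nan, np.nan, 'australian'],
    'B': [0.1, 0.2, 0.3, 0.4, 0.5],   # fixed values instead of random ones
    'C': [9.0, 9.0, 9.0, 9.0, 9.0],   # hypothetical pre-existing values
})

mask = df['A'].notna()

# np.where version: take B where A is present, the existing C otherwise
via_where = np.where(mask, df['B'], df['C'])

# Boolean-indexed assignment: only rows where A is present are overwritten
df.loc[mask, 'C'] = df['B']

print(via_where.tolist())  # [0.1, 0.2, 9.0, 9.0, 0.5]
print(df['C'].tolist())    # [0.1, 0.2, 9.0, 9.0, 0.5]
```

With the all-NaN 'C' from the question, both variants give the same result as the `where` call above; they only differ when 'C' has existing values.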
