I have null (nan) values in column A and would like to assign 0 to the cells in column B when a cell of the same row in column A is null.

Column B was created with the following lambda expression:

df['col_B'] = df.apply(lambda x: x.col_A in x.col_C, axis=1)

I tried to modify it, but it doesn't work, and from what I have read this approach isn't advised.

So I tried a classic loop. It raises no error, but it doesn't modify the cells in column B:

for index, row in df.iterrows():
    if row['col_A'] is None:
        df.at[index, 'col_B'] = 0

My null values appear as "nan" (not "None" or "Nan") so I'm not even sure Python considers them as real null values.

What would you advise?

  • Possible duplicate of [How set values in pandas dataframe based on NaN values of another column?](https://stackoverflow.com/questions/37962759/how-set-values-in-pandas-dataframe-based-on-nan-values-of-another-column) – Georgy Feb 04 '19 at 15:53

1 Answer

You should avoid pd.Series.apply (and row-wise pd.DataFrame.apply, as in your question) wherever possible. That said, for the conditional assignment there are a few alternatives that use a Boolean series.

You can use loc:

df.loc[df['col_A'].isnull(), 'col_B'] = 0

Or mask:

df['col_B'] = df['col_B'].mask(df['col_A'].isnull(), 0)

Or np.where:

df['col_B'] = np.where(df['col_A'].isnull(), 0, df['col_B'])
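As a rough check that the three alternatives above agree, here is a minimal sketch on made-up data. The sample values are an assumption, and col_B is given integer values here (rather than the Booleans produced by the lambda in the question) so that writing 0 into it does not change the column's dtype.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "col_A": ["x", np.nan, "z", np.nan],
    "col_B": [1, 1, 0, 1],
})

# Boolean Series: True on the rows where col_A is NaN.
null_in_a = df["col_A"].isnull()

# 1. loc assigns in place (applied to a copy so df stays unchanged).
via_loc = df.copy()
via_loc.loc[null_in_a, "col_B"] = 0

# 2. mask returns a new Series, replacing values where the condition is True.
via_mask = df["col_B"].mask(null_in_a, 0)

# 3. np.where picks 0 or the existing value element by element (NumPy array).
via_where = np.where(null_in_a, 0, df["col_B"])

print(via_loc["col_B"].tolist())
print(via_mask.tolist())
print(via_where.tolist())
```

loc modifies the frame directly, while mask and np.where build a new column that you assign back, so the choice is mostly a matter of style.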

If your nulls are strings, make sure you replace them first; for example:

df['col_A'] = df['col_A'].replace('Nan', np.nan)
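To tell real nulls from text that only looks like one, a small diagnostic along these lines may help. The Series below is invented for illustration, and it assumes the text form is "nan" as it is displayed in the question. It also suggests why the loop in the question changes nothing: NaN is a float, so an `is None` test does not match it.

```python
import numpy as np
import pandas as pd

# NaN is a float value, not None, so an identity check against None fails.
print(np.nan is None)    # False
print(pd.isna(np.nan))   # True

# Hypothetical column mixing a real NaN with the text "nan".
s = pd.Series(["a", np.nan, "nan"])
print(s.isnull().tolist())        # [False, True, False]

# Turn the text into a real missing value, then test again.
cleaned = s.replace("nan", np.nan)
print(cleaned.isnull().tolist())  # [False, True, True]
```

If isnull() already flags the rows you expect, the values are real NaN and no replacement is needed.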
jpp