
I want to replace the NaN values in column Title based on the values in column Idx. Where Idx is 1, NaN should be replaced by 0; where Idx is 0, NaN should be replaced by 1.

Title   Idx
NaN     0
0       1
1       0
NaN     0
NaN     1

I tried this:

df.loc[df['Title'].isnull(), 'Title'] = 0

But of course this fills every NaN with 0. How can I add the condition on Idx here?

Alex Riley
Klausos Klausos

1 Answer


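You can pass a Series (such as another column, or an expression built from one) to fillna(); each missing value is filled from the entry with the matching index label. In this case, fill the missing values with the Series 1 - df['Idx'] to get the result: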

>>> df
   Title  Idx
0    NaN    0
1      0    1
2      1    0
3    NaN    0
4    NaN    1

>>> df['Title'] = df['Title'].fillna(1 - df['Idx'])
>>> df
   Title  Idx
0      1    0
1      0    1
2      1    0
3      1    0
4      0    1
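
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the frame from the question, applies the same fillna call, and also shows a mask-based .loc variant, which is closer to what the question attempted. The names filled, alt and mask are only illustrative. Because Title contains NaN, pandas stores it as a float column, so the filled values come out as 1.0 and 0.0.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "Title": [np.nan, 0, 1, np.nan, np.nan],
    "Idx": [0, 1, 0, 0, 1],
})

# fillna aligns the replacement Series on the index, so only the
# rows where Title is NaN take their value from 1 - Idx.
filled = df["Title"].fillna(1 - df["Idx"])

# Equivalent mask-based variant, extending the .loc attempt from the question.
mask = df["Title"].isnull()
alt = df["Title"].copy()
alt.loc[mask] = 1 - df.loc[mask, "Idx"]

print(filled.tolist())
print(alt.tolist())
```

Both variants leave the non-missing Title values untouched; the fillna version is shorter, while the .loc version makes the condition explicit.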
Alex Riley
  • Does this approach work if the condition includes more than 1 column, i.e. Title and Films? – Klausos Klausos Oct 03 '15 at 09:53
  • Do you mean that you want to fill in `NaN` values in more than 1 column? If so, `fillna()` works on DataFrames too so it should work. Having said that, depending on the exact results you want, it may be simpler to process each column separately. – Alex Riley Oct 03 '15 at 09:58
  • I mean, e.g. Idx == 1, if Title == 0 AND Films == 0 – Klausos Klausos Oct 03 '15 at 10:45
  • That should be possible - you just need to construct that Series and pass it to `fillna`. You might be interested in `np.where` as a means to this. For example, `np.where((df.Title == 0) & (df.Films == 0), 1, 0)` could construct that Series. – Alex Riley Oct 03 '15 at 11:36
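
A rough sketch of the np.where idea from the last comment, under one possible reading of the follow-up question: missing Idx values become 1 where both Title and Films are 0, and 0 otherwise. The Films column and all the data below are made up for illustration. np.where returns a plain NumPy array, so the sketch wraps it in a Series that shares df's index before passing it to fillna.

```python
import numpy as np
import pandas as pd

# Hypothetical data: Films is an assumed extra column, not from the question.
df = pd.DataFrame({
    "Title": [0, 0, 1, 0],
    "Films": [0, 1, 0, 0],
    "Idx": [np.nan, np.nan, 1, 0],
})

# 1 where both conditions hold, 0 otherwise (a NumPy array).
values = np.where((df["Title"] == 0) & (df["Films"] == 0), 1, 0)

# Wrap the array in a Series with the same index so fillna can align it.
fill_values = pd.Series(values, index=df.index)

# Only the NaN entries of Idx are replaced; existing values are kept.
df["Idx"] = df["Idx"].fillna(fill_values)

print(df)
```

Note that the last row keeps its original Idx of 0 even though the condition is true there, since fillna only touches missing entries.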