
In the DataFrame named titanic, how can I fill NA values in the Cabin column with the value 'B', but only in rows where Pclass==3?

Should I use where? Something like:

titanic['Cabin']=titanic.where(titanic.Pclass==3).fillna('B')

Other methods are also appreciated.

@jezreal: reference

Lore
Bharath

1 Answer


Sample:

import numpy as np
import pandas as pd

titanic = pd.DataFrame({'Pclass':[1,3,3] * 2,
                         'Cabin':[np.nan] * 2 + ['s','d','f'] + [np.nan]})

You can select the rows by condition and fill missing values only in those rows:

m = titanic.Pclass==3

titanic.loc[m, 'Cabin'] = titanic.loc[m, 'Cabin'].fillna('B')

Or you can chain both conditions with & (bitwise AND) and assign 'B' to the matching rows:

titanic.loc[(titanic.Pclass==3) & (titanic.Cabin.isna()), 'Cabin'] = 'B'  

Or use Series.where. It keeps values where the mask is True and replaces the rest, so both conditions are inverted: !=3, Series.notna, and | (bitwise OR):

titanic['Cabin'] = titanic['Cabin'].where((titanic.Pclass!=3) | (titanic.Cabin.notna()), 'B')
print(titanic)
   Pclass Cabin
0       1   NaN
1       3     B
2       3     s
3       1     d
4       3     f
5       3     B
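
For comparison, here is a minimal sketch using Series.mask, which replaces values where the condition is True, so the Pclass==3 condition can be written without inverting it. It reuses the sample data above; the names m, result and expected are only for illustration.

```python
import numpy as np
import pandas as pd

titanic = pd.DataFrame({'Pclass': [1, 3, 3] * 2,
                        'Cabin': [np.nan] * 2 + ['s', 'd', 'f'] + [np.nan]})

# Rows that should receive 'B': third class AND missing cabin.
m = (titanic.Pclass == 3) & titanic.Cabin.isna()

# mask replaces values where the condition is True ...
result = titanic['Cabin'].mask(m, 'B')

# ... while where replaces values where the condition is False,
# which is why the where version needs the inverted mask.
expected = titanic['Cabin'].where(~m, 'B')

print(result.equals(expected))
```

By De Morgan's law, ~m is the same as (Pclass != 3) | Cabin.notna(), which is the condition used in the Series.where solution.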
jezrael
  • It doesn't work. It fills 'B' for Pclass!=3 and also modifies the Pclass value (image enclosed in the main question). – Bharath Sep 19 '19 at 07:38
  • @Bharath - hmmm, I added sample data and it works perfectly for me. – jezrael Sep 19 '19 at 07:39
  • It works. Thank you. By the way, could you please do it using 'where', as I mentioned in the question? – Bharath Sep 19 '19 at 07:52
  • @Bharath - added to answer. – jezrael Sep 19 '19 at 07:55
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/199674/discussion-between-bharath-and-jezrael). – Bharath Sep 19 '19 at 07:58
  • It's working perfectly. But why is titanic.Pclass!=3 used when I want Pclass==3? – Bharath Sep 19 '19 at 08:03
  • @Bharath - Because where replaces values where the mask does not match, that is, where it is `False`. So both conditions need to be inverted, because the replacement happens for the `False`s. Check [this](https://stackoverflow.com/questions/57866439/whats-happening-in-this-piece-of-code-from-documentation/57866486#57866486) for a better explanation. – jezrael Sep 19 '19 at 08:07
  • Is there any way to write multiple conditions (if Pclass==3 then 'B', if Pclass==2 then 'X') in this code: "titanic['Cabin'] = titanic['Cabin'].where((titanic.Pclass!=3) | (titanic.Cabin.notna()), 'B')"? – Bharath Sep 19 '19 at 11:21
  • @Bharath - You can check [this](https://stackoverflow.com/questions/19913659) - `np.where` or `np.select` is the best fit here. – jezrael Sep 19 '19 at 11:23
  • Yes, I'm aware of np.where. I'm just curious to know the implementation using the pandas method. Hope you could help. – Bharath Sep 19 '19 at 11:29
  • Ya, then use `titanic['Cabin'] = titanic['Cabin'].where((titanic.Pclass!=3) | (titanic.Cabin.notna()), 'B').where((titanic.Pclass!=2) | (titanic.Cabin.notna()), 'X')` – jezrael Sep 19 '19 at 11:30
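
As an alternative to chaining one where call per class, the multi-value case can also be sketched with Series.map and Series.fillna. This is not the approach from the comments above, just one pandas-only option. The sample data and the fill_by_class mapping are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with three classes and some missing cabins.
titanic = pd.DataFrame({'Pclass': [1, 2, 3, 3, 2, 1],
                        'Cabin': [np.nan, np.nan, np.nan, 's', 'd', 'f']})

# Hypothetical mapping: fill value per passenger class.
fill_by_class = {3: 'B', 2: 'X'}

# map gives each row the fill value for its class (NaN for unmapped classes);
# fillna aligns on the index and uses it only where Cabin is missing.
titanic['Cabin'] = titanic['Cabin'].fillna(titanic['Pclass'].map(fill_by_class))

print(titanic)
```

Rows whose class is not in the mapping (Pclass 1 here) keep their NaN, and existing cabin values are never overwritten.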