I have a column in my DataFrame that looks like this:
Column
'Star Wars: Episode I The Phantom Menace'
'Star Wars: Episode I The Phantom Menace'
NaN
'Star Wars: Episode I The Phantom Menace'
NaN
....
I want to convert this string column into a boolean column, i.e. True for a real value and False for NaN.
I tried to classify the values with the following code:
import numpy as np
star_wars[column] = star_wars[column].map(lambda x: True if (x != np.nan) else False)
star_wars[column].value_counts()
value_counts() reported every row as True, both the rows with a real value and the rows with NaN, which should not be the case.
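To narrow this down, here is a minimal sketch of the same `!=` check without pandas. It assumes that `math.nan` behaves like `np.nan` (both are ordinary Python float NaN values), and the `values` list is made-up sample data, not my real column:

```python
import math

nan = math.nan  # stand-in for np.nan; both are plain float NaN values

# Hypothetical sample data mirroring the column above
values = [
    'Star Wars: Episode I The Phantom Menace',
    nan,
    'Star Wars: Episode I The Phantom Menace',
    nan,
]

# Same expression as in the lambda above, applied to each value
not_equal_check = [True if (x != nan) else False for x in values]
print(not_equal_check)  # [True, True, True, True]

# NaN compares as "not equal" to everything, including itself
nan_self_compare = (nan != nan)
print(nan_self_compare)  # True
```

So the comparison itself already yields True for the NaN entries, independent of pandas.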
I also tried relying on truthy/falsy values:
import numpy as np
star_wars[column] = star_wars[column].map(lambda x: True if (x) else False)
star_wars[column].value_counts()
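The truthiness attempt above can be reduced in the same way. This is only a sketch under the same assumptions (`math.nan` standing in for `np.nan`, made-up sample values); the `is_real_value` helper is hypothetical and just shows one explicit way to test for NaN, for comparison:

```python
import math

nan = math.nan  # stand-in for np.nan; both are plain float NaN values

# Hypothetical sample data mirroring the column above
values = ['Star Wars: Episode I The Phantom Menace', nan]

# Same expression as in the truthiness lambda above
truthy_check = [True if (x) else False for x in values]
print(truthy_check)  # [True, True] because NaN is a non-zero float

def is_real_value(x):
    """Hypothetical helper: False only when x is a float NaN."""
    return not (isinstance(x, float) and math.isnan(x))

explicit_check = [is_real_value(x) for x in values]
print(explicit_check)  # [True, False]
```

The explicit `math.isnan` test is the only one of these checks that separates the NaN entry from the real value.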
Interestingly, when I hard-code the mapping instead:
true_false = {
    "Star Wars: Episode I The Phantom Menace": True,
    np.nan: False,
}
star_wars[column] = star_wars[column].map(true_false)
Then it works.
What is wrong with my approach? Is there any documentation I should read about this? Thanks in advance for your help!