I have a data set d that contains missing values in different forms:
import pandas as pd

d = {'col1': [1, 2, '', 'N/A', 'unknown', None],
     'col2': [3, 4, 'N/A', None, 'N/A_N/A', '']}
d = pd.DataFrame(data=d)
      col1     col2
0        1        3
1        2        4
2               N/A
3      N/A     None
4  unknown  N/A_N/A
5     None
I want to see how many values are actually missing in each column. To do that, I want to convert all empty strings, 'N/A', 'N/A_N/A' and 'unknown' values to None. I tried this code and got the following result:
d.replace(to_replace=['N/A', '', 'unknown', 'N/A_N/A'],
          value=None)
   col1  col2
0     1     3
1     2     4
2     2     4
3     2  None
4     2  None
5  None  None
I don't understand why d.replace did this. Does anyone have a better solution to my problem? I would like the result to look like this:
   col1  col2
0     1     3
1     2     4
2  None  None
3  None  None
4  None  None
5  None  None
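For reference, here is a minimal sketch of one way to get the per-column counts, assuming it is acceptable for the placeholders to show up as NaN rather than None. The names placeholders, cleaned and missing_per_column are only illustrative. It passes np.nan explicitly because, as far as I can tell, some pandas versions treat value=None with a list as "no value given" and forward-fill from the previous row, which would match the output above.

```python
import numpy as np
import pandas as pd

d = pd.DataFrame({'col1': [1, 2, '', 'N/A', 'unknown', None],
                  'col2': [3, 4, 'N/A', None, 'N/A_N/A', '']})

# Strings that should count as missing (taken from the example data;
# the variable name is illustrative).
placeholders = ['N/A', '', 'unknown', 'N/A_N/A']

# Replace the placeholders with an explicit np.nan instead of value=None,
# so replace() does not fall back to filling from the previous row.
cleaned = d.replace(to_replace=placeholders, value=np.nan)

# isna() counts both NaN and None as missing.
missing_per_column = cleaned.isna().sum()
print(missing_per_column)
```

Since isna() treats NaN and None the same way, the counts should come out the same whichever of the two ends up in the cells.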