
I am working on a data set that was imported into a Jupyter notebook from an Excel file. The original file had a column with True and False values. When the file was read into a DataFrame, these values turned into 0 and 1 of type float64. The column also had some missing values.

I tried converting them back to boolean type using

.astype('bool')

Oddly enough, I found that the missing data was converted to True.

Why did this happen?

I tried avoiding this by selecting only the notnull() values, but then the type changed to object instead of bool.

Michael

1 Answer

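This happens because astype('bool') follows truthiness rules: any nonzero number evaluates to True, and NaN (the missing-value marker) is a nonzero float, so it becomes True. Only 0 and empty strings evaluate to False. The same goes for strings: every non-empty string, including 'False', evaluates to True. For example, you can try the following: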

import pandas as pd

List = ['True', 'False', 'False', 'False']
df = pd.DataFrame(List)

Then you can use map to switch the values correctly:

df = df[0].map({'False': False, 'True': True})

In your case, you also have NaN values, so you can map them explicitly as well:

import numpy as np
import pandas as pd

List = ['True','False','False','True','True',np.nan,'False']
df = pd.DataFrame(List)
df = df[0].map({'False':False, 'True':True, np.nan:""})
df

Output:

0     True
1    False
2    False
3     True
4     True
5         
6    False
Name: 0, dtype: object
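
For a column that already holds 0/1 floats with gaps, as described in the question, here is a minimal sketch of one possible approach. It assumes pandas 1.0 or later, where the nullable "boolean" extension dtype is available. The Series named col is a hypothetical stand-in for the imported column, not the asker's actual data. It first reproduces the NaN-to-True behaviour, then casts to the nullable dtype so the gap stays missing.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the imported column: 0/1 floats with one gap.
col = pd.Series([1.0, 0.0, np.nan, 1.0])

# NaN is a nonzero float, so a plain bool cast turns it into True.
plain = col.astype("bool")
print(plain.tolist())

# The nullable "boolean" dtype keeps the gap as <NA> instead of True.
nullable = col.astype("boolean")
print(nullable)
print(nullable.isna().tolist())
```

With the nullable dtype the column stays boolean rather than object, and isna() can still be used to find the rows that were empty in the original file.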
Marios
  • I think this applies only when the values are True and False. I had 0 and 1 and wanted to convert them to True and False, but all the empty values turned into True, so I can't tell which ones were empty/null in order to map them. – Michael Aug 07 '20 at 00:29