An example:
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': [True, False, True],
                   'B': [True, False, np.nan],
                   'C': [True, False, True]})
df.loc[[2],'C'] = np.nan
print(df, df.dtypes, sep='\n\n')
>>>
       A      B    C
0   True   True  1.0
1  False  False  0.0
2   True    NaN  NaN

A       bool
B     object
C    float64
dtype: object
I get that in "C" the datatype gets converted to float (and not int, because of ValueError: cannot convert float NaN to integer). But why doesn't the same happen in "B"? Why can a bool column be converted into float64 when the data are edited, but a column of Booleans with some missing data can't be converted?
I've run into some operations in pandas that I expect to work on column "A", but I end up needing to be more explicit about the datatype (here for example).
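To make "being more explicit about the datatype" concrete, here is a minimal sketch of what I mean. It does not explain the behaviour; it only shows the object dtype that column "B" gets at construction and two explicit conversions. The variable names are made up for illustration, and the "boolean" nullable dtype assumes pandas 1.0 or later.

```python
import numpy as np
import pandas as pd

# Booleans mixed with NaN at construction time: the values are stored as
# generic Python objects, as in column "B".
b = pd.Series([True, False, np.nan])
print(b.dtype)

# An explicit cast produces the float representation seen in column "C".
b_float = b.astype(float)
print(b_float.dtype)

# Assumption: pandas >= 1.0, which provides the nullable "boolean"
# extension dtype; NaN is stored as a missing value (<NA>).
b_nullable = pd.Series([True, False, np.nan], dtype="boolean")
print(b_nullable.dtype)
```

With the explicit cast, the column behaves like "C"; with the nullable dtype, the values stay Boolean and only the gap is marked as missing.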