Why does a column of int and NaN have float type?

Question

I have this dataframe:

import pandas as pd

data = {'one': pd.Series([1,2,3], index=['a','c','d'], dtype='i4'),
        'two': pd.Series([4,7,2,2], index=['a','b','c','d'])}

pd.DataFrame(data)

I am getting the following output:

   one  two
a  1.0    4
b  NaN    7
c  2.0    2
d  3.0    2

score 2 · Answer 1 · edited Sep 07 '18 at 08:31

In Pandas / NumPy, NaN is a float:

assert type(np.nan) == float

Pandas sets the dtype for a series to accommodate all values, as explained in the docs:

Note: When working with heterogeneous data, the dtype of the resulting ndarray will be chosen to accommodate all of the data involved. For example, if strings are involved, the result will be of object dtype. If there are only floats and integers, the resulting array will be of float dtype.
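A small sketch of what that note describes, using two made-up frames (the names ints_and_floats and with_strings are only for illustration):

```python
import pandas as pd

# One int column and one float column: the combined array is float.
ints_and_floats = pd.DataFrame({'a': [1, 2], 'b': [1.5, 2.5]})

# One int column and one string column: the combined array is object.
with_strings = pd.DataFrame({'a': [1, 2], 'b': ['x', 'y']})

print(ints_and_floats.to_numpy().dtype)  # float64
print(with_strings.to_numpy().dtype)     # object
```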

Since a float series can accommodate NaN and int values, while an int series cannot accommodate NaN, your series will have dtype float.
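To see the upcast happen, here is a minimal sketch that rebuilds the frame from the question and compares dtypes before and after alignment. The variable names one, two and df are my own.

```python
import pandas as pd

one = pd.Series([1, 2, 3], index=['a', 'c', 'd'], dtype='i4')
two = pd.Series([4, 7, 2, 2], index=['a', 'b', 'c', 'd'])

# Building the frame aligns both series on the union of their indexes.
# 'one' has no value at index 'b', so pandas fills that slot with NaN.
df = pd.DataFrame({'one': one, 'two': two})

print(one.dtype)        # int32: the series on its own keeps the requested dtype
print(df['one'].dtype)  # float64: upcast so that the NaN at 'b' can be stored
print(df['two'].dtype)  # still an integer dtype, since nothing is missing
```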

See also Why is NaN considered as a float?

score 2 · Answer 2 · answered Sep 07 '18 at 08:26

Because of the NaN in the column: np.nan is a float, so the whole column becomes float.

Either provide some other value at index b in column one,

or replace the NaN afterwards and cast back to int:

df.one = df.one.fillna(what_ever_value)
df.one = df.one.astype(int)

Make sure the NaN values are gone before calling astype(int), otherwise the cast will fail.
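A runnable sketch of that approach, assuming 0 is an acceptable placeholder for the missing value (that choice is mine, pick whatever fits your data). It also shows, as an alternative not mentioned above, the nullable 'Int64' dtype available in newer pandas versions, which keeps integers and a missing marker side by side.

```python
import pandas as pd

df = pd.DataFrame({
    'one': pd.Series([1, 2, 3], index=['a', 'c', 'd'], dtype='i4'),
    'two': pd.Series([4, 7, 2, 2], index=['a', 'b', 'c', 'd']),
})

# Casting while the NaN is still present fails.
cast_failed = False
try:
    df['one'].astype(int)
except ValueError:
    cast_failed = True

# Option 1: fill the NaN first (0 is an assumed placeholder), then cast.
filled = df['one'].fillna(0).astype(int)
print(filled.tolist())  # [1, 0, 2, 3]

# Option 2 (assumes a pandas version with nullable integers):
# 'Int64' with a capital I stores integers plus a missing-value marker.
nullable = df['one'].astype('Int64')
print(nullable.dtype)   # Int64
```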

score 1 · Answer 3 · answered Sep 07 '18 at 08:20

Because there is a NaN in the column, and NaN is a float:

>>> import numpy as np
>>> type(np.nan)
<class 'float'>
>>>

You can also create it with the float constructor:

>>> float('NaN')
nan
>>>

So every value in that column has to be stored as a float.
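The same limitation shows up in plain NumPy, which pandas uses for these columns. A minimal sketch (the array names ints and floats are made up for the example):

```python
import numpy as np

ints = np.array([1, 2, 3], dtype='i4')

# An integer array has no way to represent NaN, so the assignment fails.
raised = False
try:
    ints[1] = np.nan
except ValueError:
    raised = True
print(raised)  # True

# A float array can hold both the integer values and NaN.
floats = ints.astype('f8')
floats[1] = np.nan
print(floats.dtype)  # float64
```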

U13-Forward
