2

I have this dataframe:

import pandas as pd

data = {'one': pd.Series([1,2,3], index=['a','c','d'], dtype='i4'),
        'two': pd.Series([4,7,2,2], index=['a','b','c','d'])}

pd.DataFrame(data)

I am getting the following output:

    one two
a   1.0 4
b   NaN 7
c   2.0 2
d   3.0 2
jpp

3 Answers

2

In Pandas / NumPy, NaN is a float:

assert type(np.nan) == float

Pandas sets the dtype for a series to accommodate all values, as explained in the docs:

Note: When working with heterogeneous data, the dtype of the resulting ndarray will be chosen to accommodate all of the data involved. For example, if strings are involved, the result will be of object dtype. If there are only floats and integers, the resulting array will be of float dtype.

Since a float series can accommodate NaN and int values, while an int series cannot accommodate NaN, your series will have dtype float.
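To make the upcast visible, here is a minimal sketch that rebuilds the DataFrame from the question and compares dtypes before and after alignment. The variable names `one`, `two`, and `df` are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

# Same data as in the question; 'one' has no entry for index 'b'.
one = pd.Series([1, 2, 3], index=['a', 'c', 'd'], dtype='i4')
two = pd.Series([4, 7, 2, 2], index=['a', 'b', 'c', 'd'])

# Building the DataFrame aligns both Series on the union of their indexes,
# so 'one' gets a NaN at 'b'.
df = pd.DataFrame({'one': one, 'two': two})

print(one.dtype)        # the Series on its own is still an integer dtype
print(df['one'].dtype)  # the aligned column has been upcast to float
print(df['two'].dtype)  # no missing values here, so it stays integer
```

The column `two` keeps its integer dtype because every index label has a value, which suggests that the missing label, not the DataFrame constructor itself, is what triggers the conversion.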

See also Why is NaN considered as a float?

Jon Clements
jpp
2

Because of the presence of NaN: np.nan is of type float, so the whole column becomes float.

Either provide some other value at index b in column one,

or replace the NaN afterwards and convert the column back by using

df.one = df.one.fillna(what_ever_value)
df.one = df.one.astype(int)

but make sure you replace the NaN value first, because astype(int) raises an error on a column that still contains NaN.
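The fill-then-cast approach can be sketched as follows. The fill value `0` is only an assumption for illustration; pick whatever placeholder makes sense for your data. The last line shows pandas' nullable `Int64` dtype, which is not part of the answer above but is one possible way to keep integers together with a missing value.

```python
import numpy as np
import pandas as pd

# The frame from the question, written out with the NaN already in place.
df = pd.DataFrame({'one': [1.0, np.nan, 2.0, 3.0], 'two': [4, 7, 2, 2]},
                  index=['a', 'b', 'c', 'd'])

# Casting while the NaN is still present fails with a ValueError.
try:
    df['one'].astype(int)
    cast_failed = False
except ValueError:
    cast_failed = True

# Assumption: 0 is an acceptable placeholder for the missing value.
filled = df['one'].fillna(0).astype(int)
print(filled)

# Alternative (not from the answer above): the nullable integer dtype keeps
# the gap as <NA> instead of forcing a float column.
nullable = df['one'].astype('Int64')
print(nullable)
```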

HimanshuGahlot
1

Because there is a NaN in the column, and NaN is a float:

>>> import numpy as np
>>> type(np.nan)
<class 'float'>
>>> 

You can also see that it is a float because this works:

>>> float('NaN')
nan
>>> 

So everything in that column has to be a float.

U13-Forward