
This piece of code speaks for itself.

import pandas as pd
import numpy as np
np.random.seed(1)

#   Python 2.7.12
#   I am up to date on both pandas and numpy.
print pd.__version__ # 0.23.0
print np.__version__ # 1.14.3

arr = np.random.randint(1, 16, 10).reshape(2, 5)
print arr
"""
    [[ 6 12 13  9 10]
     [12  6  1  1  2]]
"""

df = pd.DataFrame(arr)
print df
print 'df[4].dtypes = {}'.format(df[4].dtypes)
"""
        0   1   2  3   4
    0   6  12  13  9  10
    1  12   6   1  1   2
    df[4].dtypes = int32
"""

df.iloc[1, 4] = np.nan
print 'df[4].dtypes = {}'.format(df[4].dtypes)
#   df[4].dtypes = float64

Why does assigning NaN change the column's dtype from int32 to float64?

Note: I found this problem while trying to solve this thread.

IMCoins
  • Unfortunately integer arrays cannot hold NaNs so pandas is automatically casting that column as float. – ayhan May 30 '18 at 18:49
  • @user2285236 I'm glad I asked the question. I didn't know about this seemingly important issue. :o – IMCoins May 30 '18 at 18:53
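
The comment above can be illustrated with a minimal sketch. It is written for Python 3 rather than the Python 2.7 used in the question, and the last part assumes pandas 0.24 or later (newer than the 0.23.0 shown above), where the nullable `Int64` extension dtype is available. The variable names `as_float` and `nullable` are only illustrative.

```python
import numpy as np
import pandas as pd

# NaN is itself a floating-point value, so a NumPy integer dtype
# has no way to represent it.
print(type(np.nan))

# With default inference, one missing value makes the whole column float64.
as_float = pd.Series([10, np.nan])
print(as_float.dtype)

# Assumption: pandas >= 0.24. The nullable "Int64" dtype (capital I)
# keeps a separate mask for missing values, so the integers stay integers.
nullable = pd.Series([10, None], dtype="Int64")
print(nullable.dtype)
print(nullable.isna().tolist())
```

If upgrading is not an option, the float64 upcast is the expected behavior and the column can be converted back with `astype(int)` only after the missing values have been filled or dropped.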
