4

Is it possible to change a column in a data frame that is float64 and holds some null values to an integer dtype? I get the following error:

raise ValueError('Cannot convert NA to integer')

rick debbout
  • No, you can't represent `NaN` in integer – EdChum Oct 01 '15 at 20:05
  • See also http://stackoverflow.com/questions/17534106/what-is-the-difference-between-nan-and-none/17534682#17534682 and http://pandas-docs.github.io/pandas-docs-travis/gotchas.html#support-for-integer-na – Andy Hayden Oct 01 '15 at 21:00

1 Answer

0

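It is not possible with a NumPy integer dtype, even if you try a workaround. NaN is a floating-point value, and it is generally the more efficient way to represent missing values, which is why a column that contains NaN stays float64. People still try the following, so let's check what happens if we do the same.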

  1. Convert all NaN values to 0 (if 0 does not otherwise occur in your data). If 0 is a valid value in your case, use a very large positive or negative number instead, say 9999999999.

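    df['x'].dtype  # output: dtype('float64')
    
    # select the NaN rows with isnull() (notnull() would overwrite the real values)
    df.loc[df['x'].isnull(),'x'] = 0
    
    # or, if 0 is a valid value in your data:
    df.loc[df['x'].isnull(),'x'] = 9999999999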
    
  2. Convert the column, which now holds no NaN values, to int.

    df['x'] = df['x'].astype('int64')  # converting to int64; the dtype is now int64
    
  3. Put back your NaN values:

    df.loc[df['x']==0,'x'] = np.nan
    df['x'].dtype  # output: dtype('float64')

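The technique above can also be used to convert a float column to an integer column when it contains NaN and the conversion raises errors. But the column stays integer only as long as you give up the NaN values: once they are put back, the dtype reverts to float64.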
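
As a rough sketch of the three steps on a small made-up frame: the data, the `SENTINEL` name, and the use of `fillna` and `mask` in place of `.loc` assignment are my own assumptions for illustration, not part of the steps as written. `mask` is used for the last step because it returns a new Series rather than assigning NaN into an integer column in place.

```python
import numpy as np
import pandas as pd

# Hypothetical example data; the column name 'x' follows the answer.
df = pd.DataFrame({"x": [1.0, 2.0, np.nan, 4.0]})
SENTINEL = 0  # assumed not to occur among the real values

# Step 1: replace NaN with the sentinel.
filled = df["x"].fillna(SENTINEL)

# Step 2: with no NaN left, the cast to int64 succeeds.
as_int = filled.astype("int64")
print(as_int.dtype)  # int64

# Step 3: turning the sentinel back into NaN yields a float64 column again.
restored = as_int.mask(as_int == SENTINEL)
print(restored.dtype)  # float64
```

So the integer dtype only survives while the sentinel stands in for the missing values.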

Anuj Sharma