I am reading a DataFrame from an Excel sheet that contains empty cells.

I want to convert all the numeric values to int, but this cannot be done directly because of the NaN values.

This is a possible workaround:

convert into int data in pandas

import pandas as pd
import numpy as np

ind = list(range(5))
values = [1.0,np.nan,3.0,4.0,5.0]
df5 = pd.DataFrame(index=ind, data={'users':values})
df5

Then replace the NaN values with -1 (which is an int), cast to int, and put the NaN values back:

df5 = df5.replace(np.nan,-1)
df5 = df5.astype('int')
df5 = df5.replace(-1, np.nan)

But the last step converts the data back to float.

Why? How should I do it?

I don't want decimal values, since "users" are people.

JFerro

1 Answer

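See https://stackoverflow.com/a/51997100/11103175. pandas has a nullable integer dtype, 'Int64' (note the capital I), that keeps the column as integers and shows missing values as <NA> instead of forcing the column to float.

You can specify the dtype when you create the DataFrame, or convert afterwards with astype: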

import pandas as pd
import numpy as np

ind = list(range(5))
values = [1.0,np.nan,3.0,4.0,5.0]
df5 = pd.DataFrame(index=ind, data={'users':values},dtype='Int64')
# alternatively, convert an existing DataFrame: df5 = df5.astype('Int64')
df5

Giving:

   users
0      1
1   <NA>
2      3
3      4
4      5
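
As for the "why": np.nan is itself a float, so a plain NumPy int64 column cannot hold it, and writing NaN back into the column makes pandas upcast it to float64. The sketch below illustrates this with the same data as the question; the variable names round_trip and converted are only illustrative, and it assumes a pandas version that supports the nullable 'Int64' dtype.

```python
import numpy as np
import pandas as pd

df5 = pd.DataFrame({"users": [1.0, np.nan, 3.0, 4.0, 5.0]})

# The workaround from the question: the column is int64 after astype,
# but replacing -1 with np.nan (a float) upcasts it to float64 again.
round_trip = df5.fillna(-1).astype("int64").replace(-1, np.nan)
print(round_trip["users"].dtype)

# The nullable integer dtype stores the missing value as <NA>,
# so the column stays integer-typed.
converted = df5.astype("Int64")
print(converted["users"].dtype)
print(converted)
```

Note that astype('Int64') on a float column only succeeds when the non-missing values are whole numbers; values such as 1.5 would need to be rounded first.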
J4FFLE