In my dataset, I have a feature (called Size) like this one:

import pandas as pd

dit = {"Size": ["0", "0", "5", "15", "10"]}
dt = pd.DataFrame(data=dit)

When I run dt.info(), it gives the result below:

Size                                     140 non-null object

However, I expect it to be int. When I try the code below:

dt.loc[:,"Size"] = dt.loc[:,"Size"].astype(int)

it complains with:

ValueError: invalid literal for int() with base 10: ' '

How can I convert Size to int?

Jeff

2 Answers

Use pd.to_numeric():

import pandas as pd

dit = {"Size": ['0', '0', '5', '15', '10']}
dt = pd.DataFrame(data=dit)
# Parse the numeric strings; the column becomes int64
dt['Size'] = pd.to_numeric(dt['Size'])
dt.info()

Output

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 1 columns):
Size    5 non-null int64
dtypes: int64(1)
memory usage: 120.0 bytes
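
Note that the sample data above contains only clean digit strings, while the error in the question mentions ' ', which suggests the real column holds at least one blank entry. A minimal sketch for that case, assuming a hypothetical sample with one blank value and assuming that blanks should be treated as missing rather than as 0:

```python
import pandas as pd

# Hypothetical sample: the question's data with one blank entry added
dt = pd.DataFrame({"Size": ["0", "0", " ", "15", "10"]})

# errors="coerce" turns values that cannot be parsed into NaN instead of raising
sizes = pd.to_numeric(dt["Size"], errors="coerce")

# The nullable "Int64" dtype keeps integers and stores the missing value as <NA>
dt["Size"] = sizes.astype("Int64")
print(dt["Size"].dtype)
```

A plain int64 column cannot hold missing values, which is why the nullable Int64 dtype (capital I) is used here.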
Rahul charan

Select the column to be converted, use .values to get the NumPy array containing all its values, and then call astype(int) to convert them to integers. This returns a new array and does not modify dt, so assign the result back to the column:

dt['Size'] = dt['Size'].values.astype(int)
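
Like the astype(int) call in the question, this raises ValueError if any entry is not a valid integer string, such as ' '. A rough sketch for finding the offending entries first, assuming a hypothetical sample with one blank value; replacing blanks with 0 is also an assumption that may not suit the real data:

```python
import pandas as pd

# Hypothetical sample: the question's data with one blank entry added
dt = pd.DataFrame({"Size": ["0", "0", " ", "15", "10"]})

# Values that fail to parse become NaN; compare with the original to locate them
converted = pd.to_numeric(dt["Size"], errors="coerce")
bad_values = dt.loc[converted.isna() & dt["Size"].notna(), "Size"]
print(bad_values.tolist())

# Assumption: a blank Size means 0, so fill it in before casting to plain int
dt["Size"] = converted.fillna(0).astype(int)
```

Inspecting bad_values before choosing a fill value helps avoid silently turning real data problems into zeros.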
terance98