I'm trying to convert a column that holds numbers such as "4,3", stored as strings, to float.

The dataframe:

<class 'pandas.core.frame.DataFrame'>
Int64Index: 19147 entries, 0 to 21491
Data columns (total 13 columns):
 #   Column              Non-Null Count  Dtype
---  ------              --------------  -----
 0   PremiseID           19147 non-null  int64
 1   PremiseName         19147 non-null  object
 2   Category            19147 non-null  object
 3   Region              19147 non-null  object
 4   InhabitantsCount    19147 non-null  int64
 5   DistanceCityCentre  19147 non-null  int64
 6   FacebookLikes       19147 non-null  float64
 7   RatingValue         19147 non-null  object
 8   RatingCount         19147 non-null  int64
 9   OpeningSaturday     19147 non-null  float64
 10  ClosingSaturday     19147 non-null  float64
 11  DrinkItem           19147 non-null  object
 12  Price               19147 non-null  object
dtypes: float64(3), int64(4), object(6)
memory usage: 2.0+ MB

I've tried:

# Convert data types
conv_dict = {
    'FacebookLikes': int,
    'RatingValue': float,
    'Price': int,
}

df = df.astype(conv_dict)

The error:

ValueError: could not convert string to float: '4,3'

Stanislav Jirák

1 Answer

Create a conversion function that handles input such as "4,3":

def float_conv(val):
    # Replace the decimal comma with a dot so that float() can parse the value.
    return float(val.replace(',', '.'))

and use it in the conversion dictionary:

conv_dict = {
    'FacebookLikes': int,
    'RatingValue': float_conv,
    'Price': int,
}
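
A minimal sketch of one way such a converter could be applied. It rests on the assumption that `astype` only accepts dtypes rather than arbitrary functions, so the converter is applied element by element with `Series.map` instead of going through the dictionary. The two-row sample frame is made up for illustration and only mirrors the column names from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the real dataframe.
df = pd.DataFrame({
    'FacebookLikes': [10.0, 250.0],
    'RatingValue': ['4,3', '3,9'],
})

def float_conv(val):
    # Replace the decimal comma with a dot so that float() can parse the value.
    return float(val.replace(',', '.'))

# Apply the converter to each value of the string column.
df['RatingValue'] = df['RatingValue'].map(float_conv)

# Columns that need no custom parsing can still go through astype.
df = df.astype({'FacebookLikes': int})
```

Keeping the custom parsing separate from `astype` means the dictionary only ever contains real dtypes.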
balderman
  • Yeah, I've tried this as df.RatingValue.replace(',', '.') but with no effect. When I try your solution as proposed, I'm getting "data type not recognized". If I try to convert the column directly using float(val.replace()), I get "cannot convert the series to ". – Stanislav Jirák Aug 13 '20 at 10:40
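
The "no effect" result described in this comment can be reproduced in a small sketch. By default `Series.replace(',', '.')` compares whole cell values, so a string like "4,3" is left untouched, whereas `Series.str.replace` substitutes inside each string. The sample series below is made up and only stands in for `df.RatingValue`.

```python
import pandas as pd

# Hypothetical sample standing in for df.RatingValue.
ratings = pd.Series(['4,3', '3,9', '5'])

# Series.replace matches entire values by default, so no cell equals ','
# and the series comes back unchanged.
unchanged = ratings.replace(',', '.')

# Series.str.replace works on substrings; regex=False treats ',' literally.
converted = ratings.str.replace(',', '.', regex=False).astype(float)
```

The other error in the comment has a similar cause: the built-in `float()` expects a single scalar, so it cannot be called on a whole Series, while `.astype(float)` converts every element at once.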