I have a DataFrame with a column of strings that represent float values (both negative and positive), formatted to 3 decimal places (e.g. .000). The numbers are stored as strings in column1, and I would like to add a column2 that holds the same values converted to float, preserving the 3 decimal places. When I try this, I get the error message "ValueError: could not convert string to float:". Grateful for any help.

Code

dataframe4['column2'] = ''
dataframe4['column2']=dataframe4['column1'].astype('float64')

#Round column1, column2 float columns to 3 decimal places

dataframe4.round({'column1': 3, 'column2': 3})
daveb

2 Answers

I'm not sure I fully understood your question, but you can try:

dataframe4['column2'] = dataframe4['column1'].apply(lambda x : float(x))

Edit: If some of the numbers contain commas, you can try:

dataframe4['column2'] = dataframe4['column1'].apply(lambda x : float(x.replace(",","")))
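
The same comma removal can also be done on the whole column at once with the pandas string methods instead of a per-value lambda. This is only a sketch under assumptions: the sample values below are made up to mirror the ones described in the question, and it assumes every entry in column1 is a string.

```python
import pandas as pd

# Hypothetical sample data mirroring the values described in the question.
dataframe4 = pd.DataFrame({"column1": ["9,826.000", "-12.500", "0.125"]})

# Strip the thousands separators as plain text (regex=False), then cast to float.
dataframe4["column2"] = (
    dataframe4["column1"].str.replace(",", "", regex=False).astype("float64")
)

print(dataframe4["column2"].tolist())
```

If column1 may contain missing values or non-string entries, the per-value approach above would need a guard, whereas the `.str` accessor simply propagates missing values.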
GatienC
  • Hi GatienC. Unfortunately, I tried the above and still got an error when converting the string to float: ValueError: could not convert string to float: '9,826.000' – daveb Jul 14 '19 at 18:37
  • @daveb Can you confirm that you have commas in these numbers? That would break your conversion. – patrick Jul 14 '19 at 18:42
  • Hi Patrick, many thanks. Unfortunately, there are commas in the string representation of the float numbers. Would it be possible to add them back in if I remove them before the conversion? – daveb Jul 14 '19 at 18:43
  • Yes, that should definitely work. Let me look at it real quick. – patrick Jul 14 '19 at 18:44
  • Edited my answer with a .replace; it should work (didn't see your previous comment, sorry). – GatienC Jul 14 '19 at 18:44
The problem appears to be the commas in your number strings, e.g. '9,826.000'; float() cannot parse thousands separators.

You can fix it like this:

import re

# Strip the thousands separator; returns '9826.000'
re.sub(r",", "", "9,826.000")
# The cleaned string now converts without error; returns 9826.0
float(re.sub(r",", "", "9,826.000"))
# Use apply to do the same for every value in the DataFrame column
df["new_col"] = df["old_col"].apply(lambda x: float(re.sub(r",", "", x)))
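
If some values might still fail to parse after the commas are gone, one option is to let pandas mark them rather than raise. The sketch below uses `pd.to_numeric` with `errors="coerce"`; the sample data, including the "n/a" entry, is hypothetical and only there to show what a leftover bad value looks like.

```python
import pandas as pd

# Hypothetical sample data; "n/a" stands in for any value that is still not numeric.
df = pd.DataFrame({"old_col": ["9,826.000", "-1,234.500", "n/a"]})

cleaned = df["old_col"].str.replace(",", "", regex=False)
# errors="coerce" turns anything unparseable into NaN instead of raising.
df["new_col"] = pd.to_numeric(cleaned, errors="coerce")

# Rows that failed to convert, kept for inspection.
bad_rows = df[df["new_col"].isna()]
print(bad_rows["old_col"].tolist())
```

This trades the hard error for NaN values, so it is worth checking `bad_rows` before relying on the new column.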

To still display the resulting floats with commas in pandas, you can change the float display setting (the display.float_format option).

I don't know how you want to output these, but in the to_excel function, for example, you can specify a float format (the float_format parameter), or you can re-format the column before output, similar to the above.
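
On the question of adding the commas back: a float does not store commas or trailing zeros, so both are a matter of formatting at output time. A minimal sketch in plain Python, where `format_with_commas` is a hypothetical helper name and the values are made up:

```python
def format_with_commas(value: float) -> str:
    """Render a float with thousands separators and exactly 3 decimal places."""
    return f"{value:,.3f}"


values = [9826.0, -1234.5, 0.125]
formatted = [format_with_commas(v) for v in values]
print(formatted)
```

In pandas, a function like this could be applied to a float column with `Series.map` to get a string column for output, or assigned to `pd.options.display.float_format` to change only how floats are displayed.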

patrick