5

I have a csv file with some cells that have dollar signs (e.g. $46.5). I am forcing all the types to be numpy.float64 in the function pandas.read_csv(). It complains about ValueError: could not convert string to float: $46.5. Is there a way to handle this cleanly?

nwly
  • 1,360
  • 4
  • 16
  • 30
  • 1
    You'd have to strip and cast as a post-processing step or pass a method to `converters` param for `read_csv` – EdChum Mar 30 '16 at 21:47
  • Are you trying to use floating point numbers for currency? You probably should be using `decimal`: http://stackoverflow.com/questions/3730019/why-not-use-double-or-float-to-represent-currency – Alex Taylor Mar 30 '16 at 21:53

1 Answers1

12

You can add a converter for the relevant column(s):

pd.DataFrame({'col1': ['$46.51', '$38.00', 40], 
              'col2': [1, 2, 3]}).to_csv('test_df.csv', index=False)

>>> pd.read_csv('test_df.csv', converters={'col1': lambda s: float(s.replace('$', ''))})
    col1  col2
0  46.51     1
1  38.00     2
2  40.00     3
Alexander
  • 105,104
  • 32
  • 201
  • 196