pandas reading CSV data formatted with comma for thousands separator

Question

I am trying to create a dataframe in pandas using a CSV that is semicolon-delimited, and uses commas for the thousands separator on numeric data. Is there a way to read this in so that the type of the column is float and not string?

score 33 · Accepted Answer · answered May 25 '16 at 14:20

Pass thousands=',' to read_csv so that the commas are treated as thousands separators and the values are parsed as numbers:

In [27]:
import pandas as pd
import io

t="""id;value
0;123,123
1;221,323,330
2;32,001"""
pd.read_csv(io.StringIO(t), thousands=r',', sep=';')

Out[27]:
   id      value
0   0     123123
1   1  221323330
2   2      32001
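To confirm that the column is numeric rather than string, the dtypes can be inspected. This is a minimal sketch reusing the sample data above. With whole-number values pandas infers an integer dtype, so the explicit cast to float is an assumption about what the asker needs, not part of the original answer:

```python
import io
import pandas as pd

t = """id;value
0;123,123
1;221,323,330
2;32,001"""

df = pd.read_csv(io.StringIO(t), sep=';', thousands=',')
print(df.dtypes)  # 'value' is inferred as an integer column, not object/string

# Cast explicitly if a float column is required
df['value'] = df['value'].astype(float)
print(df.dtypes)
```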

EdChum

What does the `r` stand for, in the `thousands` field? – kotchwane Apr 26 '21 at 18:36

@kotchwane the `r` makes it a [raw string literal](https://stackoverflow.com/q/2081640/13138364) (and is not actually necessary in this case) – tdy Jun 27 '22 at 03:09

score 10 · Answer 2 · answered Aug 11 '20 at 00:43

The answer to this question should be short:

df=pd.read_csv('filename.csv', thousands=',')

Dimanjan

With a `;` separator: `df = pd.read_csv('filename.csv', sep=';', thousands=',')` – Armin Okić Jul 06 '23 at 06:42

score 2 · Answer 3 · answered May 25 '16 at 14:22

Take a look at the read_csv documentation: there is a keyword argument 'thousands' that you can pass ',' to. Likewise, if you had European data that uses '.' as the thousands separator, you could do the same.
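For the European case, here is a minimal sketch with made-up sample data in which '.' groups thousands and ',' marks the decimal point. The `decimal` parameter of read_csv is an addition beyond what this answer mentions, included so that the fractional part is parsed as well:

```python
import io
import pandas as pd

# Hypothetical sample data in European number format
eu = """id;value
0;1.234,56
1;12.345.678,90"""

# '.' is stripped as the thousands separator, ',' is read as the decimal point
df_eu = pd.read_csv(io.StringIO(eu), sep=';', thousands='.', decimal=',')
print(df_eu.dtypes)
print(df_eu)
```

Because the values have a fractional part, the column should come out as float without any further cast.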

Grr
