How do I parse numbers with thousands separator in pandas read_csv?

Question

I have a CSV file with lines as follows:

"Dec 30, 2021","1,234.11","1,654.22","11,876.23","1,676,234"

I have learned from a previous post that I can use:

parse_dates=['Date']

to get the date parsed (that works). However, I would like columns 2-4 as np.float64 and column 5 as np.int64. How can I achieve that?

I have tried this:

data = pd.read_csv("file.csv",  parse_dates=['Date'], dtype=[np.datetime64, np.float64, np.float64, np.float64, np.float64, np.int64])

but I get

TypeError: data type not understood

Does this answer your question? [pandas reading CSV data formatted with comma for thousands separator](https://stackoverflow.com/questions/37439933/pandas-reading-csv-data-formatted-with-comma-for-thousands-separator) — BigBen, Jan 03 '22 at 17:03

score 3 · Accepted Answer · answered Jan 03 '22 at 16:35

Use the thousands parameter.

df = pd.read_csv("file.csv",  parse_dates=['Date'], thousands=',')
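As a self-contained check, here is a minimal sketch that reads the sample line from the question out of an in-memory string. The header row (Date, A, B, C, D) is an assumption, since the question only shows a data line; the only options used are parse_dates and thousands.

```python
import io

import pandas as pd

# Hypothetical header row; the question only shows the data line.
csv_text = (
    'Date,A,B,C,D\n'
    '"Dec 30, 2021","1,234.11","1,654.22","11,876.23","1,676,234"\n'
)

# thousands=',' strips the grouping commas inside the quoted fields
# before the values are converted to numbers.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=['Date'], thousands=',')

print(df.dtypes)
```

With this input, columns A to C should come out as float64 and column D as int64, without passing dtype at all.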

Emma

Corralien · Answer 2 · 2022-01-03T16:40:27.870

Use the converters parameter if you have a special format.

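from datetime import datetime
import pandas as pd

# Map each column name to a function applied to its raw string values
converters = {
    'Date': lambda x: datetime.strptime(x, "%b %d, %Y"),
    'Number': lambda x: float(x.replace(',', ''))
}
df = pd.read_csv('data.csv', converters=converters)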

Output:

>>> df
        Date   Number
0 2021-12-30  2345.55

>>> df.dtypes
Date      datetime64[ns]
Number           float64
dtype: object

# data.csv
Date,Number
"Dec 30, 2021","2,345.55"

Otherwise, use the standard parameters:

# quoting=1 corresponds to csv.QUOTE_ALL
df = pd.read_csv("data.csv", header=None, parse_dates=[0], thousands=',', quoting=1)

Output:

>>> df
           0        1        2         3        4
0 2021-12-30  1234.11  1654.22  11876.23  1676234

>>> df.dtypes
0    datetime64[ns]
1           float64
2           float64
3           float64
4             int64
dtype: object

Thanks, this is very comprehensive. I chose the `thousands=` answer as it seems more idiomatic/simple in pandas. — M.E., Jan 03 '22 at 16:52