converting currency with $ to numbers in Python pandas

Question

I have the following data in pandas dataframe:

    state        1st        2nd             3rd
0   California  $11,593,820 $109,264,246    $8,496,273
1   New York    $10,861,680 $45,336,041     $6,317,300
2   Florida     $7,942,848  $69,369,589     $4,697,244
3   Texas       $7,536,817  $61,830,712     $5,736,941

I want to perform some simple analysis (e.g., sum, groupby) with three columns (1st, 2nd, 3rd), but the data type of those three columns is object (or string).

So I used the following code for data conversion:

data = data.convert_objects(convert_numeric=True)

But, conversion does not work, perhaps, due to the dollar sign. Any suggestion?

http://stackoverflow.com/questions/27534746/importing-financial-data-into-python-pandas-using-read-csv — Padraic Cunningham, Sep 08 '15 at 18:17
For a regex solution, check out http://stackoverflow.com/a/31521773/3651127 — dagrha, Sep 08 '15 at 18:19

score 102 · Accepted Answer · edited Mar 24 '20 at 16:09

102

@EdChum's answer is clever and works well. But since there's more than one way to bake a cake.... why not use regex? For example:

df[df.columns[1:]] = df[df.columns[1:]].replace('[\$,]', '', regex=True).astype(float)

To me, that is a little bit more readable.

edited Mar 24 '20 at 16:09

abcd

10,215
15
51
85

answered Sep 08 '15 at 19:42

dagrha

2,449
1
20
21

11

A more general approach is to replace all non-digit characters, so the regular expression is `'\D'`. Note that this would also remove a decimal point, so only good for integer values. To remove all characters except digits and decimal point you could use `'[^.0-9]'`. – Ed Rushton Mar 27 '18 at 15:32
15

It's 2019 and there still doesn't exist a better way to convert money series to numerical series :/ – Austin A Apr 04 '19 at 19:51
5

Use `'[^.0-9\-]'` to keep your negative numbers, otherwise the '-' gets stripped out. – jmcopeland May 02 '22 at 22:33

EdChum · Answer 2 · 2015-09-08T18:31:17.137

You can use the vectorised str methods to replace the unwanted characters and then cast the type to int:

In [81]:
df[df.columns[1:]] = df[df.columns[1:]].apply(lambda x: x.str.replace('$','')).apply(lambda x: x.str.replace(',','')).astype(np.int64)
df

Out[81]:
            state       1st        2nd      3rd
index                                          
0      California  11593820  109264246  8496273
1        New York  10861680   45336041  6317300
2         Florida   7942848   69369589  4697244
3           Texas   7536817   61830712  5736941

dtype change is now confirmed:

In [82]:

df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 4 entries, 0 to 3
Data columns (total 4 columns):
state    4 non-null object
1st      4 non-null int64
2nd      4 non-null int64
3rd      4 non-null int64
dtypes: int64(3), object(1)
memory usage: 160.0+ bytes

Another way:

In [108]:

df[df.columns[1:]] = df[df.columns[1:]].apply(lambda x: x.str[1:].str.split(',').str.join('')).astype(np.int64)
df
Out[108]:
            state       1st        2nd      3rd
index                                          
0      California  11593820  109264246  8496273
1        New York  10861680   45336041  6317300
2         Florida   7942848   69369589  4697244
3           Texas   7536817   61830712  5736941

score 7 · Answer 3 · answered Jan 23 '18 at 04:44

7

You can also use locale as follows

import locale
import pandas as pd
locale.setlocale(locale.LC_ALL,'')
df['1st']=df.1st.map(lambda x: locale.atof(x.strip('$')))

Note the above code was tested in Python 3 and Windows environment

answered Jan 23 '18 at 04:44

sushmit

4,369
2
35
38

score 7 · Answer 4 · edited Jan 27 '22 at 17:20

7

To convert into integer, use:

carSales["Price"] = carSales["Price"].replace("[$,]", "", regex=True).astype(int)

edited Jan 27 '22 at 17:20

Asclepius

57,944
17
167
143

answered Mar 03 '21 at 16:04

rohit01

101
1
3

Hi, please avoid posting images of output, This output could have been, even a textual format. – Rishabh Kumar Mar 03 '21 at 17:04
If you replace the decimal points, you'll have a half million dollar Honda – InnocentBystander Aug 25 '21 at 04:21
Answer needs supporting information Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](https://stackoverflow.com/help/how-to-answer). – moken Jul 16 '23 at 05:36

Mykola Zotko · Answer 5 · 2022-05-03T06:31:31.673

5

You can use the methodstr.replace and the regex '\D' to remove all nondigit characters or '[^-.0-9]' to keep minus signs, decimal points and digits:

for col in df.columns[1:]:
    df[col] = pd.to_numeric(df[col].str.replace('[^-.0-9]', ''))

edited May 03 '22 at 06:31

answered Sep 13 '21 at 20:01

Mykola Zotko

15,583
3
71
73

2

Use `'[^.0-9\-]'` to keep your negative numbers, otherwise the '-' gets stripped out. – jmcopeland May 02 '22 at 22:33

score -1 · Answer 6 · edited Jul 16 '23 at 05:34

-1

car_sales['Price'] = car_sales['Price'].str.replace('$', '').str.replace(',', '').str.split('.', expand=True)[0].astype(int)

edited Jul 16 '23 at 05:34

moken

3,227
8
13
23

answered Jul 08 '23 at 10:00

lucykingee ude

1

Answer needs supporting information Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](https://stackoverflow.com/help/how-to-answer). – moken Jul 16 '23 at 05:36

converting currency with $ to numbers in Python pandas

6 Answers6

Linked

Related