Replace comma with dot Pandas

Question

Given the following array, I want to replace commas with dots:

array(['0,140711', '0,140711', '0,0999', '0,0999', '0,001', '0,001',
       '0,140711', '0,140711', '0,140711', '0,140711', '0,140711',
       '0,140711', 0L, 0L, 0L, 0L, '0,140711', '0,140711', '0,140711',
       '0,140711', '0,140711', '0,1125688', '0,140711', '0,1125688',
       '0,140711', '0,1125688', '0,140711', '0,1125688', '0,140711',
       '0,140711', '0,140711', '0,140711', '0,140711', '0,140711',
       '0,140711', '0,140711', '0,140711', '0,140711', '0,140711',
       '0,140711', '0,140711', '0,140711', '0,140711', '0,140711',
       '0,140711', '0,140711', '0,140711', '0,140711'], dtype=object)

I've been trying different ways but I can't figure out how to do this. Also, I have imported it as a pandas DataFrame but can't apply the function:

df
      1-8        1-7
H0   0,140711   0,140711
H1     0,0999     0,0999
H2      0,001      0,001
H3   0,140711   0,140711
H4   0,140711   0,140711
H5   0,140711   0,140711
H6          0          0
H7          0          0
H8   0,140711   0,140711
H9   0,140711   0,140711
H10  0,140711  0,1125688
H11  0,140711  0,1125688
H12  0,140711  0,1125688
H13  0,140711  0,1125688
H14  0,140711   0,140711
H15  0,140711   0,140711
H16  0,140711   0,140711
H17  0,140711   0,140711
H18  0,140711   0,140711
H19  0,140711   0,140711
H20  0,140711   0,140711
H21  0,140711   0,140711
H22  0,140711   0,140711
H23  0,140711   0,140711 

df.applymap(lambda x: str(x.replace(',','.')))

Any suggestions how to solve this?

`df.applymap(lambda x: str(x.replace(',','.')))` does work, replaces comma to dot on `pd.__version__ == '0.18.1'` — Zero, Oct 17 '16 at 09:53
Did you assign back the result? `df =df.applymap(lambda x: str(x.replace(',','.')))` — EdChum, Oct 17 '16 at 09:53
Also it'd be quicker to do this for each column: `df = df.apply(lambda x: x.str.replace(',','.'))` — EdChum, Oct 17 '16 at 09:54
Great @EdChum. I didn't assign back the result. Btw, apply is faster than applymap()? — Juliana Rivera, Oct 17 '16 at 10:13
`apply` works column-wise or row-wise, `applymap` operates on each element so yes in this case `apply` would be faster. You could also do `df.stack().str.replace(',','.').unstack()` — EdChum, Oct 17 '16 at 10:17
I'm not sure if something has changed in Pandas since this was posted, but a simple `%timeit` shows that `applymap` is faster than `apply`. — BeRT2me, May 28 '22 at 22:02

score 60 · Answer 1 · edited May 28 '22 at 20:33

If you are reading in with read_csv, you can specify how it interprets decimals with the decimal parameter.

e.g.

your_df = pd.read_csv('/your_path/your_file.csv',sep=';',decimal=',')

From the pandas documentation for read_csv:

thousands : str, optional

Thousands separator.

decimal : str, default ‘.’

Character to recognize as decimal point (e.g. use ‘,’ for European data).
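As a minimal sketch of the decimal parameter in action, the snippet below parses a small in-memory CSV. The data, the `id` column name, and the use of `io.StringIO` in place of a real file are assumptions made for illustration only.

```python
import io

import pandas as pd

# Hypothetical semicolon-separated data with comma decimals (made up for illustration).
raw = "id;1-8;1-7\nH0;0,140711;0,140711\nH1;0,0999;0,0999\nH6;0;0\n"

# decimal="," tells the parser to treat the comma as the decimal point,
# so the columns come back as floats rather than strings.
df = pd.read_csv(io.StringIO(raw), sep=";", decimal=",", index_col="id")

print(df.dtypes)
print(df.loc["H0", "1-8"])
```

With this approach no string replacement is needed afterwards, because the values are numeric as soon as they are read.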

answered May 07 '19 at 14:29 by Lee · edited May 28 '22 at 20:33 by wjandrea

Sweet! With previous suggestions I was getting errors with treating my data as strings. – jared Oct 14 '20 at 10:11
It does not have the same effect as what the person is asking for. For example, when using to_dict on the dataframe, it will export with a comma decimal separator when it should be a dot. – bloub Nov 21 '22 at 17:19

EdChum · Accepted Answer · 2016-10-17T10:26:42.720

You need to assign the result of your operation back, as the operation isn't in place. Besides, you can use apply, or stack and unstack, with the vectorised str.replace to do this quicker:

In [5]:
df.apply(lambda x: x.str.replace(',','.'))

Out[5]:
          1-8        1-7
H0   0.140711   0.140711
H1     0.0999     0.0999
H2      0.001      0.001
H3   0.140711   0.140711
H4   0.140711   0.140711
H5   0.140711   0.140711
H6          0          0
H7          0          0
H8   0.140711   0.140711
H9   0.140711   0.140711
H10  0.140711  0.1125688
H11  0.140711  0.1125688
H12  0.140711  0.1125688
H13  0.140711  0.1125688
H14  0.140711   0.140711
H15  0.140711   0.140711
H16  0.140711   0.140711
H17  0.140711   0.140711
H18  0.140711   0.140711
H19  0.140711   0.140711
H20  0.140711   0.140711
H21  0.140711   0.140711
H22  0.140711   0.140711
H23  0.140711   0.140711

In [4]:    
df.stack().str.replace(',','.').unstack()

Out[4]:
          1-8        1-7
H0   0.140711   0.140711
H1     0.0999     0.0999
H2      0.001      0.001
H3   0.140711   0.140711
H4   0.140711   0.140711
H5   0.140711   0.140711
H6          0          0
H7          0          0
H8   0.140711   0.140711
H9   0.140711   0.140711
H10  0.140711  0.1125688
H11  0.140711  0.1125688
H12  0.140711  0.1125688
H13  0.140711  0.1125688
H14  0.140711   0.140711
H15  0.140711   0.140711
H16  0.140711   0.140711
H17  0.140711   0.140711
H18  0.140711   0.140711
H19  0.140711   0.140711
H20  0.140711   0.140711
H21  0.140711   0.140711
H22  0.140711   0.140711
H23  0.140711   0.140711

The key thing here is to assign back the result:

df = df.stack().str.replace(',','.').unstack()
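The array in the question mixes strings with integer zeros, and on an object column `.str.replace` returns NaN for cells that are not strings. One possible way around that, sketched here on a small made-up frame, is to cast to `str` first and convert to float at the end. The sample values and the name `converted` are assumptions for illustration.

```python
import pandas as pd

# Small made-up frame mixing comma-decimal strings with a real integer 0.
df = pd.DataFrame(
    {"1-8": ["0,140711", 0, "0,001"], "1-7": ["0,1125688", 0, "0,0999"]},
    index=["H0", "H6", "H2"],
    dtype=object,
)

# Cast every cell to str so the integer cells are kept, replace the comma
# literally (regex=False), then convert the whole frame to float.
converted = (
    df.astype(str)
    .apply(lambda col: col.str.replace(",", ".", regex=False))
    .astype(float)
)

print(converted)
```

As with the snippets above, the result has to be assigned to a name, since none of these calls modify `df` in place.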

this will NaN integer values on the dataframe – Miguel Tomás Apr 20 '21 at 12:47
This fails for large dataframes – Dhyana Jan 24 '23 at 15:48

score 24 · Answer 3 · answered May 10 '21 at 17:59

If you need to replace commas with dots in particular columns, try

    data["column_name"]=data["column_name"].str.replace(',','.')

to avoid 'str' object has no attribute 'str' error.
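A short sketch of the single-column approach, followed by a conversion to numbers with `pd.to_numeric`. The frame, the `other` column, and its values are hypothetical and only there to show that untouched columns keep their commas.

```python
import pandas as pd

# Hypothetical frame: only "column_name" holds comma-decimal numbers.
data = pd.DataFrame(
    {"column_name": ["0,140711", "0,0999", "0"], "other": ["a,b", "c", "d"]}
)

# .str is a Series accessor, so it is called on the column, not on each cell.
data["column_name"] = data["column_name"].str.replace(",", ".", regex=False)

# Optional follow-up: turn the cleaned strings into floats.
data["column_name"] = pd.to_numeric(data["column_name"])

print(data.dtypes)
```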

answered May 10 '21 at 17:59 by Swathi

why does .str.replace work for me but .replace doesn't? – Fisqkuz Apr 28 '22 at 18:55
