Given the following array, I want to replace commas with dots:

array(['0,140711', '0,140711', '0,0999', '0,0999', '0,001', '0,001',
       '0,140711', '0,140711', '0,140711', '0,140711', '0,140711',
       '0,140711', 0L, 0L, 0L, 0L, '0,140711', '0,140711', '0,140711',
       '0,140711', '0,140711', '0,1125688', '0,140711', '0,1125688',
       '0,140711', '0,1125688', '0,140711', '0,1125688', '0,140711',
       '0,140711', '0,140711', '0,140711', '0,140711', '0,140711',
       '0,140711', '0,140711', '0,140711', '0,140711', '0,140711',
       '0,140711', '0,140711', '0,140711', '0,140711', '0,140711',
       '0,140711', '0,140711', '0,140711', '0,140711'], dtype=object)

I've been trying different ways but I can't figure out how to do this. Also, I have imported it as a pandas DataFrame but can't apply the function:

df
      1-8        1-7
H0   0,140711   0,140711
H1     0,0999     0,0999
H2      0,001      0,001
H3   0,140711   0,140711
H4   0,140711   0,140711
H5   0,140711   0,140711
H6          0          0
H7          0          0
H8   0,140711   0,140711
H9   0,140711   0,140711
H10  0,140711  0,1125688
H11  0,140711  0,1125688
H12  0,140711  0,1125688
H13  0,140711  0,1125688
H14  0,140711   0,140711
H15  0,140711   0,140711
H16  0,140711   0,140711
H17  0,140711   0,140711
H18  0,140711   0,140711
H19  0,140711   0,140711
H20  0,140711   0,140711
H21  0,140711   0,140711
H22  0,140711   0,140711
H23  0,140711   0,140711 

df.applymap(lambda x: str(x.replace(',','.')))

Any suggestions how to solve this?

Archie
Juliana Rivera
  • `df.applymap(lambda x: str(x.replace(',','.')))` does work; it replaces commas with dots on `pd.__version__ == '0.18.1'` – Zero Oct 17 '16 at 09:53
  • Did you assign back the result? `df =df.applymap(lambda x: str(x.replace(',','.')))` – EdChum Oct 17 '16 at 09:53
  • Also, it'd be quicker to do this for each column: `df = df.apply(lambda x: x.str.replace(',','.'))` – EdChum Oct 17 '16 at 09:54
  • Great @EdChum. I didn't assign back the result. Btw, is `apply` faster than `applymap()`? – Juliana Rivera Oct 17 '16 at 10:13
  • `apply` works column-wise or row-wise, while `applymap` operates on each element, so yes, in this case `apply` would be faster. You could also do `df.stack().str.replace(',','.').unstack()` – EdChum Oct 17 '16 at 10:17
  • I'm not sure if something has changed in Pandas since this was posted, but a simple `%timeit` shows that `applymap` is faster than `apply`. – BeRT2me May 28 '22 at 22:02

3 Answers

If you are reading in with read_csv, you can specify how it interprets decimals with the decimal parameter.

e.g.

your_df = pd.read_csv('/your_path/your_file.csv',sep=';',decimal=',')

From the pandas documentation for read_csv:

thousands : str, optional

Thousands separator.

decimal : str, default ‘.’

Character to recognize as decimal point (e.g. use ‘,’ for European data).
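
As a minimal sketch of the `decimal` parameter in action: the snippet below reads a small in-memory CSV instead of a real file. The sample text, the `id` column name, and the use of `io.StringIO` are assumptions made for illustration; only `sep` and `decimal` come from the answer above.

```python
import io

import pandas as pd

# Hypothetical semicolon-separated sample with comma decimals,
# standing in for '/your_path/your_file.csv'.
csv_text = (
    "id;1-8;1-7\n"
    "H0;0,140711;0,140711\n"
    "H1;0,0999;0,0999\n"
    "H6;0;0\n"
    "H10;0,140711;0,1125688\n"
)

# decimal=',' tells the parser to treat the comma as the decimal point,
# so the columns are parsed as floats rather than strings.
df = pd.read_csv(io.StringIO(csv_text), sep=";", decimal=",", index_col="id")

print(df.dtypes)
print(df)
```

Because the values are parsed as numbers at load time, no string replacement is needed afterwards.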

wjandrea
Lee
  • Sweet! With previous suggestions I was getting errors with treating my data as strings. – jared Oct 14 '20 at 10:11
  • This does not have the same effect as what the asker wants. With this approach, when using to_dict on the DataFrame, for example, it will export with a comma decimal separator when it should be a dot. – bloub Nov 21 '22 at 17:19

You need to assign the result of your operation back, as the operation isn't in place. Besides, you can use apply, or stack and unstack, with the vectorised str.replace to do this more quickly:

In [5]:
df.apply(lambda x: x.str.replace(',','.'))

Out[5]:
          1-8        1-7
H0   0.140711   0.140711
H1     0.0999     0.0999
H2      0.001      0.001
H3   0.140711   0.140711
H4   0.140711   0.140711
H5   0.140711   0.140711
H6          0          0
H7          0          0
H8   0.140711   0.140711
H9   0.140711   0.140711
H10  0.140711  0.1125688
H11  0.140711  0.1125688
H12  0.140711  0.1125688
H13  0.140711  0.1125688
H14  0.140711   0.140711
H15  0.140711   0.140711
H16  0.140711   0.140711
H17  0.140711   0.140711
H18  0.140711   0.140711
H19  0.140711   0.140711
H20  0.140711   0.140711
H21  0.140711   0.140711
H22  0.140711   0.140711
H23  0.140711   0.140711

In [4]:    
df.stack().str.replace(',','.').unstack()

Out[4]:
          1-8        1-7
H0   0.140711   0.140711
H1     0.0999     0.0999
H2      0.001      0.001
H3   0.140711   0.140711
H4   0.140711   0.140711
H5   0.140711   0.140711
H6          0          0
H7          0          0
H8   0.140711   0.140711
H9   0.140711   0.140711
H10  0.140711  0.1125688
H11  0.140711  0.1125688
H12  0.140711  0.1125688
H13  0.140711  0.1125688
H14  0.140711   0.140711
H15  0.140711   0.140711
H16  0.140711   0.140711
H17  0.140711   0.140711
H18  0.140711   0.140711
H19  0.140711   0.140711
H20  0.140711   0.140711
H21  0.140711   0.140711
H22  0.140711   0.140711
H23  0.140711   0.140711

The key thing here is to assign back the result:

df = df.stack().str.replace(',','.').unstack()
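
One caveat worth sketching: the array in the question mixes strings with integer zeros, and the `.str` accessor returns NaN for elements that are not strings. The example below is a sketch under that assumption, using a small hand-built frame (the sample rows and the names `naive` and `converted` are hypothetical). It casts each column to `str` first, replaces the comma, then converts to float.

```python
import pandas as pd

# Small hypothetical frame mirroring the question: strings with comma
# decimals mixed with a plain integer 0.
df = pd.DataFrame(
    {
        "1-8": ["0,140711", "0,0999", 0, "0,140711"],
        "1-7": ["0,140711", "0,0999", 0, "0,1125688"],
    },
    index=["H0", "H1", "H6", "H10"],
)

# Without casting, the integer entry becomes NaN because .str only
# operates on string elements.
naive = df["1-8"].str.replace(",", ".", regex=False)

# Cast to str first so 0 becomes "0", replace the comma literally,
# then convert the whole column to float. The result is assigned to a
# new name; nothing is modified in place.
converted = df.apply(
    lambda col: col.astype(str).str.replace(",", ".", regex=False).astype(float)
)

print(naive)
print(converted.dtypes)
print(converted)
```

Passing `regex=False` makes the replacement a literal one, which sidesteps any difference in the default regex behaviour between pandas versions.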

EdChum

If you need to replace commas with dots in particular columns, try

    data["column_name"]=data["column_name"].str.replace(',','.')

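This avoids the "'str' object has no attribute 'str'" error, which occurs when .str is called on an individual string element (for example inside applymap) rather than on a whole column.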
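
A short sketch of the single-column approach, followed by a numeric conversion with `pd.to_numeric`. The frame contents and the `other` column are hypothetical and only serve to show that untouched columns keep their commas.

```python
import pandas as pd

# Hypothetical data: one numeric-looking column with comma decimals and
# one text column whose commas should be left alone.
data = pd.DataFrame(
    {
        "column_name": ["0,140711", "0,001", "0,1125688"],
        "other": ["a,b", "c,d", "e,f"],
    }
)

# .str is an accessor on the Series (the whole column), so it is called
# on data["column_name"], not on each individual string.
data["column_name"] = data["column_name"].str.replace(",", ".", regex=False)

# The values are still strings at this point; convert them to numbers.
data["column_name"] = pd.to_numeric(data["column_name"])

print(data.dtypes)
print(data)
```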

Swathi