pandas .replace not working in Python 2.7

Question

I'm having a lot of trouble understanding how pandas' .replace handles special characters.

I have a dataframe in which I need to replace some text with Greek letters. I have done this before, in the same code, and it worked perfectly, but the second time it did not work, and I could not figure out why.

import pandas as pd

df = pd.DataFrame({'a': ['Aa_alpha_bb', 'Cc_beta_dd', 'Ee_gamma_ff']})

# then I did:
df['a'].replace({'_alpha_': 'α', '_beta_': 'β', '_gamma_': 'γ'}, regex=True, inplace=True)

But I get the following error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 0: ordinal not in range(128)

I have also tried using df['a'].astype(str), but to no avail.

I have no experience with special characters and encoding in Python. I'm also new to Python 2.7, which the project I'm working on requires. Can someone help me?

I can't reproduce this. What kind of environment are you in? If *nix, what are the values of the environment variables 'LC_ALL', 'LC_CTYPE', 'LANG' and 'LANGUAGE'? Or are you on Windows? – snakecharmerb, Aug 10 '21 at 04:35
@MDR If the issue was a missing encoding cookie there would be a `SyntaxError`, not a `UnicodeDecodeError`. – snakecharmerb, Aug 10 '21 at 04:37

Anton van der Wel · Answer 1 · 2021-08-09T20:22:14.907

1

I am pretty sure this is because your file is not UTF-8 encoded. See this other Stack Overflow question: UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 1

Adding the following at the top of your script:

# coding: utf-8
import sys
reload(sys)
sys.setdefaultencoding('utf-8')

should do the trick. In Python 3 the default encoding is already UTF-8.
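For context, here is a minimal sketch of where the byte 0xce in the error message comes from, and of one way to sidestep the implicit ASCII decoding by keeping every string unicode. It uses only the standard library `re` module as a stand-in for the regex replacement, so it is an illustration of the mechanism and not a reproduction of the asker's pandas setup. The names `mapping`, `values`, `pattern` and `replaced` are made up for the example. The `u'...'` prefix is accepted by Python 2.7 and by Python 3.3+.

```python
# -*- coding: utf-8 -*-
import re

# u'α' is U+03B1; its UTF-8 encoding is the two bytes 0xCE 0xB1.
alpha_bytes = u'α'.encode('utf-8')
first_byte = bytearray(alpha_bytes)[0]
print(hex(first_byte))  # 0xce, the byte named in the traceback

# Decoding those bytes with the ASCII codec fails, which is what
# Python 2 does implicitly when a byte string meets a unicode string.
try:
    alpha_bytes.decode('ascii')
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# Keeping patterns, replacements and data all unicode avoids that
# implicit decode (hypothetical names, standard-library regex only).
mapping = {u'_alpha_': u'α', u'_beta_': u'β', u'_gamma_': u'γ'}
values = [u'Aa_alpha_bb', u'Cc_beta_dd', u'Ee_gamma_ff']
pattern = re.compile(u'|'.join(re.escape(key) for key in mapping))
replaced = [pattern.sub(lambda m: mapping[m.group(0)], v) for v in values]
```

Applied to the dataframe in the question, the same idea would presumably mean writing the keys and values of the `.replace` dict as `u'...'` literals and making sure the column holds unicode strings. I have not tested that against the asker's environment, so treat it as an assumption.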

edited Aug 09 '21 at 20:22

answered Aug 09 '21 at 20:18


Don't do this, it's a hack. Set the `PYTHONIOENCODING` environment variable. – snakecharmerb Aug 10 '21 at 12:10
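To answer the environment question from the comments, a small sketch like the one below prints the relevant variables and the encodings Python has picked up. It only reads values and changes nothing. The names `names` and `env_report` are made up for the example, and it should run on both Python 2.7 and Python 3.

```python
import locale
import os
import sys

# Environment variables mentioned in the comments, plus PYTHONIOENCODING.
names = ['LC_ALL', 'LC_CTYPE', 'LANG', 'LANGUAGE', 'PYTHONIOENCODING']
env_report = {name: os.environ.get(name) for name in names}

for name in names:
    # None means the variable is not set in this environment.
    print('%s=%r' % (name, env_report[name]))

# Encodings Python is actually using.
print('default encoding: %s' % sys.getdefaultencoding())
print('stdout encoding: %r' % getattr(sys.stdout, 'encoding', None))
print('preferred locale encoding: %s' % locale.getpreferredencoding())
```

On Python 2.7 the default encoding is normally `ascii`, which is the codec named in the error message.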
