I am using a dictionary to store some character pairs in Python (I am replacing umlaut characters). Here is what it looks like:
umlautdict={
'ae': 'ä',
'ue': 'ü',
'oe': 'ö'
}
Then I run my inputwords through it like so:
for item in umlautdict.keys():
outputword=inputword.replace(item,umlautdict[item])
But this does not do anything (no replacement happens). When I printed out my umlautdict, I saw that it looks like this:
{'ue': '\xfc', 'oe': '\xf6', 'ae': '\xc3\xa4'}
Of course that is not what I want; however, trying things like unicode()
(--> Error) or pre-fixing u
did not improve things.
If I type the 'ä' or 'ö' into the replace()
command by hand, everything works just fine. I also changed the settings in my script (working in TextWrangler) to # -*- coding: utf-8 -*-
as it would net even let me execute the script containing umlauts without it.
So I don't get...
Why does this happen? Why and when do the umlauts change from "good to evil" when I store them in the dictionary?
How do I fix it?
Also, if anyone knows: what is a good resource to learn about encoding in Python? I have issues all the time and so many things don't make sense to me / I can't wrap my head around.
I'm working on a Mac in Python 2.7.10. Thanks for your help!