-1

I have been trying to replace German letters with their English transliterations:

Ä -> Ae
Ü -> Ue
ß -> ss

I tried this way:

re.sub("ö","oe",wordLineElements)
re.sub("Ö","Oe",wordLineElements)
re.sub("ä","ae",wordLineElements)
re.sub("Ä","Ae",wordLineElements)
re.sub("ü","ue",wordLineElements)
re.sub("Ü","Ue",wordLineElements)
re.sub("ß","ss",wordLineElements)

But it looks like it does not work, so I need to do it with one re.sub() call.

What is the regex way of doing it?

And, if it's OK to ask, what is a general way of using regex?

Samara92
  • Use a simple `s = s.replace("x", "Yy").replace(...,...)...` – Wiktor Stribiżew Jun 29 '16 at 18:45
  • @WiktorStribiżew I could do that of course, but to develop my knowledge I would like to see how to do it the regex way, so I can use it in the future – Samara92 Jun 29 '16 at 18:47
  • `re.sub` returns a new string with the replacement. So you need something like `wordLineElements = re.sub("ö","oe",wordLineElements)`. There is also a great answer [here](http://stackoverflow.com/questions/6116978/python-replace-multiple-strings) that does multiple replacements with some clever code. – GWW Jun 29 '16 at 18:47
  • @GWW I would argue that your comment makes for a good answer. – Ramon Jun 29 '16 at 18:55
  • There is no knowledge to gain from using a regexp in such a case. The best answer is by @WiktorStribiżew – Krzysztof Krasoń Jun 29 '16 at 18:58
  • @Ramon I agree that his answer is good enough to make the code work, but I would still like to have it in regex. I looked on the Internet, but it was not clear enough, which is why I asked here, so I could learn by example. Thank you for your answer :) – Samara92 Jun 29 '16 at 19:00

4 Answers

6

You don't need regular expressions; str.translate() would be a better choice:

d = {
    "ö": "oe",
    "Ö": "Oe",
    "ä": "ae",
    "Ä": "Ae",
    "ü": "ue",
    "Ü": "Ue",
    "ß": "ss"
}

s = "Ä test ß test Ü"
print(s.translate({ord(k): v for k, v in d.items()}))

Prints:

Ae test ss test Ue
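If the same mapping is applied to many strings, the table can be built once up front. The following is a minimal sketch, assuming Python 3, where str.maketrans accepts a dict whose keys are single characters; the names UMLAUT_TABLE and transliterate are illustrative and not part of the original answer.

```python
# Build the translation table once; str.maketrans accepts a dict whose
# keys are single characters and whose values are replacement strings.
UMLAUT_TABLE = str.maketrans({
    "ö": "oe", "Ö": "Oe",
    "ä": "ae", "Ä": "Ae",
    "ü": "ue", "Ü": "Ue",
    "ß": "ss",
})


def transliterate(text):
    """Return a copy of text with German umlauts and ß spelled out."""
    return text.translate(UMLAUT_TABLE)


print(transliterate("Ä test ß test Ü"))  # Ae test ss test Ue
```

Characters that are not in the table pass through unchanged, so the function is safe to call on text without any umlauts.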
alecxe
2

The issue is that re.sub doesn't modify the string in place; Python strings are immutable, so it returns a new string. Try:

wordLineElements = re.sub("ö","oe",wordLineElements)
wordLineElements = re.sub("Ö","Oe",wordLineElements)
wordLineElements = re.sub("ä","ae",wordLineElements)
wordLineElements = re.sub("Ä","Ae",wordLineElements)
wordLineElements = re.sub("ü","ue",wordLineElements)
wordLineElements = re.sub("Ü","Ue",wordLineElements)
wordLineElements = re.sub("ß","ss",wordLineElements)
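To see the difference between discarding and keeping the return value, here is a small sketch. The sample word and the variable names original, word, and replacements are made up for illustration; the loop is just one way to avoid writing the same call seven times.

```python
import re

original = "Müßiggang"

# Calling re.sub without using its return value changes nothing,
# because the result is a new string that is thrown away here.
re.sub("ü", "ue", original)
print(original)  # Müßiggang

# Rebinding the name keeps each result; the loop applies every pair in turn.
replacements = {
    "ö": "oe", "Ö": "Oe",
    "ä": "ae", "Ä": "Ae",
    "ü": "ue", "Ü": "Ue",
    "ß": "ss",
}
word = original
for pattern, replacement in replacements.items():
    word = re.sub(pattern, replacement, word)
print(word)  # Muessiggang
```

This still scans the string once per pair, which is fine for a handful of replacements.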
PYOak
2

re.sub returns a new string with the replacement. So you need something like wordLineElements = re.sub("ö","oe",wordLineElements). There is also a great answer [here](http://stackoverflow.com/questions/6116978/python-replace-multiple-strings) that does multiple replacements with some clever code.

GWW
0

I guess you have already got the solution, but here it is if you want to do it with the re module:

>>> import re
>>> sub_dict = {
...     "ö": "oe",
...     "Ö": "Oe",
...     "ä": "ae",
...     "Ä": "Ae",
...     "ü": "ue",
...     "Ü": "Ue",
...     "ß": "ss",
... }
>>> # One alternation pattern (Python 3): ö|Ö|ä|Ä|ü|Ü|ß
>>> sub_regex = re.compile("|".join(re.escape(letter) for letter in sub_dict))
>>> # The callback looks up the replacement for whichever letter matched.
>>> sub_regex.sub(lambda match: sub_dict[match.group(0)], "asdasdsüadsadas")
'asdasdsueadsadas'
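As for a general way of doing this with a regex, the same idea can be wrapped in a small helper. This is only a sketch: replace_many is a hypothetical name, and it assumes the keys should be matched literally, which is why each one goes through re.escape.

```python
import re


def replace_many(text, mapping):
    """Replace every key of mapping found in text with its value, in one pass."""
    if not mapping:
        return text  # nothing to replace; avoids compiling an empty pattern
    # Longest keys first, so a longer key wins over one that is its prefix.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    # The callback receives each match and returns its replacement.
    return pattern.sub(lambda match: mapping[match.group(0)], text)


german = {
    "ö": "oe", "Ö": "Oe",
    "ä": "ae", "Ä": "Ae",
    "ü": "ue", "Ü": "Ue",
    "ß": "ss",
}
print(replace_many("Über die Straße", german))  # Ueber die Strasse
```

Because the whole substitution happens in a single pass, a replacement is never re-scanned, which matters when one value contains another key.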
Devi Prasad Khatua