I have a UI that takes German text, among other things, and translates it into English sentences.
# -*- coding: utf-8 -*-
from googletrans import Translator

def tr(s):
    # Translate the German string s to English and return the translated text.
    translator = Translator()
    return translator.translate(s, src='de', dest='en').text
Sometimes I get weird characters from the translator. For example:
DE: Pascal und PHP sind Programmiersprachen für Softwareentwickler und Ingenieure.
googletrans EN(utf8): Pascal and PHP are programming languages for software developers and engineers.
This is how the string looks when displayed as UTF-8. When I open it in the Windows text editor, it looks like this:
googletrans EN: Pascal and PHP are programming languages ​​for software developers and engineers.
As you can see, there are 2 weird characters before "for software", which the translate() function returns. These characters are also in the "googletrans EN(utf8)" string. You can't see them, but when you step through the string with the arrow keys, the cursor doesn't move for 2 key presses just before "for software". So the characters are there but invisible. (You may not be able to reproduce this here, because the website may already have reformatted the string.)
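To make the hidden characters visible, the string can be scanned for Unicode format characters. This is only a minimal sketch: it assumes Python 3 and that the invisible characters are format characters (Unicode category "Cf") such as U+200B ZERO WIDTH SPACE. The helper name `show_invisible` and the sample string are made up for illustration and are not part of googletrans.

```python
import unicodedata

def show_invisible(text):
    """Return (index, code point, Unicode name) for each format character in text."""
    found = []
    for i, ch in enumerate(text):
        # Category "Cf" covers format characters such as ZERO WIDTH SPACE.
        if unicodedata.category(ch) == "Cf":
            found.append((i, "U+%04X" % ord(ch), unicodedata.name(ch, "<unnamed>")))
    return found

# Hypothetical sample with two zero-width spaces before "for".
sample = "programming languages \u200b\u200bfor software developers"
for entry in show_invisible(sample):
    print(entry)
```

If the translated string really contains such characters, each one is listed with its position and name, which shows what the cursor is stepping over.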
Sometimes other invisible characters also appear after the translation.
I need these characters removed. I can't restrict the output to ASCII only, because I also need to save German characters such as "ö, ä, ü, ß" to a txt file. Is this just an encoding issue that I don't understand, or is something else wrong?
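One possible way to remove such characters without falling back to ASCII is to filter by Unicode category instead of by code range. The sketch below assumes Python 3 and that everything unwanted falls into category "Cf" (format characters). The function name `strip_format_chars` and the sample string are hypothetical. Note that "Cf" also includes characters such as the soft hyphen, so the filter may be broader than needed.

```python
import unicodedata

def strip_format_chars(text):
    """Drop Unicode format characters (category Cf) and keep everything else."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")

# Hypothetical sample: two zero-width spaces plus German characters that must survive.
raw = "languages \u200b\u200bfor Softwareentwickler: ö, ä, ü, ß"
clean = strip_format_chars(raw)
print(clean)
```

The letters ö, ä, ü and ß are in the letter categories, so they pass through unchanged. When saving the result in Python 3, opening the file with `open(path, "w", encoding="utf-8")` writes it as UTF-8 regardless of the Windows default encoding.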