Unicode values to non-unicode characters
I have to convert all accented Latin characters to their corresponding plain English letters. Can I use Python to do it? Or is there a mapping available?
For example, Ramírez Sánchez should be converted to Ramirez Sanchez.
It looks like what you want is accent removal. You can do this with:
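import unicodedata

def strip_accents(text):
    # Decompose each character (NFKD), then drop the combining marks (category 'Mn').
    return ''.join(char for char in
                   unicodedata.normalize('NFKD', text)
                   if unicodedata.category(char) != 'Mn')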
>>> strip_accents('áéíñóúü')
'aeinouu'
>>> strip_accents('Ramírez Sánchez')
'Ramirez Sanchez'
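This works fine for Spanish, but note that it doesn't work for every language: it only removes combining marks, so letters that Unicode does not decompose into a base letter plus an accent are left unchanged.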
>>> strip_accents('ø')
'ø'
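For letters such as ø that NFKD does not decompose, one possible workaround is to apply an explicit replacement table after stripping the accents. The following is only a sketch: the function name to_ascii_letters and the FALLBACK table are hypothetical, and the table is deliberately incomplete, so you would need to extend it for the languages you care about.

```python
import unicodedata

# Hypothetical, incomplete fallback table for letters that have no
# Unicode decomposition into "base letter + combining mark".
FALLBACK = str.maketrans({
    'ø': 'o', 'Ø': 'O',
    'ł': 'l', 'Ł': 'L',
    'đ': 'd', 'Đ': 'D',
    'æ': 'ae', 'Æ': 'AE',
    'ß': 'ss',
})

def to_ascii_letters(text):
    # Step 1: decompose and drop combining marks (same idea as strip_accents).
    decomposed = unicodedata.normalize('NFKD', text)
    stripped = ''.join(c for c in decomposed
                       if unicodedata.category(c) != 'Mn')
    # Step 2: replace the remaining special letters via the fallback table.
    return stripped.translate(FALLBACK)
```

Characters that are neither decomposable nor listed in the table pass through unchanged, so it is worth checking the output for leftover non-ASCII characters if you need strictly ASCII results.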