I have a CSV file of Spanish sentences and their English translations, separated by commas. The Spanish text contains Unicode escape sequences such as \u00e9. When I read the file and print the Spanish words, the escapes are printed literally instead of the accented characters. But if I paste the same string into a print statement directly, the correct characters are displayed. Why is this?

words = open('./Spanish Sentences/Englishsentences.csv').read().splitlines()
for word in words:
    print(word)
    var = word.split(',')[0]
    print(var)
    print('La abrac\u00e9')
    var = 'La abrac\u00e9.'
    print(var)
Output:
La abrac\u00e9.,I hugged her.,He hugged her.,I hugged them.,I gave her a hug.,
La abrac\u00e9.
La abracé
La abracé.
themrdan

1 Answer

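The problem is not open() itself. The file stores \u00e9 as six literal characters (a backslash, a u, and four hex digits), and reading a file never interprets escape sequences. Python only converts them when they appear in a string literal in source code, which is why the pasted string prints correctly.
To decode the escapes while reading, change open('./Spanish Sentences/Englishsentences.csv') to open('./Spanish Sentences/Englishsentences.csv', encoding='unicode_escape'). Note that unicode_escape treats all other bytes as Latin-1, so this works cleanly only if the rest of the file is plain ASCII.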
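A minimal sketch of the difference, assuming a file whose accented letters are stored as literal escape sequences. The temporary file name and the sample row are made up for illustration and stand in for the real CSV:

```python
import os
import tempfile

# Hypothetical sample row: \u00e9 is stored as six literal characters,
# which is what the raw string (r"...") reproduces here.
raw_line = r"La abrac\u00e9.,I hugged her.,He hugged her."

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sentences.csv")  # stand-in for the real file
    with open(path, "w", encoding="ascii") as f:
        f.write(raw_line + "\n")

    # Ordinary text read: the escape sequence stays as literal text.
    with open(path, encoding="utf-8") as f:
        plain = f.read().splitlines()[0].split(",")[0]

    # Read with the unicode_escape codec: the escape is decoded to "é".
    with open(path, encoding="unicode_escape") as f:
        decoded = f.read().splitlines()[0].split(",")[0]

print(plain)    # escape sequence shown literally
print(decoded)  # accented character shown
```

The first print shows the backslash sequence unchanged, and the second shows the accented character, which matches the behaviour described in the question.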

Mahmoud Youssef