Two different print results with same string

Question

I have a CSV file containing Spanish sentences and their English translations, separated by commas. The Spanish text contains Unicode escape sequences such as \u00e9. When I print a value read from the file, the escape sequence is shown literally. But if I paste the same string directly into a print statement, the correct character is displayed. Why is this?

words = open('./Spanish Sentences/Englishsentences.csv').read().splitlines()
for word in words:
    print(word)
    var = word.split(',')[0]
    print(var)
    print('La abrac\u00e9')
    var = 'La abrac\u00e9.'
    print(var)

La abrac\u00e9.,I hugged her.,He hugged her.,I hugged them.,I gave her a hug.,
La abrac\u00e9.
La abracé
La abracé.

Related: https://stackoverflow.com/questions/491921/unicode-utf-8-reading-and-writing-to-files-in-python — Cory Kramer, Jun 08 '20 at 19:16

score 1 · Accepted Answer · answered Jun 08 '20 at 19:34

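The file does not contain the character é. It contains the six literal characters \, u, 0, 0, e, 9, and open() reads them exactly as they are. Python only interprets \u00e9 as an escape sequence when it appears in a string literal in source code, which is why the pasted string prints correctly.
To have the escapes decoded while reading, change open('./Spanish Sentences/Englishsentences.csv') to open('./Spanish Sentences/Englishsentences.csv', encoding='unicode_escape'). Note that this codec treats any non-ASCII bytes in the file as Latin-1, so it is only safe when the rest of the file is plain ASCII.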
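As a rough illustration of the difference between the text stored in the file and a string literal in source code, here is a minimal sketch. The names raw, literal and decode_u_escapes are made up for this example; the regex helper is a hypothetical alternative (not part of the answer above) for text that also contains real non-ASCII characters, and it only handles four-digit \uXXXX sequences.

```python
import re

# What a line read from the CSV holds: backslash, 'u' and four hex digits.
raw = r"La abrac\u00e9."
# What a string literal produces: the parser resolves the escape to 'é'.
literal = "La abrac\u00e9."

print(len(raw), len(literal))  # 15 10

# Same decoding that encoding='unicode_escape' applies; assumes ASCII-only text.
via_codec = raw.encode("ascii").decode("unicode_escape")

# Hypothetical helper: replaces only literal \uXXXX sequences and leaves
# every other character, including real non-ASCII ones, untouched.
_U_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def decode_u_escapes(text: str) -> str:
    return _U_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


via_regex = decode_u_escapes(raw)

print(via_codec == literal, via_regex == literal)  # True True
```

With the helper, the file could be opened normally (for example with encoding='utf-8') and each field passed through decode_u_escapes after splitting.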

Mahmoud Youssef
