
Why does the encoding change in Python 2.7 when I iterate over the items of a list?

test_list = ['Hafst\xc3\xa4tter', 'asbds@ages.at']

Printing the list:

print(test_list)

gets me this output:

['Hafst\xc3\xa4tter', 'asbds@ages.at']

So far, so good. But why, when I iterate over the list like this:

for item in test_list:
    print(item)

do I get this output:

Hafstätter
asbds@ages.at

Why does the encoding change (or does it)? And how can I change the encoding within the list?

1 Answer


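The encoding isn't changing; these are just two different ways of displaying the same string. Printing the list shows the repr() of each item, which writes the non-ASCII bytes as escape codes for debugging, while printing an item on its own writes the bytes themselves: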

>>> test_list = ['Hafst\xc3\xa4tter', 'asbds@ages.at']
>>> print(test_list)
['Hafst\xc3\xa4tter', 'asbds@ages.at']
>>> for item in test_list:
...     print(item)
...     
Hafstätter
asbds@ages.at

But they are equivalent:

>>> 'Hafst\xc3\xa4tter' == 'Hafstätter'
True
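To make the two display paths explicit, here is a minimal sketch. It assumes only that printing a list uses repr() on each element, while printing a single item uses str(). The helper name list_display is hypothetical, and the b'' prefixes are there so the same bytes are used on Python 2.7 and Python 3.

```python
# Byte-string literals, so the data is identical on Python 2.7 and Python 3.
test_list = [b'Hafst\xc3\xa4tter', b'asbds@ages.at']

def list_display(items):
    """Hypothetical helper: rebuild str(list) from the repr() of each element."""
    return '[' + ', '.join(repr(item) for item in items) + ']'

# The hand-built string matches what print(test_list) writes.
print(list_display(test_list))
print(list_display(test_list) == str(test_list))
```

The escape codes therefore come from repr(), not from any change to the stored bytes.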

If you want to see lists displayed with the non-debugging output, you have to generate it yourself:

>>> print("['" + "', '".join(test_list) + "']")
['Hafstätter', 'asbds@ages.at']
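As for changing the encoding within the list, one possible approach is to decode each byte string into text. This is only a sketch, assuming the bytes really are UTF-8; decode_items is a hypothetical helper name, not something from the question.

```python
# The same bytes as in the question, written as byte-string literals.
test_list = [b'Hafst\xc3\xa4tter', b'asbds@ages.at']

def decode_items(items, encoding='utf-8'):
    """Hypothetical helper: decode every byte string in a list to text."""
    return [item.decode(encoding) for item in items]

text_list = decode_items(test_list)

# The two UTF-8 bytes \xc3\xa4 become the single character U+00E4.
print(text_list[0] == u'Hafst\xe4tter')
print(len(test_list[0]))   # length in bytes
print(len(text_list[0]))   # length in characters
```

Note that on Python 2.7, printing the decoded list still shows escapes (as u'Hafst\xe4tter'), because the list display keeps using repr().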

There is a reason for the debugging output. Two strings can look identical when printed yet contain different bytes:

>>> a = 'a\xcc\x88'
>>> b = '\xc3\xa4'
>>> a
'a\xcc\x88'
>>> print a,b   # should look the same, if not it is the browser's fault :)
ä ä
>>> a==b
False
>>> [a,b]      # In a list you can see the difference by default.
['a\xcc\x88', '\xc3\xa4']
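The two strings above differ because one is "a" followed by a combining diaeresis (U+0308) and the other is the precomposed character U+00E4. If the goal is to compare such strings by what they display, one option is Unicode normalization. The sketch below assumes the data has already been decoded to text and uses the standard unicodedata module.

```python
import unicodedata

a = u'a\u0308'   # 'a' followed by U+0308 COMBINING DIAERESIS
b = u'\xe4'      # U+00E4, the precomposed character

# These are the UTF-8 byte sequences shown in the session above.
print(a.encode('utf-8') == b'a\xcc\x88')
print(b.encode('utf-8') == b'\xc3\xa4')

# Different code points, so a plain comparison fails.
print(a == b)

# After normalizing to the composed form (NFC), the two compare equal.
print(unicodedata.normalize('NFC', a) == b)
```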
Mark Tolonen