When I do:
text = u"奥巴马讲话"
for c in text:
    print c
I get the expected result:
奥
巴
马
讲
话
But if I do:
text = u"𤭢"
for c in text:
    print c
I get:
�
�
I'm expecting to get:
𤭢
Why is this? I think it has something to do with the following fact:
In [1]: u"𤭢".encode("utf8")
Out[1]: '\xf0\xa4\xad\xa2'
"𤭢" is encoded using 4 bytes.
How can I loop through a Unicode string that contains characters like this, such as u"𤭢"?
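One workaround I can think of, as a rough sketch only: I am assuming the cause is that my interpreter is a narrow build (sys.maxunicode == 0xFFFF), which would store this character as two UTF-16 surrogate code units. Under that assumption, the loop could glue each high/low surrogate pair back together. The helper name iter_chars is my own, not a library function:

```python
def iter_chars(text):
    """Yield one string per code point, keeping UTF-16 surrogate pairs together.

    Hypothetical helper: assumes the split comes from a narrow build that
    stores non-BMP characters as a high surrogate followed by a low surrogate.
    """
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        # A high surrogate (U+D800..U+DBFF) directly followed by a low
        # surrogate (U+DC00..U+DFFF) is one logical character.
        if (u"\ud800" <= ch <= u"\udbff" and i + 1 < n
                and u"\udc00" <= text[i + 1] <= u"\udfff"):
            yield text[i:i + 2]
            i += 2
        else:
            # BMP characters and unpaired surrogates pass through unchanged.
            yield ch
            i += 1
```

With this, `for c in iter_chars(text): print c` should print the 4-byte character once on a narrow build. On a wide build or Python 3, where the string already iterates by code point, it simply yields each character unchanged.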