If I do:
print "\xE2\x82\xAC"
print len("€")
print len(u"€")
I get:
€
3
1
But if I do:
print '\xf0\xa4\xad\xa2'
print len("𤭢")
print len(u"𤭢")
I get:
𤭢
4
2
In the second example, len() returned 2 instead of 1 for the one-character Unicode string u"𤭢".
Can someone explain to me why this is the case?
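In case it helps narrow things down, here is a small check I would run on the bytes from the second example. It decodes the four UTF-8 bytes by hand and counts how many UTF-16 code units the result needs. The variable names (raw, code_point, utf16_units) are just mine, and the idea that len() might be counting UTF-16 code units on my interpreter is only a guess, not something I have confirmed.

```python
# The four UTF-8 bytes from the second example.
raw = b'\xf0\xa4\xad\xa2'
b = bytearray(raw)  # gives integer byte values on both Python 2 and 3

# Decode the 4-byte UTF-8 sequence by hand:
# 3 payload bits from the lead byte, 6 from each continuation byte.
code_point = ((b[0] & 0x07) << 18) | ((b[1] & 0x3F) << 12) \
    | ((b[2] & 0x3F) << 6) | (b[3] & 0x3F)

# Count how many 16-bit code units the same text needs in UTF-16.
utf16_units = len(raw.decode('utf-8').encode('utf-16-le')) // 2

print('U+%04X' % code_point)  # U+24B62, a single code point above U+FFFF
print(utf16_units)            # 2
```

So the bytes do seem to encode one code point, but that code point lies above U+FFFF and takes two 16-bit units in UTF-16, which matches the 2 I am seeing. The euro sign (U+20AC) is below U+FFFF and needs only one unit.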