Consider this:
s = u"おはよう"
print len(s)
for c in s: print c
The output is
4
お
は
よ
う
which is what I expect.
Now with emojis:
s = u"hi "
The output is
5
h
i
????
????
Why is that, and how can I fix it? I have looked at various links before but can't get my head around it. Ideally I would like a solution that works for both Japanese and emoticons, but if it only works for ASCII and emoticons, I'm fine with that too.
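For reference, here is a minimal sketch of one possible workaround. It assumes the cause is a "narrow" Python 2 build (sys.maxunicode == 0xFFFF), where a character outside the Basic Multilingual Plane, such as an emoji, is stored as two UTF-16 surrogates, so len() and iteration see two items instead of one. The helper name code_points and the emoji U+1F601 are assumptions for illustration, not taken from the snippet above.

```python
# -*- coding: utf-8 -*-
import re
import sys

# Match a high surrogate followed by a low surrogate as one unit,
# otherwise match any single character (DOTALL so newlines count too).
_CODE_POINT = re.compile(u"[\ud800-\udbff][\udc00-\udfff]|.", re.DOTALL)


def code_points(text):
    """Hypothetical helper: split a unicode string into code points,
    keeping each surrogate pair together as a single item."""
    return _CODE_POINT.findall(text)


# True on a narrow build, False on a wide build or on Python 3.
is_narrow_build = sys.maxunicode == 0xFFFF

s = u"hi \U0001F601"  # assumed example emoji; any non-BMP character behaves the same
chars = code_points(s)
print(len(chars))  # 4: "h", "i", " ", and the emoji as one item
```

Looping over code_points(s) instead of s gives one item per character for ASCII, Japanese, and emoji alike. On a wide build or on Python 3 the pattern simply matches one character at a time, so the same code should behave identically there.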