I have a string in Bangla and I'm trying to access characters by index.

# -*- coding: utf-8 -*-
bstr = "তরদজ"
print bstr # This line is working fine
for i in bstr:
    print i, # question marks are printed

I don't know why it isn't working.

Colonel Thirty Two
dragfire
    This is the same issue as in How to handle multibyte string in Python http://stackoverflow.com/q/8346608/802365 – Édouard Lopez May 11 '15 at 15:38
    As your variable name indicates, `bstr` is a byte string, not a unicode string https://docs.python.org/2/tutorial/introduction.html#unicode-strings – Édouard Lopez May 11 '15 at 15:40

1 Answer

Decode the byte string into a unicode string before iterating:

>>> bstr = "তরদজ"
>>> for i in bstr.decode('utf-8'):
...     print i
... 
ত
র
দ
জ
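
To see why the original loop prints question marks, it can help to compare byte and character counts. In Python 2 a plain `"..."` literal is a byte string, so the loop walks over single bytes, and one byte of a multi-byte UTF-8 sequence is not a printable character on its own. The sketch below is not part of the original answer: the names `text`, `raw`, and `decoded` are illustrative, and it assumes the source file is saved as UTF-8. It is written to run on Python 2.7 as well as Python 3.3+.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

text = u"তরদজ"               # unicode string: 4 code points
raw = text.encode("utf-8")   # the bytes a plain "..." literal holds in Python 2

print(len(raw))    # bytes in the UTF-8 encoding
print(len(text))   # characters (code points)

# Decoding restores one element per character, so iteration works as expected.
decoded = raw.decode("utf-8")
chars = [ch for ch in decoded]
print(len(chars) == len(text))
```

Each of these four letters takes three bytes in UTF-8, which is why the byte string is three times as long as the unicode string.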
famousgarkin
    An alternative to calling `decode` explicitly is to assign a unicode string literal to `bstr` by prefixing the literal with `u`: `bstr = u"তরদজ"` – halex May 11 '15 at 15:58
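
Since the question is about access by index, here is a minimal sketch of indexing and slicing with the `u` prefix. The variable names `first`, `last`, and `middle` are illustrative and do not come from the thread, and the file is again assumed to be saved as UTF-8.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

bstr = u"তরদজ"     # the u prefix makes this a unicode string in Python 2

first = bstr[0]    # first character
last = bstr[-1]    # last character
middle = bstr[1:3] # slice holding the two middle characters

print(len(bstr))   # number of characters, not bytes
```

Whether printing the characters themselves displays correctly still depends on the terminal's encoding, so this sketch prints only the length.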