I'm trying to get the index of 'J' in a string similar to `myString = "███ ███ J ██"`, so I use `myString.find('J')`, but it returns a really high value. If I replace '█' with 'M' or another letter of the alphabet, I get a lower value. I don't understand what causes that.
-
I can't reproduce your problem. – Navith Jun 09 '15 at 22:34
-
Because it isn't an ASCII character. If, for example, Python uses the common UTF-8 encoding scheme for its internal strings, this character will be represented by *three* one-byte codes: `0xE2 0x96 0x88`. – Jongware Jun 09 '15 at 22:34
-
Which Python version are you using? It could be an issue with Unicode handling in Python 2.x. – Aereaux Jun 09 '15 at 22:34
-
`lower value=-1` because except J there is no other alphabet – Ajay Jun 09 '15 at 22:35
-
@Aereaux It is. If you declare it as a Unicode String, i.e. myString = u"███ ███ J ██", find works fine. – Sinkingpoint Jun 09 '15 at 22:36
-
@Jongware and Aereaux have the correct answer... – Shashank Jun 09 '15 at 22:39
3 Answers
Try doing `myString = u"███ ███ J ██"`. This will make it a Unicode string instead of the Python 2.x default of a byte string, in which each '█' takes up three bytes when the source is UTF-8, so `find` counts bytes rather than characters.
If you are reading it from a file or a file-like object, instead of doing `file.read()`, do `file.read().decode('utf-8-sig')`.
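To see where the "really high value" comes from, here is a minimal sketch. It assumes the text is UTF-8 encoded and uses an explicit `.encode("utf-8")` to stand in for what a plain Python 2 `str` literal holds; the variable names are only for illustration.

```python
# -*- coding: utf-8 -*-
text = u"███ ███ J ██"       # Unicode text: one element per character
raw = text.encode("utf-8")    # UTF-8 bytes: each block character takes 3 bytes

char_index = text.find(u"J")  # position counted in characters
byte_index = raw.find(b"J")   # position counted in bytes

print(char_index)  # 8
print(byte_index)  # 20
```

The byte index is 3 + 3 + 3 + 1 + 3 + 3 + 3 + 1 = 20, which is what a byte-string `find` reports, while the character index is 8.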

To check your encoding, run: `python -c 'import sys; print(sys.getdefaultencoding())'`
For Python 2.x the output is `ascii`, and this is the default encoding for your programs. To handle non-ASCII characters, Python 2 provides a `unicode` type. See for yourself: create a variable `myString = u"███ ███ J ██"` and call the `.find('J')` method on it. The `u` prefix tells the interpreter that this is a Unicode string. You can then use this variable as if it were a normal `str`.
I've used "Unicode" in some places where I should have written "UTF-8". For the difference, check this great answer if you want to.
In Python 3.x, `str` is a Unicode string by default, so this problem does not occur.

-
And what if my string comes from a file: `myFile = open("map/map1", "r")` `myMap = (myFile.read())`? – mel Jun 09 '15 at 23:07
-
Use `myFile = open("map/map1", "r")` `myMap = (myFile.read().decode('utf-8-sig'))` – Aereaux Jun 09 '15 at 23:53
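A sketch of the file case, assuming the file is UTF-8 encoded. `io.open` accepts an `encoding` argument on both Python 2.6+ and 3.x and returns already-decoded text, so no separate `.decode()` call is needed. The temporary file below is a hypothetical stand-in for `map/map1`.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# Hypothetical stand-in for the map file; written here only so the sketch runs.
path = os.path.join(tempfile.mkdtemp(), "map1")
with io.open(path, "w", encoding="utf-8") as f:
    f.write(u"███ ███ J ██")

# 'utf-8-sig' decodes UTF-8 and also skips a leading byte order mark if present.
with io.open(path, "r", encoding="utf-8-sig") as f:
    my_map = f.read()

j_index = my_map.find(u"J")
print(j_index)  # 8
```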
Check the settings of the console/ssh client you are using. Set it to be UTF-8.
