I want to convert Unicode into ASCII characters, send them through a channel that only accepts ASCII characters, and then transform them back into proper Unicode.

I'm dealing with Unicode characters like ɑ in Python 3.5.

ord("ɑ") gives me 63, which is the same as what ord("?") gives me. This means simply using ord() and chr() doesn't work. How do I get the right conversion?

Christian
  • check http://stackoverflow.com/questions/19527279/python-unicode-to-ascii-conversion – Anoop May 09 '16 at 15:51
  • `ord("ɑ")` doesn't give me 63. Can you provide a self-contained example? `ord` should work fine on unicode strings. – BrenBarn May 09 '16 at 15:51
  • Try `ord(u"ɑ")`, it should work. Note the `u`. – Mark Ransom May 09 '16 at 15:51
  • Are you sure you're on Python 3? If you are on Python 3, are you sure you have your encodings correct? – user2357112 May 09 '16 at 15:55
  • Interestingly, I either get `593` (Python 3) or a `TypeError` (Python 2). That's from copy-pasting `ord("ɑ")` from the question (without `u` prefix). With the `u` prefix, I get `593` for both Python 2 and 3. –  May 09 '16 at 15:55
  • `ord` should work. From the [docs](https://docs.python.org/3/library/functions.html#ord): "`ord(c)`: Given a string representing one Unicode character, return an integer representing the Unicode code point of that character. For example, `ord('a')` returns the integer 97 and `ord('€')` (Euro sign) returns 8364. This is the inverse of `chr()`." Did you try `ascii(ord(thing))`? – wwii May 09 '16 at 16:08

3 Answers

You can convert a number to a hex string with `"0x%x" % 255`, where 255 is the number you want to convert to hex.

To do this with `ord`, you could use `"0x%x" % ord("a")`, or whatever character you want.

You can remove the `0x` part of the string if you don't need it. If you want the hex digits to be capitalized (A-F), use `"0x%X" % ord("a")`.
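As a rough sketch of how this could cover the full round trip the question asks about, the snippet below encodes each character's code point as hex and decodes it again. The helper names `to_hex` and `from_hex` and the space separator are assumptions made for illustration, not part of any library.

```python
def to_hex(text):
    # Write each character's code point as lowercase hex, separated by spaces.
    # The result contains only ASCII characters.
    return " ".join("%x" % ord(ch) for ch in text)


def from_hex(encoded):
    # Reverse of to_hex: parse each hex field and turn it back into a character.
    return "".join(chr(int(part, 16)) for part in encoded.split())


encoded = to_hex("ɑ?")
decoded = from_hex(encoded)
print(encoded)  # 251 3f
print(decoded)  # ɑ?
```

A separator is needed because code points have different hex lengths; a fixed-width format would be another option.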

Paul Virally

I want to transfer unicode into ascii characters, transfer them through a channel that only accepts ascii characters and then transform them back into proper unicode.

>>> import json
>>> json.dumps('ɑ')
'"\\u0251"'
>>> json.loads(_)
'ɑ'
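
The same idea as a small script, as a sketch: `json.dumps` escapes non-ASCII characters by default (`ensure_ascii=True`), so the intermediate string should be safe for an ASCII-only channel. The variable names and the sample text are made up for illustration.

```python
import json

text = "IPA: ɑ, emoji: 😀"

# ensure_ascii=True is the default, so every non-ASCII character
# is written as a \uXXXX escape (a surrogate pair for non-BMP characters).
wire = json.dumps(text)

# Sanity check: this raises UnicodeEncodeError if any non-ASCII slipped through.
wire_bytes = wire.encode("ascii")

# The receiving side decodes the bytes and parses the JSON string.
restored = json.loads(wire_bytes.decode("ascii"))
print(restored == text)  # True
```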
jfs

I found my error. I was running Python in the Windows console, and the Windows console mishandled the Unicode input.
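
A sketch of how that kind of loss can produce 63: if the character passes through a legacy code page that cannot represent it, it may be replaced by `?` before Python ever sees it. The code page `cp437` here is only an assumption for illustration; the console's actual code page may have been different.

```python
ch = "ɑ"
print(ord(ch))  # 593, the real code point (U+0251)

# Simulate a lossy trip through a legacy code page that has no ɑ.
# "cp437" is an assumed example, not necessarily what the console used.
lossy = ch.encode("cp437", errors="replace").decode("cp437")
print(lossy)       # ?
print(ord(lossy))  # 63
```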

Christian
  • you can write arbitrary Unicode characters to Windows console (and even display BMP characters if the font supports them). See [Python, Unicode, and the Windows console](http://stackoverflow.com/q/5419/4279) – jfs May 10 '16 at 19:50
  • @J.F.Sebastian: I solved my issue by switching from the Windows console to Python's IDLE. I think the question you reference doesn't deal with entering text via the console. – Christian May 10 '16 at 20:29
  • beware: IDLE doesn't support non-BMP characters (e.g., some emoji) otherwise it is a solution if you can't install anything. The solution in my answer there works for the input too. – jfs May 10 '16 at 20:34