So I am pretty sure this is a dumb question, but I am trying to get a deeper understanding of the Python chr() function. Also, I am wondering whether it is possible to always write the integer argument three digits long, or at some other fixed length, for all ASCII values?

chr(20) ## '\x14'
chr(020) ## '\x10'

Why is it giving me different answers? Does it think '020' is hex or something? Also, I am running Python 2.7 on Windows! -Thanks!

Chris Nguyen
  • In Python 2, a number starting with `0` is octal. In Python 3, it's a syntax error. – Mark Ransom Jan 28 '15 at 03:33
  • So is there a way to always have the integer argument be a fixed length? – Chris Nguyen Jan 28 '15 at 03:34
  • Only if you don't mind expressing all of your integers in octal digits – Jeremy Friesner Jan 28 '15 at 03:41
  • No, that won't work. The length is only fixed after the decimal number 64. :( – Chris Nguyen Jan 28 '15 at 03:55
  • I am trying to create a single block of integers from a string, and I was looking for a way to tell where each character sits within that block. If every x digits represent a single char, it'll save me a lot of time lol – Chris Nguyen Jan 28 '15 at 04:22
  • possible duplicate of [how to avoid python numeric literals beginning with "0" being treated as octal?](http://stackoverflow.com/questions/11513456/how-to-avoid-python-numeric-literals-beginning-with-0-being-treated-as-octal) – Brent Washburne Jul 02 '15 at 19:46

2 Answers

This has nothing to do with chr. It is all about numeric literals, and the convention is shared by many languages: a leading 0 indicates octal and a leading 0x indicates hexadecimal.

print 010 # 8
print 0x10 # 16
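
As a small sketch of the same point, the lines below write one integer in four notations and pass each to chr(). The `0o20` spelling is used instead of `020` because it is accepted by Python 2.6+ and Python 3 alike; the variable names are only illustrative.

```python
# Four spellings of the same integer. The 0o prefix works on Python 2.6+ and 3;
# the legacy spelling 020 is accepted only by Python 2.
values = [16, 0x10, 0o20, 0b10000]

# Every literal becomes the same int, so chr() cannot tell them apart.
all_equal = all(v == 16 for v in values)
chars = [chr(v) for v in values]

print(all_equal)
print(chars)
```

Each entry of `chars` is the same character, `'\x10'`, which is what the question saw for chr(020).
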
John Hua
  • Is there a way or function that allows chr() to only accept decimal integers instead of it evaluating everything? It's kind of annoying lol – Chris Nguyen Jan 28 '15 at 04:09
  • @ChrisNguyen, it has nothing to do with `chr`. The compiler creates the `int` object that gets passed as an argument. Integer literals can be represented in decimal (10), octal (8), and hexadecimal (16). You can't disable that. You can force the base by creating the `int` at runtime, e.g. `int('010', 10) == 10`. – Eryk Sun Jan 28 '15 at 04:16
  • @Chris Nguyen: Just to make it clear: `chr` doesn't evaluate everything. `16`, `0x10`, `020` (or `0o20`) and `0b10000` are all the same number. They look different in the _code_, but in _memory_ they're equal. – Matthias Jan 28 '15 at 07:01
  • An overview from parser to compiler. Look at `code = compile('chr(020)', '', 'eval');` `dis.dis(code)` The `LOAD_CONST` instruction loads the `int` object from `code.co_consts[0]`. Next look at the abstract syntax tree (as the compiler sees Python code): `astree = ast.parse('chr(020)', '', 'eval');` `ast.dump(astree)`. The literal is already an `int` here too, i.e. `astree.body.args[0].n == 16`. Finally, look at the parser's syntax tree: `stree = parser.expr('chr(020)');` `stree.totuple()`. There you see the token `(2, '020')`, where `token.tok_name[2] == 'NUMBER'`. – Eryk Sun Jan 28 '15 at 07:46
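
One possible way to get the fixed-width block of digits the asker describes is to keep the padded numbers as text and convert them with `int(chunk, 10)` at runtime, as suggested in the comments. This is only a sketch: `encode_fixed` and `decode_fixed` are hypothetical helper names, and a width of 3 assumes every character code is below 1000.

```python
def encode_fixed(text, width=3):
    """Turn each character into a zero-padded decimal code of `width` digits."""
    return ''.join('%0*d' % (width, ord(ch)) for ch in text)


def decode_fixed(digits, width=3):
    """Split the digit string into `width`-sized chunks and convert each back."""
    if len(digits) % width:
        raise ValueError('digit string length must be a multiple of width')
    chunks = [digits[i:i + width] for i in range(0, len(digits), width)]
    # int(chunk, 10) forces base 10, so '020' means twenty, not sixteen.
    return ''.join(chr(int(chunk, 10)) for chunk in chunks)


block = encode_fixed('Hi!')
print(block)                # 072105033
print(decode_fixed(block))  # Hi!
```

Because the padded codes are strings rather than integer literals, the leading zeros never reach the compiler and the octal rule does not apply.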

It makes sense to explain chr and ord together.

You are obviously using Python2 (because of the octal problem, Python3 requires 0o as the prefix), but I'll explain both.

In Python2, chr is a function that takes any integer from 0 to 255 and returns a one-character string containing just that byte (an extended-ascii character). unichr is the same but returns a unicode character, for code points up to 0x10FFFF on wide builds (narrow builds stop at 0xFFFF). ord is the inverse function, which takes a single-character string (of either type) and returns an integer.

In Python3, chr returns a single-character unicode string. The equivalent for byte strings is bytes([v]). ord still does both.
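
A short round trip illustrating the Python 3 behaviour described above. It assumes Python 3; the variable names are only illustrative.

```python
# Python 3 only: in Python 2, bytes([65]) would give the string '[65]'.
code_point = 65

as_text = chr(code_point)       # one-character str
as_bytes = bytes([code_point])  # one-byte bytes object

# ord() inverts both forms.
round_trip_text = ord(as_text)
round_trip_bytes = ord(as_bytes)

print(as_text, as_bytes, round_trip_text, round_trip_bytes)
```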

o11c