I have been playing around with the codecs module lately and I stumbled upon this behavior that I find rather weird:
codecs.encode(b'a', 'hex') returns b'61'.

My question is, why? I really didn't expect it to return b'61'. I was expecting b'\x61'.
The former is a bytes object with length 2 (len(b'61') == 2), whereas the latter one is a bytes object with length 1 (len(b'\x61') == 1).

I didn't expect this behavior at all, because b'a', which is supposed to be 1 byte, has become 2 bytes when encoded with the 'hex' codec.

What would you have done to convert an ASCII character to its hex-encoded bytes representation? What I did was:

codecs.decode(hex(ord('a'))[2:], 'hex')

But this feels like kind of a dirty hack.

Teddy Hartanto
  • https://docs.python.org/3/library/codecs.html#binary-transforms, "Convert operand to hexadecimal representation, with two digits per byte" – Ilja Everilä Aug 28 '16 at 22:10
  • Also as you yourself point out, `b'\x61'` is just another way of saying `b'a'`. They are **the same byte string** written in different **representations**. See http://stackoverflow.com/questions/7784148/understanding-repr-function-in-python. – Ilja Everilä Aug 28 '16 at 22:21
  • My bad. Thanks for pointing out the RTFM advice. – Teddy Hartanto Aug 29 '16 at 20:27

1 Answer

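The behaviour of the 'hex' codec is documented: its purpose is to produce a text representation of (possibly) binary data, using two hexadecimal digits per byte. That is why the 1-byte input b'a' becomes the 2-byte output b'61'.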
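
A minimal sketch of that round trip, assuming Python 3 (the variable names are only illustrative): each input byte is written out as two ASCII hex digits, and decoding reverses it.

```python
import codecs

# Encoding: every input byte becomes two ASCII hex digits.
encoded = codecs.encode(b'a', 'hex')
print(encoded)        # b'61'
print(len(encoded))   # 2

# Decoding: two hex digits turn back into one byte.
decoded = codecs.decode(encoded, 'hex')
print(decoded)        # b'a'
print(len(decoded))   # 1
```

So the result of encoding is the text "61" stored as bytes, not the single byte 0x61.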

If you want to convert the character 'a' to its bytes representation using ASCII, you don't need the codecs module; just use the bytes builtin.

>>> bytes('a','ascii')
b'a'

As noted in the comments, b'a' is equal to b'\x61': they are the same 1-byte object, and the repr simply shows printable ASCII bytes as characters.
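
A quick way to check this, as a sketch assuming Python 3.5 or later (needed for bytes.hex); the variable names are only for illustration:

```python
# b'a' and b'\x61' are two spellings of the same one-byte value.
same = b'a' == b'\x61'
print(same)              # True
print(len(b'\x61'))      # 1

# Character -> bytes without the codecs module.
as_bytes = 'a'.encode('ascii')
print(as_bytes)          # b'a'

# Hex text <-> bytes using the bytes methods.
hex_text = as_bytes.hex()
print(hex_text)          # 61
back = bytes.fromhex(hex_text)
print(back)              # b'a'
```

Here str.encode does the same job as bytes('a', 'ascii'), and bytes.hex / bytes.fromhex are one alternative to the 'hex' codec when a str of hex digits is what you want.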

James K