4
a = b'\x00\x01'
ra = repr(a)   # ra == "b'\\x00\\x01'"
assert invert_repr(ra) == a

What is the correct implementation of invert_repr? Should it use the string_escape or unicode_escape codec?

vaultah
ShenLei
  • 2
    Are you using `repr()` to serialise and unserialise data? Don't; Python syntax is not meant to be a serialisation format. Use `pickle`, `marshal` or `json` instead. – Martijn Pieters Jul 04 '16 at 08:51

1 Answer

5

Use eval or equivalent:

from ast import literal_eval
a = b'\x00\x01'
ra = repr(a)
assert literal_eval(ra) == eval(ra) == a # no error

ast.literal_eval is safer than eval.
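To illustrate the difference, here is a minimal sketch. The helper name `is_plain_literal` is made up for this example. `literal_eval` accepts only Python literals, so a string containing a function call is rejected instead of being executed, whereas `eval` would run it.

```python
from ast import literal_eval


def is_plain_literal(text):
    """Return True if literal_eval accepts text (hypothetical helper)."""
    try:
        literal_eval(text)
    except (ValueError, SyntaxError):
        # Anything that is not a pure literal (calls, names, ...) ends up here.
        return False
    return True


ra = repr(b'\x00\x01')
assert is_plain_literal(ra)                       # a bytes literal is fine
assert literal_eval(ra) == b'\x00\x01'            # and round-trips
assert not is_plain_literal("__import__('os')")   # a call is refused, not run
```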

Community
vaultah
  • Yes, but it is really slow. Could you give me a string-based operation? – ShenLei Jul 04 '16 at 08:49
  • 2
    @ShenLei: no, there are no other options. You have a Python byte string literal, these are your options to interpret it again. Why do you need this? – Martijn Pieters Jul 04 '16 at 08:50
  • Because I read data from a text file and printed failed-to-decode content with repr() to another file. Now I want to process the latter. – ShenLei Jul 04 '16 at 08:53
  • 2
    @ShenLei: then why are you complaining about speed? You are debugging, not writing a performance-sensitive application. And why not just pickle then? – Martijn Pieters Jul 04 '16 at 08:55
  • Well, it is a kind of debugging. Each text file is about 2 GB and mixes several encodings and formats, and there are ~1000 such files. I have to process them in several rounds, so eval() is much slower than a simple string operation. pickle is not human-readable. I was hoping for a decoder that parses '\x??' strings. – ShenLei Jul 04 '16 at 09:04
  • 1
    @ShenLei: why did you write 2GB using the `repr()` representation instead of writing the bytes directly? (It is not very hard to parse a Python `bytes` literal in C without `ast.literal_eval()`, but you shouldn't need to. Here are the [lexical definitions for the `bytes` literal](https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals).) – jfs Jul 04 '16 at 11:32
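For what it's worth, a string-based decoder along the lines asked for in the comments could be sketched as below. It is not a general parser for `bytes` literals. It assumes Python 3, where the `string_escape` codec no longer exists, and it assumes the input is exactly what `repr()` produces for a `bytes` object: pure ASCII, a `b` prefix, and one pair of matching quotes. The body of `invert_repr` is one possible approach: `unicode_escape` turns each `\xNN` into the code point U+00NN, and `latin-1` maps that code point back to the byte `NN`. Whether it actually beats `literal_eval` on your data is something to measure.

```python
def invert_repr(ra):
    """Sketch: undo repr() of a bytes object without eval/literal_eval.

    Assumes `ra` came from repr(some_bytes); other literal forms
    (raw strings, triple quotes, uppercase prefix) are not handled.
    """
    if len(ra) < 3 or ra[0] != "b" or ra[1] not in "'\"" or ra[-1] != ra[1]:
        raise ValueError("not a bytes repr: %r" % (ra,))
    body = ra[2:-1]  # strip the b prefix and the surrounding quotes
    # ASCII text -> process backslash escapes -> map U+00NN back to byte NN
    return body.encode("ascii").decode("unicode_escape").encode("latin-1")


a = b'\x00\x01'
ra = repr(a)
assert invert_repr(ra) == a
```

A quick sanity check is `invert_repr(repr(bytes(range(256)))) == bytes(range(256))`, which covers every byte value, including quotes and backslashes.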