1

I have the following code:

code1 = ("\xd9\xf6\xd9\x74\x24\xf4\x5f\x29\xc9\xbd\x69\xd1\xbb\x18\xb1")
print code1

code2 = open("code.txt", 'rb').read()
print code2

code1 output:

�צ�t$פ_)�½i�»±

code2 output:

"\xd9\xf6\xd9\x74\x24\xf4\x5f\x29\xc9\xbd\x69\xd1\xbb\x18\xb1"

I need code2 (which I read from a file) to have the same output as code1.
How can I solve this?

joaquin
Shai
  • What do you have in code.txt? a string like that: `"\xd9\xf6\xd9\x74\x24\xf4\x5f\x29\xc9\xbd\x69\xd1\xbb\x18\xb1"` or the bytes represented by this string? – MByD Jul 20 '11 at 20:49
  • I have "\xd9\xf6\xd9\x74\x24\xf4\x5f\x29\xc9\xbd\x69\xd1\xbb\x18\xb1" in my text file. When I put it in a variable and print it, it comes out as binary data, which is what I need, but when I read it from a file, it prints as a string. – Shai Jul 20 '11 at 20:55
  • What is the output of `sys.getdefaultencoding()`? – agf Jul 20 '11 at 20:56
  • @Shai - so your solution is in the answer below :) – MByD Jul 20 '11 at 20:57
  • ok, I just realised, that quotes are part of the output, so you just need to strip them off. – tomasz Jul 20 '11 at 21:07

3 Answers

5

To interpret a sequence of characters such as

In [125]: list(code2[:8])
Out[125]: ['\\', 'x', 'd', '9', '\\', 'x', 'f', '6']

as Python would a string with escaped characters, such as

In [132]: list('\xd9\xf6')
Out[132]: ['\xd9', '\xf6']

use .decode('string_escape'):

In [122]: code2.decode('string_escape')
Out[122]: '\xd9\xf6\xd9t$\xf4_)\xc9\xbdi\xd1\xbb\x18\xb1'

In Python 3, the string_escape codec has been removed, so the equivalent (which returns a bytes object) becomes

import codecs
codecs.escape_decode(code2)[0]
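
As a rough Python 3 sketch of the same idea: the snippet below assumes the file holds the escape sequences as literal text wrapped in double quotes, as the question's output suggests. The name file_contents is a hypothetical stand-in for open("code.txt", "rb").read(). Note that codecs.escape_decode is undocumented, so the sketch also shows a route through the documented unicode_escape codec, which should give the same result for input made up only of \xNN escapes.

```python
import codecs

# Hypothetical stand-in for open("code.txt", "rb").read(): the file holds the
# literal characters backslash, x, d, 9, ... wrapped in double quotes.
file_contents = b'"\\xd9\\xf6\\xd9\\x74\\x24\\xf4\\x5f\\x29\\xc9\\xbd\\x69\\xd1\\xbb\\x18\\xb1"\n'

# Drop surrounding whitespace, then the double quotes around the escapes.
escaped = file_contents.strip().strip(b'"')

# Undocumented helper: returns a (decoded_bytes, consumed_length) tuple.
decoded = codecs.escape_decode(escaped)[0]

# Documented alternative: decode the escapes to code points 0-255,
# then map each code point back to a single byte with latin-1.
decoded_alt = escaped.decode("unicode_escape").encode("latin-1")

expected = b"\xd9\xf6\xd9\x74\x24\xf4\x5f\x29\xc9\xbd\x69\xd1\xbb\x18\xb1"
print(decoded == expected, decoded_alt == expected)
```

If the file does not include the surrounding quotes, the strip(b'"') call is harmless and can stay.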
unutbu
3

This example:

import binascii

code1 = "\xd9\xf6\xd9\x74\x24\xf4\x5f\x29\xc9\xbd\x69\xd1\xbb\x18\xb1"
code2 = "\\xd9\\xf6\\xd9\\x74\\x24\\xf4\\x5f\\x29\\xc9\\xbd\\x69\\xd1\\xbb\\x18\\xb1"

print code1 == binascii.unhexlify(code2.replace('\\x', ''))

prints True.

You can use binascii.unhexlify to convert a hexadecimal text representation to binary, but you first have to remove every \x from the string.

EDIT: I've just realised that the double quotes are part of your output. Essentially, you need to pass only a valid hex string, so everything else needs to be stripped off. In your case, pass code2.replace('\\x', '').strip('"') to unhexlify. You could use eval instead, and probably will, but consider "Security of Python's eval() on untrusted strings?" for future choices.
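
Putting the edit together, a minimal Python 3 sketch might look like the following. It assumes the text read from the file is the quoted escape string, possibly followed by a newline; the variable names are illustrative only.

```python
import binascii

# Text as it would come from the file: literal backslash-x escapes in quotes.
code2 = '"\\xd9\\xf6\\xd9\\x74\\x24\\xf4\\x5f\\x29\\xc9\\xbd\\x69\\xd1\\xbb\\x18\\xb1"\n'

# Strip whitespace and quotes, then remove every "\x" so only hex digits remain.
hex_digits = code2.strip().strip('"').replace("\\x", "")

# unhexlify turns each pair of hex digits into one byte.
decoded = binascii.unhexlify(hex_digits)

code1 = b"\xd9\xf6\xd9\x74\x24\xf4\x5f\x29\xc9\xbd\x69\xd1\xbb\x18\xb1"
print(decoded == code1)
```

This only works when every byte is written as a \xNN escape; any other character left in the string makes unhexlify raise an error, which acts as a basic sanity check on the input.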

tomasz
-3

print eval(code2) should do the job.
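
If evaluating the file contents as a literal is the goal, a more restrictive option than eval is ast.literal_eval, which accepts only Python literals. The sketch below is Python 3 and assumes the file contains exactly one double-quoted string of \xNN escapes; file_text is a hypothetical stand-in for the text read from code.txt.

```python
import ast

# Hypothetical stand-in for the text read from code.txt.
file_text = '"\\xd9\\xf6\\xd9\\x74\\x24\\xf4\\x5f\\x29\\xc9\\xbd\\x69\\xd1\\xbb\\x18\\xb1"\n'

# Prefix with b so the quoted text is parsed as a bytes literal.
literal = "b" + file_text.strip()

# literal_eval parses literals only; function calls and other expressions
# raise ValueError instead of being executed.
decoded = ast.literal_eval(literal)

# Guard against a file that held some other kind of literal.
if not isinstance(decoded, bytes):
    raise ValueError("expected a bytes literal")

print(decoded == b"\xd9\xf6\xd9\x74\x24\xf4\x5f\x29\xc9\xbd\x69\xd1\xbb\x18\xb1")
```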

ngn
  • 1
    Beware of the `evil Eval`... This trick is good enough for learning / debug purposes, but shouldn't be applied to production logic. – mjv Jul 20 '11 at 21:17
  • I thought as the shortest solution this one would be the most elegant, but the voice of the people objects. Perhaps I should have prepended a regex match to assert that code.txt really contains a Python string literal. – ngn Jul 21 '11 at 06:04