
I have a Python string of bytes data. An example string looks like this:

string = r"b'\xabVJ-K\xcd+Q\xb2R*.M*N.\xcaLJU\xd2QJ\xceH\xcc\xcbK\xcd\x01\x89\x16\xe4\x97\xe8\x97d&g\xa7\x16Y\x85\x06\xbb8\xeb\x02\t\xa5Z\x00'"

It is a string, not bytes. I wish to convert it to bytes. Normal approaches (like encode) yield this:

b'\\xabVJ-K\\xcd+Q\\xb2R*.M*N.\\xcaLJU\\xd2QJ\\xceH\\xcc\\xcbK\\xcd\\x01\\x89\\x16\\xe4\\x97\\xe8\\x97d&g\\xa7\\x16Y\\x85\\x06\\xbb8\\xeb\\x02\\t\\xa5Z\\x00'

which leads to issues (note all the extra backslashes).
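To illustrate where the extra backslashes come from, here is a small sketch using a shortened, hypothetical sample value (not the full data above). The string contains a literal backslash followed by `x`, `a`, `b` rather than the single byte 0xAB, so `encode` simply encodes those characters one by one:

```python
# Shortened, hypothetical sample: a str holding the *text* of a bytes literal.
sample = r"b'\xabVJ'"

# The backslash is an ordinary character in this string.
print(len(sample))

# encode() maps every character to a byte, so the backslash survives as a byte;
# the repr of the result then displays it escaped as a double backslash.
encoded = sample.encode()
print(encoded)
```

In other words, the doubled backslashes are only how the result is displayed; the underlying problem is that the escape sequences were never interpreted.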

I've looked through 10+ potential answers to this question on SO and only one of them works, and it's a solution I'd prefer not to use, for obvious reasons:

this_works = eval(string)

Is there any way to get this to work without eval? Other potential solutions I've tried that failed:

Option 1 Option 2 Option 3

sjakobi
Bryant

1 Answer


I assume that you have a Python-style string representation of a bytes object in the variable s:

s = r"b'\xabVJ-K\xcd+Q\xb2R*.M*N.\xcaLJU\xd2QJ\xceH\xcc\xcbK\xcd\x01\x89\x16\xe4\x97\xe8\x97d&g\xa7\x16Y\x85\x06\xbb8\xeb\x02\t\xa5Z\x00'"

Yes, if you eval this you get a real Python bytes object. But you can instead parse it with the ast module:

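import ast

s = r"b'\xabVJ-K\xcd+Q\xb2R*.M*N.\xcaLJU\xd2QJ\xceH\xcc\xcbK\xcd\x01\x89\x16\xe4\x97\xe8\x97d&g\xa7\x16Y\x85\x06\xbb8\xeb\x02\t\xa5Z\x00'"
# ast.parse only builds a syntax tree; it does not execute the text
tree = ast.parse(s)
# body[0] is the expression statement, its .value is the constant node,
# and that node's .value is the bytes object (Python 3.8+)
value = tree.body[0].value.value
print(type(value), value)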

This will output your bytes object:

<class 'bytes'> b'\xabVJ-K\xcd+Q\xb2R*.M*N.\xcaLJU\xd2QJ\xceH\xcc\xcbK\xcd\x01\x89\x16\xe4\x97\xe8\x97d&g\xa7\x16Y\x85\x06\xbb8\xeb\x02\t\xa5Z\x00'
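As a possible shortcut, the standard library's `ast.literal_eval` does the parse and the constant extraction in one call, and it only accepts literals, so it will not run arbitrary code. A minimal sketch, assuming the input is always the text of a bytes literal; the helper name `parse_bytes_literal` and the shortened sample value are mine, for illustration only:

```python
import ast


def parse_bytes_literal(text: str) -> bytes:
    """Parse the text of a Python bytes literal without executing code."""
    # literal_eval accepts only literals (bytes, str, numbers, tuples, ...)
    # and raises ValueError or SyntaxError for anything else.
    value = ast.literal_eval(text)
    if not isinstance(value, bytes):
        # Hypothetical guard: reject valid literals that are not bytes.
        raise ValueError(f"expected a bytes literal, got {type(value).__name__}")
    return value


# Shortened sample in the same shape as the data in the question.
s = r"b'\xabVJ-K\xcd+Q\xb2R'"
data = parse_bytes_literal(s)
print(type(data), len(data))
```

The type check is there because `literal_eval` would also happily return a str, int, or list if the file happened to contain one.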
ont.rif
  • yeah, I don't want to use eval or the like. I'm reading this data from a file; it's a mixed-format file (normal text together with binary/bytes data). Just evaluating it like that seems like an issue – Bryant Apr 05 '21 at 14:49
  • actually, I stand corrected - it looks like ast.parse will not eval anything, let me give this a try – Bryant Apr 05 '21 at 14:50
  • It does work. I wish there were a better way, but that's ok. I wonder if there is a way to do it with struct. I tried a few things, but couldn't get any of them to work. – Bryant Apr 05 '21 at 14:55
  • I suppose that there is no simple solution because the representation is _very_ Python-specific (non-ASCII bytes in `\x00` format and the special `b'...'` wrapping). Python's `struct` just works with raw bytes objects. – ont.rif Apr 05 '21 at 14:59
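For completeness, the Python-specific representation described in the last comment can also be undone by hand: strip the `b'...'` wrapper and let the `unicode_escape` codec interpret the backslash escapes. This is only a sketch under stated assumptions. The helper name `decode_bytes_repr` is hypothetical, and the approach assumes the text came from `repr()` of a bytes object, so it contains only characters in the 0-255 range and no `\u` or `\N` escapes (which the codec would interpret but a bytes literal would not):

```python
def decode_bytes_repr(text: str) -> bytes:
    """Decode text shaped like b'...' or b"..." into bytes by hand."""
    # Assumption: the wrapper is exactly a 'b' prefix plus matching quotes.
    well_formed = (
        len(text) >= 3
        and text[0] == "b"
        and text[1] in "'\""
        and text[-1] == text[1]
    )
    if not well_formed:
        raise ValueError("expected text shaped like b'...'")
    inner = text[2:-1]
    # latin-1 maps code points 0-255 one-to-one onto byte values, so it is a
    # lossless bridge into and out of the unicode_escape codec, which is what
    # actually interprets sequences such as \xab and \t.
    return inner.encode("latin-1").decode("unicode_escape").encode("latin-1")


# Shortened sample in the same shape as the data in the question.
sample = r"b'\xabVJ\t\x00'"
result = decode_bytes_repr(sample)
print(type(result), len(result))
```

This avoids any parsing of Python syntax, at the cost of being stricter about the input shape than the ast-based approach.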