
I am trying to convert a string to bytes while keeping a single backslash, but each backslash is shown as two. When I pass the string `\xe8\x5b\xfd\xff\xff\xe8\x56\xfd\xff` to Python from outside, either by reading it from a file or through `argv`, and then call `encode()` on it, I get `b'\\xe8\\x5b\\xfd\\xff\\xff\\xe8\\x56\\xfd\\xff'`, but I need the bytes with only one backslash.

Here is the code I currently have.

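import sys

element = sys.argv[1]       # the literal text \xe8\x5b... as typed on the command line
element = element.encode()  # encodes the backslash characters themselves (UTF-8 by default)
print(element)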

I have tried many different solutions, such as `.encode('raw_unicode_escape')`, to no avail. Any help would be greatly appreciated.

Pedro
  • How an object is displayed and how it's represented internally isn't necessarily the same. – Peter Wood Jun 27 '21 at 20:13
  • The thing is that I have a script that looks for bytes and matches them. When I add this as a variable, b'\xe8\x5b\xfd\xff\xff\xe8\x56\xfd\xff', it finds it without a problem, but when I pass it from outside it doesn't; that's how I know the double backslashes are preventing the match. – Pedro Jun 27 '21 at 20:16
  • Does this answer your question? [Converting utf-8 characters to scandic letters](https://stackoverflow.com/questions/68069394/converting-utf-8-characters-to-scandic-letters). Try `element = element.encode('raw_unicode_escape').decode('unicode_escape').encode(); print(element)`. – JosefZ Jun 27 '21 at 20:19
  • "When I pass the following string " - this is unclear. It looks like your string contains literal backslashes, but we can't be completely sure of what you provided. Please make this clear and provide a [mre]. If this is the case, you might try `your_string.encode('latin1').decode("unicode-escape").encode('latin1')`. – Thierry Lathuille Jun 27 '21 at 20:26
  • Thank you, this is so far the closest I've got but the bytes are changed, giving me the following - b'\xc3\xa8[\xc3\xbd\xc3\xbf\xc3\xbf\xc3\' – Pedro Jun 27 '21 at 20:27
  • Here you can find my original post. It got an answer that I thought was good, but it isn't, since it ends with a string - https://stackoverflow.com/questions/68137716/pass-bytes-from-bash-to-python – Pedro Jun 27 '21 at 20:29
  • @ThierryLathuille is right: `b'\xe8\x5b\xfd\xff\xff\xe8\x56\xfd\xff' == element.encode('latin1').decode("unicode-escape").encode('latin1')` returns **`True`**. – JosefZ Jun 27 '21 at 20:33
  • Oh wow, you're right, my program does find them in memory when I use this encoding! Thank you! – Pedro Jun 27 '21 at 20:36
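
For reference, the approach from the comments above can be sketched as a small script. This is only a minimal sketch: the helper name `unescape_to_bytes` is made up for illustration, and the raw string `raw` stands in for what `sys.argv[1]` would contain, assuming the argument really holds literal backslashes followed by `x` and two hex digits.

```python
# Stand-in for sys.argv[1]: literal backslash characters, not real bytes.
# (Assumption: the shell passes the text through unchanged.)
raw = r"\xe8\x5b\xfd\xff\xff\xe8\x56\xfd\xff"


def unescape_to_bytes(text: str) -> bytes:
    """Interpret \\xNN escape sequences in text and return the bytes they name.

    Hypothetical helper wrapping the one-liner suggested in the comments.
    """
    # 1. latin1 maps each character to one byte, so the escapes stay intact.
    # 2. unicode_escape interprets \xNN and yields code points U+0000..U+00FF.
    # 3. latin1 again maps those code points one-to-one back to single bytes.
    return text.encode("latin1").decode("unicode_escape").encode("latin1")


plain = raw.encode()            # backslashes are kept as characters
fixed = unescape_to_bytes(raw)  # escapes are turned into the actual bytes

print(plain)  # b'\\xe8\\x5b\\xfd\\xff\\xff\\xe8\\x56\\xfd\\xff'
print(fixed)  # b'\xe8[\xfd\xff\xff\xe8V\xfd\xff'
print(fixed == b"\xe8\x5b\xfd\xff\xff\xe8\x56\xfd\xff")  # True
```

The `[` and `V` in the second output are just how Python displays the bytes `0x5b` and `0x56`; the value is the same as the literal in the question. Using `latin1` for the final step matters: encoding with the default UTF-8 turns each code point above `0x7f` into two bytes, which is where the `\xc3\xa8...` result came from. Note that the final `encode("latin1")` would raise `UnicodeEncodeError` if the input contained escapes for code points above `0xff`, such as `\u20ac`.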
