0

I'm trying to read a bytes object from a text file, but whenever I read the file I get double backslashes in the bytes object, and I cannot work out how to turn them back into single backslashes. The file is opened with `open(file, 'rb')`. I've tried `encode` and `decode`, and I've also tried `eval(str(my_string).replace('\\\\','\\'))` as suggested in other answers, but every attempt fails with: `SyntaxError: (value error) invalid \x escape at position 372`. The string I am trying to read is: `\xde\xcct\x18\xe5*\x91\xcc\xf1\xb4\xe9\xc2\x97BhR\x87\xd6x\xd8\x83\x8b\xc2\x08`

Edit:

The answers in "Reading utf-8 escape sequences from a file" and other questions haven't helped, since I still get a unicode escape error when trying those methods.

Karl Knechtel
  • Is the file in bytes or is it the escaped string? Have you tried printing it? –  Jun 12 '18 at 11:41
  • The file just contains the escaped string: `\xde\xcct\x18\xe5*\x91\xcc\xf1\xb4\xe9\xc2\x97BhR\x87\xd6x\xd8\x83\x8b\xc2\x08` – Olly Johnstone Jun 12 '18 at 11:43
  • Possible duplicate of [how do I .decode('string-escape') in Python3?](https://stackoverflow.com/questions/14820429/how-do-i-decodestring-escape-in-python3#14820462), then –  Jun 12 '18 at 11:48

1 Answer

1

If your string comes from a file in which characters are escaped with backslashes, those backslashes are themselves escaped when the bytes object is displayed, so each single backslash shows up as a double backslash. What you most likely want is to have the escape sequences translated. You can use the codecs module with the unicode_escape encoding for this:

import codecs

with codecs.open("<yourfile>", 'r', encoding="unicode_escape") as fr:
    print(fr.read())
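
As a rough sketch of what is going on: the doubled backslashes exist only in the `repr` of the bytes object, while the data itself holds one backslash per escape. Since the goal is a bytes object rather than text, one option is to decode with unicode_escape and then re-encode with latin-1, which maps code points 0-255 back to single bytes. The sample below is the first few characters from the question, `raw` is just an illustrative name, and the approach assumes the file holds only ASCII characters and well-formed escapes:

```python
# Bytes as they would come back from open(path, 'rb').read() for a file
# containing the literal text \xde\xcct\x18 (sample taken from the question).
raw = rb"\xde\xcct\x18"

# The repr doubles each backslash, but the data holds single backslashes.
print(raw)        # b'\\xde\\xcct\\x18'
print(len(raw))   # 13

# Translate the escape sequences, then map code points 0-255 back to bytes.
text = raw.decode("unicode_escape")
data = text.encode("latin-1")
print(data)       # b'\xde\xcct\x18'
print(len(data))  # 4
```

The latin-1 step matters because unicode_escape produces a `str`; encoding that with UTF-8 instead would turn every value above 0x7f into two bytes.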

If you run into decoding errors, the errors parameter lets you decide how they are handled when the file is opened with the "unicode_escape" encoding:

with codecs.open("<yourfile>", 'r', encoding="unicode_escape", errors="ignore") as fr:
    print(fr.read())

You can see a full list of error handlers here: https://docs.python.org/3/library/codecs.html#error-handlers
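
To get a feel for how the handlers differ, here is a small sketch using a made-up input with a deliberately truncated `\x` escape. The `broken` sample is hypothetical and not taken from the asker's file. Keep in mind that `ignore` silently drops the malformed escape, so some data may be lost:

```python
# Hypothetical input: "\x4" lacks its second hex digit.
broken = rb"ab\x4 cd"

try:
    broken.decode("unicode_escape")  # default is errors="strict"
except UnicodeDecodeError as exc:
    print("strict:", exc.reason)

# "ignore" skips the malformed escape; "replace" substitutes U+FFFD.
ignored = broken.decode("unicode_escape", errors="ignore")
replaced = broken.decode("unicode_escape", errors="replace")
print(repr(ignored))
print(repr(replaced))
```

If the damaged positions matter, `replace` is probably the safer choice while investigating, since the U+FFFD markers show where the file is malformed.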

colidyre
  • Just tried that, got error: `UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in position 372-374: truncated \xXX escape` (The actual string is massive so I only showed the first part) – Olly Johnstone Jun 12 '18 at 11:54
  • It seems that your file contains incorrectly escaped sequences. The easiest solution would be to handle the errors; I will update my answer. – colidyre Jun 12 '18 at 12:06