
How to reverse re.escape? This blog from 2007 says there is no reverse function, but is that still true, ten years later?

Python 2's decode('string_escape') doesn't work on all escaped chars (such as space).

>>> re.escape(' ')
'\\ '
>>> re.escape(' ').decode('string-escape')
'\\ '

Python 3: Some suggest unicode_escape, codecs.escape_decode, or ast.literal_eval, but none of them works on spaces.

>>> re.escape(b' ')
b'\\ '
>>> re.escape(b' ').decode('unicode_escape')
'\\ '
>>> codecs.escape_decode(re.escape(b' '))
(b'\\ ', 2)
>>> ast.literal_eval(re.escape(b' '))
ValueError: malformed node or string: b'\\ '

So is this really the only thing that works?

>>> re.sub(r'\\(.)', r'\1', re.escape(' '))
' '
Zero Piraeus
Willem

1 Answer


So is this really the only thing that works?

>>> re.sub(r'\\(.)', r'\1', re.escape(' '))
' '

Yes. The source for the re module contains no unescape() function, so you're definitely going to have to write one yourself.

Furthermore, the re.escape() function is implemented with str.translate():

def escape(pattern):
    """
    Escape special characters in a string.
    """
    if isinstance(pattern, str):
        return pattern.translate(_special_chars_map)
    else:
        pattern = str(pattern, 'latin1')
        return pattern.translate(_special_chars_map).encode('latin1')

… which, while it can transform a single character into multiple characters (e.g. [ into \[), cannot perform the reverse of that operation.
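To illustrate that one-way behaviour, here is a small sketch. The table below is a hypothetical stand-in for _special_chars_map (it covers only two characters), and the names table, escaped and stripped are made up for this example.

```python
import re

# Hypothetical two-entry table, standing in for re's private _special_chars_map.
table = {ord('['): '\\[', ord(' '): '\\ '}

# Forward direction: each single code point may expand to several characters.
escaped = 'a [b'.translate(table)
print(escaped)  # a\ \[b

# Reverse direction: translate() looks up one code point at a time, so a table
# cannot key on the two-character sequence backslash + '['. The closest it can
# do is delete every backslash, which also destroys escaped literal backslashes.
stripped = re.escape('a\\b').translate({ord('\\'): None})
print(stripped)  # ab  (the original backslash is lost)
```

Deleting every backslash happens to work for strings that contain no literal backslash, but it is not a general inverse.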

Since there's no direct reversal of escape() available via str.translate(), a custom unescape() function using re.sub(), as described in your question, is the most straightforward solution.
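A minimal sketch of such a function, assuming the input really is the output of re.escape() (it does not interpret regex escapes like \n or \d). The name unescape is made up here and is not part of the re module. Compared with the one-liner in the question, it adds re.DOTALL so that an escaped newline is also handled, and it accepts bytes as well as str.

```python
import re

def unescape(pattern):
    """Undo re.escape() by dropping the backslash in front of each escaped character.

    Hypothetical helper: assumes `pattern` was produced by re.escape().
    """
    if isinstance(pattern, bytes):
        # Bytes input needs a bytes pattern and a bytes replacement.
        return re.sub(rb'\\(.)', rb'\1', pattern, flags=re.DOTALL)
    # re.DOTALL lets '.' match a newline, so backslash + newline is unescaped too.
    return re.sub(r'\\(.)', r'\1', pattern, flags=re.DOTALL)

original = 'a b.c\\d\n[e]'
print(unescape(re.escape(original)) == original)  # True
print(unescape(re.escape(b' ')))                  # b' '
```

Because the regex consumes the backslash and the following character together, an escaped backslash (two backslashes in a row) collapses to a single one rather than being treated as two separate escapes.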

Zero Piraeus