Reversing Python's re.escape

Question

How can I reverse re.escape? This blog from 2007 says there is no reverse function, but is that still true ten years later?

Python 2's decode('string_escape') doesn't work on all escaped chars (such as space).

>>> re.escape(' ')
'\\ '
>>> re.escape(' ').decode('string-escape')
'\\ '

Python 3: Some suggest unicode_escape, codecs.escape_decode, or ast.literal_eval, but none of them handles spaces.

>>> re.escape(b' ')
b'\\ '
>>> re.escape(b' ').decode('unicode_escape')
'\\ '
>>> codecs.escape_decode(re.escape(b' '))
(b'\\ ', 2)
>>> ast.literal_eval(re.escape(b' '))
ValueError: malformed node or string: b'\\ '

So is this really the only thing that works?

>>> re.sub(r'\\(.)', r'\1', re.escape(' '))
' '

Why do you need this? Why not just keep a copy of the original string? — Bryan Oakley, Apr 27 '17 at 15:54

score 10 · Accepted Answer · answered Feb 23 '19 at 09:44

So is this really the only thing that works?
>>> re.sub(r'\\(.)', r'\1', re.escape(' '))
' '

Yes. The source for the re module contains no unescape() function, so you're definitely going to have to write one yourself.

Furthermore, the re.escape() function uses str.translate() …

def escape(pattern):
    """
    Escape special characters in a string.
    """
    if isinstance(pattern, str):
        return pattern.translate(_special_chars_map)
    else:
        pattern = str(pattern, 'latin1')
        return pattern.translate(_special_chars_map).encode('latin1')

… which, while it can transform a single character into multiple characters (e.g. [ → \[), cannot perform the reverse of that operation.
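To illustrate that asymmetry, here is a small sketch. The table name special_chars_map and its contents are made up for this example (the real module builds its own table internally). A translate table is keyed by single code points, so a value can be longer than one character but a key cannot:

```python
# Hypothetical table for illustration: each key is ONE code point,
# each value may be a longer string.
special_chars_map = {ord(c): '\\' + c for c in '[]. '}

escaped = 'a[0] b'.translate(special_chars_map)
print(escaped)  # a\[0\]\ b

# A reverse table would need two-character keys such as '\[',
# and str.maketrans() rejects those with a ValueError.
try:
    str.maketrans({'\\[': '['})
    reverse_table_ok = True
except ValueError:
    reverse_table_ok = False
print(reverse_table_ok)  # False
```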

Since there's no direct reversal of escape() available via str.translate(), a custom unescape() function using re.sub(), as described in your question, is the most straightforward solution.
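A minimal sketch of such a function follows. The name unescape is my own, not part of the re module, and it assumes the input came from a translate-based re.escape() like the one shown above, where every escape is a single backslash followed by the original character. One detail worth noting: re.escape() also escapes newlines, and . does not match a newline by default, so the sketch passes re.DOTALL.

```python
import re

def unescape(pattern):
    """Hypothetical inverse of re.escape(): drop the backslash
    in front of every escaped character."""
    if isinstance(pattern, str):
        # DOTALL lets '.' match an escaped newline as well.
        return re.sub(r'\\(.)', r'\1', pattern, flags=re.DOTALL)
    # bytes input needs bytes patterns.
    return re.sub(rb'\\(.)', rb'\1', pattern, flags=re.DOTALL)

# Round-trip check on str and bytes inputs.
for original in [' ', 'a.b*c', 'back\\slash', 'line\nbreak', b'[x] y']:
    assert unescape(re.escape(original)) == original
```

It only round-trips strings that re.escape() produced. Applied to a hand-written pattern, it would also strip the backslash from sequences such as \d or \n, which is probably not what you want.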
