You need to escape that backslash (\), like this:
>>> import re
>>> x = 'the meaning\nof life'
>>> re.sub("([,\w])\n(\w)", "\1 \2", x)
'the meanin\x01 \x02f life'
>>> re.sub("([,\w])\n(\w)", "\\1 \\2", x)
'the meaning of life'
>>> re.sub("([,\w])\n(\w)", r"\1 \2", x)
'the meaning of life'
>>>
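If you don't escape it, Python reads \1 in an ordinary string literal as an octal escape, so re.sub receives the character '\x01' instead of a backreference to group 1: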
>>> '\1'
'\x01'
>>>
That's why we need to use '\\\\' or r'\\' to represent a single literal \ in a Python regex.
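As a quick check of that claim, here is a minimal sketch (the sample path is made up for illustration) showing that both spellings produce the same two-character pattern, and that the pattern matches exactly one backslash:

```python
import re

# Both literals produce the same two-character pattern: backslash, backslash.
pattern_escaped = '\\\\'
pattern_raw = r'\\'
assert pattern_escaped == pattern_raw
assert len(pattern_raw) == 2

# Hypothetical sample text containing two literal backslashes.
sample = 'C:\\temp\\file.txt'

# The regex engine reads the two-character pattern as one escaped backslash.
matches = re.findall(pattern_raw, sample)
print(len(matches))  # 2

converted = re.sub(pattern_raw, '/', sample)
print(converted)  # C:/temp/file.txt
```

The raw-string form is usually easier to read, since what you type is what the regex engine sees.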
On that point, from this answer:
If you're putting this in a string within a program, you may actually need to use four backslashes (because the string parser will remove two of them when "de-escaping" it for the string, and then the regex needs two for an escaped regex backslash).
And from the documentation:
As stated earlier, regular expressions use the backslash character ('\') to indicate special forms or to allow special characters to be used without invoking their special meaning. This conflicts with Python's usage of the same character for the same purpose in string literals.
Let's say you want to write a RE that matches the string \section, which might be found in a LaTeX file. To figure out what to write in the program code, start with the desired string to be matched. Next, you must escape any backslashes and other metacharacters by preceding them with a backslash, resulting in the string \\section. The resulting string that must be passed to re.compile() must be \\section. However, to express this as a Python string literal, both backslashes must be escaped again.
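The \section case from the documentation can be sketched as follows. The sample LaTeX line is hypothetical; the point is only that the four-backslash literal and the raw-string literal compile to the same pattern:

```python
import re

# Hypothetical LaTeX snippet; the doubled backslash in the literal is one real backslash.
latex_line = '\\section{Introduction}'

# Non-raw literal: four backslashes in source, two in the pattern, one matched.
escaped_pattern = re.compile('\\\\section')
# Raw literal: two backslashes in source, two in the pattern, one matched.
raw_pattern = re.compile(r'\\section')

assert escaped_pattern.pattern == raw_pattern.pattern

found = raw_pattern.search(latex_line)
print(found.group(0))  # \section
```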
Alternatively, as brittenb suggested, you don't need a regex at all in this case:
>>> x = 'the meaning\nof life'
>>> x.replace("\n", " ")
'the meaning of life'
>>>