I'm reading the Python documentation for the re library and am quite confused by the following paragraph:
Regular expressions use the backslash character ('\') to indicate special forms or to allow special characters to be used without invoking their special meaning. This collides with Python’s usage of the same character for the same purpose in string literals; for example, to match a literal backslash, one might have to write '\\\\' as the pattern string, because the regular expression must be \\, and each backslash must be expressed as \\ inside a regular Python string literal.
How is \\\\ evaluated? Cascadingly, as \\\\ -> \\\ -> \\, or in pairs, as \\\\ -> \\?
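One way to see which layer consumes which backslash is to inspect the string before it reaches re. This is a minimal sketch using only the standard library; the variable name pattern is just for illustration. It suggests there are two separate passes (the string literal first, then the regex engine), each working in pairs:

```python
import re

pattern = '\\\\'          # four backslashes in the source code
print(len(pattern))       # 2: the string literal turns each pair into one backslash
print(pattern)            # \\  (the two characters the regex engine receives)

# The regex engine reads those two characters as "one literal backslash",
# so the pattern matches a string made of a single backslash.
print(re.fullmatch(pattern, '\\') is not None)   # True

# A raw string skips the string-literal pass, so two backslashes are enough.
print(r'\\' == pattern)   # True
```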
I know \ is a metacharacter just like |, so I can do
>>> re.split('\|', 'a|b|c|d') # split by literal '|'
['a', 'b', 'c', 'd']
but
>>> re.split('\\', 'a\b\c\d') # split by literal '\'
Traceback (most recent call last):
gives me an error. It seems that, unlike \|, the \\ is evaluated more than once.
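To narrow down where the error comes from, the pattern can be checked on its own. This is a sketch assuming Python 3, where the failure is raised as re.error; the names pattern and compiled are only illustrative. It shows that '\\' reaches the regex engine as a single backslash with nothing after it to escape:

```python
import re

pattern = '\\'            # two backslashes in source, ONE backslash in the string
print(len(pattern))       # 1

try:
    re.compile(pattern)   # the regex engine sees a lone, dangling backslash
    compiled = True
except re.error as exc:
    compiled = False
    print('re.error:', exc)

# For comparison, the pipe case written as a raw string: the regex engine
# receives backslash + pipe, which it reads as a literal '|'.
print(re.split(r'\|', 'a|b|c|d'))   # ['a', 'b', 'c', 'd']
```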
I also tried
>>> re.split('\\\\', 'a\b\c\d')
['a\x08', 'c', 'd']
which makes me even more confused...
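Part of the confusion may come from the subject string rather than the pattern. In a regular string literal, \b is the backspace escape, while \c and \d are not recognized escapes and keep their backslash. The sketch below writes out what I believe 'a\b\c\d' actually contains (the names subject and raw_subject are just for illustration) and compares it with a raw string:

```python
import re

# Explicit spelling of what the literal 'a\b\c\d' contains:
# 'a', a backspace (\x08), a backslash, 'c', a backslash, 'd'.
subject = 'a\x08\\c\\d'
print(len(subject))                   # 6
print(re.split(r'\\', subject))       # ['a\x08', 'c', 'd']

# A raw string keeps all three backslashes, so the split gives four parts.
raw_subject = r'a\b\c\d'
print(len(raw_subject))               # 7
print(re.split(r'\\', raw_subject))   # ['a', 'b', 'c', 'd']
```

Using raw strings for both the pattern and the subject keeps the string-literal pass out of the picture, so only the regex engine's own escaping has to be considered.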