Regular expression does not match the last end of line

Asked Apr 06 '16 at 14:30

Active Jun 26 '23 at 09:02

Viewed 38 times

I don't understand why it matches the last '\n':

import re

s = 'abc\t'      # prints 'no'
# s = 'abc\n'    # prints 'yes'

# Raw string so the backslash in \S reaches the regex engine unchanged
if re.match(r'^(\S+)$', s):
    print('yes')
else:
    print('no')

As far as I know, '\S' should not match '\t' or '\n', but here '\n' is matched.

edited Jun 26 '23 at 09:02

Dharman

asked Apr 06 '16 at 14:30

Linczh

No, `\s` matches `\t` and `\n`, `\S` is the reverse shorthand character class matching non-whitespaces. – Wiktor Stribiżew Apr 06 '16 at 14:30
Sorry, I didn't express the question clearly. I have updated it: what I'm asking is why the whitespace character '\n' is matched while '\t' is not. – Linczh Apr 07 '16 at 01:32
If you look at `help(re)` it explains the special character `$` *"Matches the end of the string or just before the newline at the end of the string."*. So I think you are seeing `$` finish the match before the `\n`, and `\S` is not matching the newline. – TessellatingHeckler Apr 07 '16 at 02:05
That's it! I will try looking at the doc more carefully next time. Thanks. – Linczh Apr 07 '16 at 02:15
If you need to avoid matching an end of line before the last LF, use `$(?!\n)`. Is that what you are looking for? – Wiktor Stribiżew Apr 07 '16 at 05:41
Exactly, I got this from the doc a few hours ago~ – Linczh Apr 07 '16 at 06:01
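
To make the comments above concrete, here is a minimal sketch of the behaviour they describe. The helper names `loose_match` and `strict_match` are made up for illustration and are not part of the original question. The patterns use only standard `re` syntax: `$`, the `$(?!\n)` suggestion from the comments, and `\Z`, which in Python matches only at the very end of the string.

```python
import re

def loose_match(s):
    # Hypothetical helper: the pattern from the question.
    # `$` matches at the end of the string OR just before a trailing newline.
    return re.match(r'^(\S+)$', s)

def strict_match(s):
    # Hypothetical helper: the variant suggested in the comments.
    # The negative lookahead rejects the position just before a trailing '\n'.
    return re.match(r'^(\S+)$(?!\n)', s)

for s in ('abc', 'abc\n', 'abc\t'):
    m = loose_match(s)
    if m:
        # The captured group never contains the newline; `$` simply
        # succeeds one position before it.
        print(repr(s), 'loose: group =', repr(m.group(1)), 'end =', m.end())
    else:
        print(repr(s), 'loose: no match')
    print(repr(s), 'strict:', 'match' if strict_match(s) else 'no match')

# `\Z` is another way to anchor at the true end of the string.
print(re.match(r'^(\S+)\Z', 'abc\n'))
```

For `'abc\n'` the loose pattern matches with the group equal to `'abc'` and the match ending at index 3, so `\S` never consumes the newline. The strict pattern and the `\Z` variant both reject that string.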

0 Answers