
My goal was to create a regex that matches the string "a.b.c." (where each . is a literal period), but does not match strings like "ga.b.c.h" (i.e., strings with alphanumeric characters immediately before or after the "a.b.c." part).

My thinking was to use \b, and of course I also had to escape the periods in my regex. The Python 2 documentation (https://docs.python.org/2/library/re.html) states that \b is formally the boundary between \w and \W.

I do not understand why this expression fails to match:

>>> import re
>>> reg = re.compile(r'\ba\.b\.c\.\b')
>>> bool(re.match(reg, "a.b.c."))
False

Can anyone here enlighten me?

dipankar
  • do you want to match `a.b.c. d.e.f`? if yes what should be matched, both? – Ashish Ranjan Oct 25 '17 at 16:44
  • for your example, let's just assume that only the a.b.c. would be matched. – dipankar Oct 25 '17 at 16:45
  • If you need no solution to a problem but only an explanation of what a word boundary is, your question needs no answer; it is a dupe. There are tons of questions about why a regex with `\b` does not match a string. If you need a solution to a problem, modify the question. – Wiktor Stribiżew Oct 25 '17 at 17:39

1 Answer


There is no word boundary between a non-word character and the end of the string.
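To make that concrete, here is a minimal sketch. The helper `matches` and the constant names are hypothetical, chosen only for illustration, and the lookaround pattern is one possible alternative under the assumption that "not surrounded by word characters" is the real requirement; it is not necessarily what anyone in this thread proposed.

```python
import re

# Pattern from the question: the trailing \b requires a word character
# on exactly one side of the position after the final period.
WORD_BOUNDARY = re.compile(r'\ba\.b\.c\.\b')

# Assumed alternative: lookarounds that only forbid an adjacent word
# character, so the start or end of the string is acceptable.
LOOKAROUND = re.compile(r'(?<!\w)a\.b\.c\.(?!\w)')


def matches(pattern, text):
    """Return True if the pattern is found anywhere in text."""
    return pattern.search(text) is not None


# "." is a non-word character and so is the end of the string,
# so there is no boundary after the final period.
print(matches(WORD_BOUNDARY, "a.b.c."))         # False
# Here the trailing \b succeeds, between "." and "h".
print(matches(WORD_BOUNDARY, "a.b.c.h"))        # True

print(matches(LOOKAROUND, "a.b.c."))            # True
print(matches(LOOKAROUND, "ga.b.c.h"))          # False
print(matches(LOOKAROUND, "see a.b.c. d.e.f"))  # True
```

Note that the trailing `\b` does the opposite of what the question wants: it rejects "a.b.c." on its own and accepts it when a word character follows.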

user2357112
  • Ah, duh. So this works: with `reg = re.compile(r'\ba\.b\.c\b')`, `bool(re.match(reg, "a.b.c."))` returns `True`. Thank you! – dipankar Oct 25 '17 at 16:52
  • As I note below, Wiktor's answer is a nice solution for a class of related issues. – dipankar Oct 25 '17 at 17:30
  • Wiktor, perhaps it's because I'm new to SO conventions of discourse, but I found your solution very helpful, and I think my specific question has nuances that are distinct from the one you think my question is a duplicate of. And so I'm disappointed that you deleted your answer from my question. – dipankar Oct 25 '17 at 17:34