2

I know it's a pretty simple question. I happened to look at a regex example.

import re

pattern = r'^M?M?M?$'
s = "MDM"

# re.search returns None when the pattern does not match
print(re.search(pattern, s))

May I know why it doesn't match the string s? AFAIK, ? specifies 0 or 1 occurrence of the preceding item. The pattern does match MMM, though.

However the same string matches when the pattern is r'M?M?M?$' or r'^M?M?M?'. I am not getting what makes the difference here. Could someone please explain?

Linkid
DineshKumar

3 Answers

2

r'^M?M?M?$' is the same as r'^M{0,3}$'. So, your pattern accepts '', 'M', 'MM', 'MMM' strings.

r'M?M?M?$' is the same as r'M{0,3}$' and, with re.search, actually matches every string, since the empty string at the very end of the input always satisfies it:

In [21]: pattern = r'M?M?M?$'

In [22]: re.search(pattern, 'A string without capital m at all')
Out[22]: <_sre.SRE_Match object; span=(33, 33), match=''>
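
As a quick check of the equivalence claimed above, the following sketch compares the two anchored patterns on a handful of strings, then shows the empty match that the variant without ^ finds at the end of a string. The sample strings and variable names are illustrative choices, not taken from the question.

```python
import re

optional = re.compile(r'^M?M?M?$')   # three optional M's, anchored at both ends
counted = re.compile(r'^M{0,3}$')    # the same idea written with a counted repeat

samples = ['', 'M', 'MM', 'MMM', 'MMMM', 'MDM']

# For each sample, record whether each pattern matches.
results = {
    s: (bool(optional.search(s)), bool(counted.search(s)))
    for s in samples
}
for s, (a, b) in results.items():
    print(repr(s), a, b)

# Without the leading ^, the pattern can always fall back to an empty match
# at the end of the string, so re.search never returns None for it.
text = 'no capital m here'
tail = re.search(r'M?M?M?$', text)
print(tail.span(), repr(tail.group()))
```

Both patterns should agree on every sample, accepting only the strings made of zero to three M characters.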
awesoon
1

The regex

M?M?M?$

matches the last "M" in "MDM". But when you add ^ (beginning of string), the match has to start at the beginning of the string, and it fails: M? matches zero or one "M", and nothing in the pattern can consume the "D" that stands between the start and the end.

With the other regex:

^M?M?M?

The first "M" is matched.
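
To see where each variant matches, a small sketch like the one below prints the span that re.search reports for "MDM" under all three patterns. The dictionary name `spans` is just an illustrative helper.

```python
import re

s = "MDM"
patterns = [r'^M?M?M?$', r'M?M?M?$', r'^M?M?M?']

# Map each pattern to the (start, end) span of its match, or None if no match.
spans = {}
for p in patterns:
    m = re.search(p, s)
    spans[p] = m.span() if m else None
    print(p, spans[p])
```

The fully anchored pattern gives None, the end-anchored one reports the span of the last "M", and the start-anchored one reports the span of the first "M".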

Maroun
  • Your explanation *`search` looks in the whole string, not from the beginning* does not sound right. It does search for a match from the beginning, but the match does not have to start at the beginning. – Wiktor Stribiżew Feb 23 '16 at 10:22
  • @WiktorStribiżew there's `match` and `search` in Python. [Here's](http://stackoverflow.com/questions/180986/what-is-the-difference-between-pythons-re-search-and-re-match) , that's what I tried to say, maybe I have to make it clearer a bit. – Maroun Feb 23 '16 at 10:24
  • I know the difference very well. It does not mean the `re.search` does not search for a match from the beginning of a string. Otherwise, `^M?` would not match the first `M`. – Wiktor Stribiżew Feb 23 '16 at 10:24
  • @WiktorStribiżew It *does* look, but as opposed to `match`, it doesn't attempt to match from the string beginning. – Maroun Feb 23 '16 at 10:25
1

^ matches the start of the string (or of each line when re.MULTILINE is set). $ matches the end of the string (or of each line when re.MULTILINE is set).

So 'M?M?M?$' matches the last M in MDM and '^M?M?M?' matches the first M in MDM.

'^M?M?M?$' cannot match MDM because of the D in the middle that is not listed in your regex and the requirement to match the start of the line and the end of the line while there are 0, 1, 2 or 3 M in between.
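
One caveat on "line" versus "string": in Python, ^ and $ anchor to the whole string unless re.MULTILINE is passed, in which case they also match at line boundaries. The sketch below illustrates the difference on a made-up two-line input, which is an assumption for demonstration only.

```python
import re

pattern = r'^M?M?M?$'
text = "MM\nMDM"   # hypothetical two-line input

# Default flags: ^ and $ refer to the whole string, so nothing matches.
default = re.search(pattern, text)

# re.MULTILINE: ^ and $ also match at the start and end of each line,
# so the first line "MM" satisfies the pattern on its own.
multi = re.search(pattern, text, re.MULTILINE)

print(default)
print(multi.group(), multi.span())
```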

Simulant
  • Thanks a lot. "the requirement to match the start of the line and the end of the line while there are 0, 1, 2 or 3 M in between." this statement nailed it. – DineshKumar Feb 23 '16 at 13:55