
I am trying to match some strings using regex. I want to find any string that mentions someone's children, for example: my son, my daughter, our daughters, etc.

So I have written this in Python:

re.match(r'\b(my|our)\b \b(son|daughter|children|child|kid)s?', 'me and my son were')

But somehow it doesn't match "my son" in the test sentence; it returns None.

I have tested this regex here: https://regex101.com/r/ChAy9e/1 and it works fine (5th line in test cases).

I'm not able to figure out what I'm doing wrong.

Thanks!

cs95
SureshS
  • I don't think the question is a duplicate. The concept may have been answered earlier, but I could not solve the problem myself because I did not understand the difference. If I had already known the difference, neither this question nor the original one would have been asked. The whole purpose is to help when you get stuck somewhere, right? – SureshS Sep 25 '17 at 08:53

2 Answers


match applies the regex only at the start of the string. You need to use the findall method (or search, if you only need the first match anywhere in the string):

>>> re.findall(r'\b(my|our)\b \b(son|daughter|children|child|kid)s?', 'me and my son were')
[('my', 'son')]

From the documentation for match: "Try to apply the pattern at the start of the string, returning a match object, or None if no match was found."
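A small sketch of the difference, using the pattern and test sentence from the question. The variable names are only illustrative:

```python
import re

# Pattern and test sentence taken from the question.
pattern = r'\b(my|our)\b \b(son|daughter|children|child|kid)s?'
text = 'me and my son were'

# match() only tries the pattern at position 0, so it fails here.
at_start = re.match(pattern, text)

# search() scans the string and returns the first match anywhere.
first = re.search(pattern, text)

# findall() returns every match; with two capturing groups,
# each item is a tuple of the group contents.
all_matches = re.findall(pattern, text)

print(at_start)        # None
print(first.group(0))  # my son
print(all_matches)     # [('my', 'son')]
```

If only a yes/no answer is needed, search is probably the closer replacement for match; findall is handy when a sentence may contain several such phrases.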

Chen A.
  • Great, it worked! Thanks. I'll read `match` and `findall` documentation again. – SureshS Sep 25 '17 at 06:58
  • @SureshS glad I could help. If you found my answer useful, please kindly consider upvoting & accepting it, thanks :) – Chen A. Sep 25 '17 at 06:59

As Vinny said, you need re.findall. However, if you want those phrases as one element, you'll want to modify your regex a bit. Try:

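In [1]: re.findall(r'\b(?:my|our)\s+(?:(?:son|daughter|kid)s?|children|child)\b', 'me and my son were')
Out[1]: ['my son']

The capturing groups are replaced with non-capturing ones, so findall returns each phrase as a single string instead of a tuple of groups. I've also optimised your regex a bit, since you don't need to look for childrens and childs (that's incorrect grammar!). Note that children and child sit inside the second group, so they still require a preceding my or our; left outside it, the top-level | would let a bare "children" match on its own.

Details

\b          # word boundary
(?:         # open non-capture group
    my
    |           # 'or' operation
    our
)
\s+         # whitespace - one or more
(?:         # open non-capture group
    (?:         # open non-capture group
        son
        |
        daughter
        |
        kid
    )
    s?          # 's' optional
    |
    children
    |
    child
)
\b          # word boundary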
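The commented layout above can be kept in real code with the re.VERBOSE flag, which ignores whitespace and # comments inside the pattern. This is a sketch only: it assumes that child and children are meant to require a preceding my or our, so they are grouped with the other nouns, and the name CHILD_PHRASE and the sample sentences are made up for illustration.

```python
import re

# Hypothetical name; pattern written in verbose mode so it can carry comments.
CHILD_PHRASE = re.compile(r"""
    \b              # word boundary
    (?:my|our)      # possessive, non-capturing
    \s+             # one or more whitespace characters
    (?:             # either ...
        (?:son|daughter|kid)s?   # a noun with an optional plural 's'
        |
        children    # ... or the irregular plural
        |
        child       # ... or the singular
    )
    \b              # word boundary
""", re.VERBOSE)

# Illustrative sentences, not from the original post.
print(CHILD_PHRASE.findall('me and my son were'))        # ['my son']
print(CHILD_PHRASE.findall('our daughters and my kid'))  # ['our daughters', 'my kid']
print(CHILD_PHRASE.findall('the children played'))       # []
```

Because every group is non-capturing, findall returns whole phrases as plain strings; the last sample returns nothing because no my or our precedes "children".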
cs95