Pattern Matching Reg Ex in Python

Question

I am a beginner at regular-expression pattern matching in Python. Please help me solve this problem.

I want to extract some text from a given string. Please see the example below.

String : "keyword//match1/match2/more_text_with_/_more_and_more_/_texts"

I need to extract "match1" and "match2"

I wrote the following Python code to do that:

import re
astr = 'keyword//match1/match2/more_text_with_/_more_and_more_/_texts'
match = re.search('keyword//(.*)/(.*)/.*', astr)

print("match1 : ", match.group(1))
print("match2 : ", match.group(2))

The result is...

match1 :  match1/match2/more_text_with_
match2 :  _more_and_more_

I read about "How Regex Engine Works" here: https://www.regular-expressions.info/engine.html

I understand why I get this result, but I don't know how to write a regular expression that extracts only the text I need.

Please help me with this.

Thank you very much,

You are an awesome guy, brother. It's working. Thank you very much for your quick help. Please add this as an answer so I can give you an upvote. :) - Dilanka Rathnayake, Sep 12 '19 at 21:19
There are duplicates of this question. This page https://stackoverflow.com/questions/22444/my-regex-is-matching-too-much-how-do-i-make-it-stop and this page https://stackoverflow.com/questions/7014903/my-regular-expression-matches-too-much-how-can-i-tell-it-to-match-the-smallest explain it in detail. - The fourth bird, Sep 12 '19 at 21:26

score 0 · Answer 1 · answered Sep 12 '19 at 21:30

.* is greedy and will match as many characters as possible. Instead, you could use .*?, which matches as few characters as possible.

import re

astr = 'keyword//match1/match2/more_text_with_/_more_and_more_/_texts'
match = re.search(r'keyword//(.*?)/(.*?)/.*?', astr)

print("match1 : ", match.group(1))
print("match2 : ", match.group(2))
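
To make the difference concrete, here is a minimal sketch that runs the greedy pattern from the question next to a lazy one on the same string. The variable names greedy and lazy are only for illustration, and the lazy pattern drops the trailing .*? because it matches nothing at the end of a pattern.

```python
import re

astr = 'keyword//match1/match2/more_text_with_/_more_and_more_/_texts'

# Greedy: each (.*) first grabs as much as it can, then backtracks only
# far enough to leave the two '/' characters the rest of the pattern needs.
greedy = re.search(r'keyword//(.*)/(.*)/.*', astr)

# Lazy: each (.*?) stops at the first '/' that lets the pattern continue.
lazy = re.search(r'keyword//(.*?)/(.*?)/', astr)

print(greedy.groups())  # ('match1/match2/more_text_with_', '_more_and_more_')
print(lazy.groups())    # ('match1', 'match2')
```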

Booboo · Answer 2 · 2019-09-12T21:48:40.007

Another way, without using the non-greedy .*? (see the answer posted by @NegativeChameleon):

match = re.search(r'keyword//([^\/]*)/([^\/]*)/', astr)

[^\/]* matches zero or more characters that are not '/', so it can be as greedy as it wants without running past the next slash. (In Python the backslash before / is optional; [^/]* is equivalent.)

import re

astr = 'keyword//match1/match2/more_text_with_/_more_and_more_/_texts'
match = re.search(r'keyword//([^\/]*)/([^\/]*)/', astr)

print("match1:", match.group(1))
print("match2:", match.group(2))
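
One thing the snippets here do not cover is input that does not match at all: re.search then returns None, and calling .group() on it raises AttributeError. A possible way to guard against that is sketched below. The names PATTERN and first_two_segments are hypothetical, chosen only for this example, and returning None for a non-match is an assumption about what the caller wants.

```python
import re

# Hypothetical names, used for illustration only.
# [^/]* is the same negated character class as above, without the optional backslash.
PATTERN = re.compile(r'keyword//([^/]*)/([^/]*)/')

def first_two_segments(text):
    """Return the two segments after 'keyword//' as a tuple, or None if there is no match."""
    m = PATTERN.search(text)
    return m.groups() if m else None

print(first_two_segments('keyword//match1/match2/more_text_with_/_more_and_more_/_texts'))
print(first_two_segments('no keyword here'))
```

With the sample string this prints ('match1', 'match2'), and for the second call it prints None instead of raising an exception.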

score 0 · Answer 3 · answered Sep 12 '19 at 22:45

Look at this link.

import re
astr = 'keyword//match1/match2/more_text_with_/_more_and_more_/_texts'
match = re.search('keyword//([^/]*)/([^/]*)/.*/', astr)

print("match1 : ", match.group(1))
print("match2 : ", match.group(2))

output (as printed by Python 2; under Python 3 the same code prints match1 :  match1 and match2 :  match2):

('match1 : ', 'match1')
('match2 : ', 'match2')

Based on the link.

score 0 · Answer 4 · answered Sep 12 '19 at 23:29

re.findall('(.+?)/',s.replace('keyword//',''))[0:2]

would also work

examples:

s = 'keyword//match1/match2/more_text_with_/_more_and_more_/_texts'

output:

['match1', 'match2']

s = 'keyword//ma13*41$?tch1/mad4#$(#01tch2/more_text_with_/_more_and_more_/_texts'

output:

['ma13*41$?tch1', 'mad4#$(#01tch2']
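
For completeness, the one-liner and the two examples above can be put together into a runnable sketch. The wrapper name first_two is hypothetical. Note that str.replace removes every occurrence of 'keyword//' and does not check that one is present, so this version assumes the input always starts with that prefix.

```python
import re

def first_two(s):
    # Hypothetical wrapper around the one-liner above.
    # Strip the prefix; this removes every 'keyword//' and does not verify it exists.
    remainder = s.replace('keyword//', '')
    # (.+?)/ lazily captures one or more characters up to the next '/'.
    return re.findall(r'(.+?)/', remainder)[0:2]

s1 = 'keyword//match1/match2/more_text_with_/_more_and_more_/_texts'
s2 = 'keyword//ma13*41$?tch1/mad4#$(#01tch2/more_text_with_/_more_and_more_/_texts'

print(first_two(s1))  # ['match1', 'match2']
print(first_two(s2))  # ['ma13*41$?tch1', 'mad4#$(#01tch2']
```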
