I thought a quantifier would become non-greedy if I put "?" after it, so I expected the following code to return "ax: yz_bc":

import re
data = 'aaaaaax: yz_bcd'
p = re.compile('a.*?bc')

ret = p.search(data)
if ret is not None:
  print(f'{ret.group(0)}')
else:
  print('Not matched')

However, the string actually returned was "aaaaaax: yz_bc", the same string the code returns when the "?" is not added. How can this be explained? Is it because the match must start at the leftmost possible position?

And what change should I make to my code to get "ax: yz_bc" as the returned string?

  • This is a common duplicate. The simplest solution by far is to say what you mean: "don't match any a:s" is `[^a]*` – tripleee May 20 '22 at 11:59
  • The main point is to exclude the left delimiter from the `.*?` pattern, and in this case, it may be `[^a]*?`, see [the regex demo](https://regex101.com/r/YiAKir/2). – Wiktor Stribiżew May 20 '22 at 12:08
  • without `?`: `'.*(a.*bc)'` and get `group(1)` instead of `group(0)` – furas May 20 '22 at 14:19
  • To tripleee, Wiktor Stribiżew and furas: Thank you all. I think the solution `[^a]*?` is close to what I intended to do. I will adopt this solution. – thomas_chang May 20 '22 at 15:13
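
The suggestions in the comments can be compared side by side. This is a minimal sketch using the `data` string from the question; the variable names `lazy`, `negated` and `greedy_group` are illustrative and not from the original post. It shows that a lazy quantifier only shortens the right end of a match, while the regex engine still reports the leftmost position at which a match can begin.

```python
import re

data = 'aaaaaax: yz_bcd'

# Lazy quantifier: the search still begins at the leftmost 'a' that can
# lead to a match; '?' only makes '.*' stop at the first 'bc' it reaches.
lazy = re.search(r'a.*?bc', data).group(0)

# Negated class: nothing after the starting 'a' may be another 'a',
# so only the last 'a' before 'bc' can begin a successful match.
negated = re.search(r'a[^a]*?bc', data).group(0)

# Greedy prefix plus capture group: the leading '.*' consumes as much as
# possible and backtracks only until the group can match.
greedy_group = re.search(r'.*(a.*bc)', data).group(1)

print(lazy)          # aaaaaax: yz_bc
print(negated)       # ax: yz_bc
print(greedy_group)  # ax: yz_bc
```

Note that the negated-class form relies on the left delimiter being a single character; for a multi-character delimiter the capture-group form is the easier one to adapt.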
