I want to capture the words after the me_start: pattern with the regular expression me_start: ([\s\S]{1,})(?:end_1|end_2|end3), stopping as soon as any of the words in the OR list is found. Instead, it captures the longest possible group.

Example:

In the sentence me_start: cat end_1 dog end_2 I want to capture the word cat, but the pattern captures cat end_1 dog instead. How can I make it behave the way I need?

milanbalazs
BTW: `\S=[^\s]` so `[\s\S]` matches *everything* like `.` with `re.DOTALL`... are you sure you want that? IMHO it's clearer to just use `.` with `re.DOTALL` if you want to say "match everything". – Giacomo Alzetta Nov 06 '19 at 14:04

1 Answer

Change the {1,} greedy quantifier to the non-greedy version {1,}?:

>>> re.match(r'me_start: ([\s\S]{1,}?)(?:end_1|end_2|end3)', 'me_start: cat end_1 dog end_2').groups()
('cat ',)
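
For comparison, here is a minimal sketch that runs the greedy and non-greedy patterns side by side on the string from the question. The variable names are only illustrative, and the third pattern (`.+?` with `re.DOTALL`, along the lines of the comment on the question) is my assumption of an equivalent spelling, not something the asker used:

```python
import re

text = 'me_start: cat end_1 dog end_2'

# Greedy: {1,} takes as much as it can, then backtracks to the LAST terminator.
greedy = re.match(r'me_start: ([\s\S]{1,})(?:end_1|end_2|end3)', text)

# Non-greedy: {1,}? expands one character at a time and stops at the FIRST terminator.
lazy = re.match(r'me_start: ([\s\S]{1,}?)(?:end_1|end_2|end3)', text)

# Assumed equivalent: '.' with re.DOTALL also matches newlines, like [\s\S].
dotall = re.match(r'me_start: (.+?)(?:end_1|end_2|end3)', text, re.DOTALL)

print(repr(greedy.group(1)))  # 'cat end_1 dog '
print(repr(lazy.group(1)))    # 'cat '
print(repr(dotall.group(1)))  # 'cat '

# The captured group keeps the space before the terminator; strip it if unwanted.
print(lazy.group(1).strip())  # cat
```

Note that the captured group still contains the trailing space, so `.strip()` (or a `\s*` before the terminator group) is needed if only the bare word is wanted.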

From the re documentation:

Causes the resulting RE to match from m to n repetitions of the preceding RE, attempting to match as few repetitions as possible. This is the non-greedy version of the previous qualifier. For example, on the 6-character string 'aaaaaa', a{3,5} will match 5 'a' characters, while a{3,5}? will only match 3 characters.
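
The example from the documentation can be checked directly. This is just a small sketch; the variable names are mine:

```python
import re

s = 'aaaaaa'  # the 6-character string from the documentation example

# Greedy {3,5}: matches as many repetitions as allowed (5).
greedy_match = re.match(r'a{3,5}', s).group()

# Non-greedy {3,5}?: matches as few repetitions as allowed (3).
lazy_match = re.match(r'a{3,5}?', s).group()

print(greedy_match)  # aaaaa
print(lazy_match)    # aaa
```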

Giacomo Alzetta