I need to find the first occurrence of a string between two substrings in a large string in Python, and I'm getting some unexpected behavior. Here's an example:

import re
# Named "text" rather than "str" to avoid shadowing the built-in type.
text = 'STUFF start STUFF I CARE ABOUT end STUFF end STUFF end'
regex = re.compile(r'start.*end', re.DOTALL)
stufficareabout = regex.search(text)
print(stufficareabout.group())

I'm expecting to get 'start STUFF I CARE ABOUT end' as a result, but I'm instead getting 'start STUFF I CARE ABOUT end STUFF end STUFF end'. I thought regex.search returns the first match it finds, which to me would mean it stops after the first "end" rather than continuing to the last one.

1 Answer

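You can use re.findall with the non-greedy quantifier ".*?". A plain ".*" is greedy: it matches as much as possible, so the match runs to the last "end". Adding "?" makes it match as little as possible, so it stops at the first "end" (keep re.DOTALL if the text between the markers can span multiple lines):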

import re
a, *_ = re.findall('start.*?end', 'STUFF start STUFF I CARE ABOUT end STUFF end STUFF end')

Output:

'start STUFF I CARE ABOUT end'
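
If you want to stay with search, as in the question, the same lazy quantifier works there too. The following is only a minimal sketch comparing the two behaviors on the question's sample string; the variable names (text, greedy, lazy, inner) are mine, and the capture group is an optional extra for when you want just the text between the markers:

```python
import re

text = 'STUFF start STUFF I CARE ABOUT end STUFF end STUFF end'

# Greedy: .* consumes as much as it can, then backtracks to the LAST "end".
greedy = re.search(r'start.*end', text, re.DOTALL)

# Lazy: .*? expands only as far as needed, so it stops at the FIRST "end".
lazy = re.search(r'start.*?end', text, re.DOTALL)

# Optional: a capture group returns only what lies between the two markers.
inner = re.search(r'start(.*?)end', text, re.DOTALL)

print(greedy.group())          # start STUFF I CARE ABOUT end STUFF end STUFF end
print(lazy.group())            # start STUFF I CARE ABOUT end
print(inner.group(1).strip())  # STUFF I CARE ABOUT
```

Note that search returns None when nothing matches, so check the result before calling group() on real input.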
Ajax1234
  • Umm, just wondering: why did you answer this duplicate question when you could've used your dupe hammer to close it as an exact dupe? – Taku Jun 12 '18 at 02:58
  • @abccd I think any connection between the OP's question and the problem proposed in the prospective duplicate is vague at best. – Ajax1234 Jun 12 '18 at 03:10