The input is as follows:

MY_PROJ10  
1st line  
2nd line  

MY_PROJ11  
3rd line  
4th line  


----------

Using regular expressions, I want to capture:

result [0]  
Group 0 MY_PROJ10  
Group 1  
1st line  
2nd line

result [1]  
Group 0 MY_PROJ11  
Group 1  
3rd line  
4th line

My first attempt is

    regex = r"^(MY_PROJ.+)([\s\S]+)"

with the multiline flag enabled, but this captures all of the lines that follow MY_PROJ10 in a single match.
I am sure there is a way to do this with regular expressions.
I have been trying on regex101.com, but no luck so far.
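
To make the problem concrete, here is a minimal sketch that reproduces it, assuming Python's `re` module with `re.MULTILINE` and the sample input written as one string (the variable names `text` and `greedy` are illustrative only):

```python
import re

# Sample input from above, written as a single string (assumed layout).
text = "MY_PROJ10\n1st line\n2nd line\n\nMY_PROJ11\n3rd line\n4th line\n"

# The greedy [\s\S]+ matches every remaining character, newlines included,
# so the first match runs to the end of the input and swallows MY_PROJ11.
greedy = re.findall(r"^(MY_PROJ.+)([\s\S]+)", text, flags=re.MULTILINE)

print(len(greedy))     # number of matches found
print(greedy[0][0])    # header of the only match
print(greedy[0][1])    # body, which also contains the MY_PROJ11 block
```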

user3483203

2 Answers

You may use (?m)^(MY_PROJ.*)([\s\S]*?)(?=[\n\r]MY_PROJ|\Z):

    In [1]: import re

    In [2]: s = """
       ...: MY_PROJ10
       ...: 1st line
       ...: 2nd line
       ...:
       ...: MY_PROJ11
       ...: 3rd line
       ...: 4th line
       ...: """

    In [3]: re.findall(r'(?m)^(MY_PROJ.*)([\s\S]*?)(?=[\n\r]MY_PROJ|\Z)', s)
    Out[3]:
    [('MY_PROJ10', '\n1st line  \n2nd line  \n'),
     ('MY_PROJ11', '\n3rd line  \n4th line\n')]

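
The captured bodies keep their surrounding newlines and any trailing spaces. If clean lines are wanted, one possible post-processing sketch follows. The helper name `parse_projects` and the `sample` string are hypothetical, introduced here only for illustration:

```python
import re

# Same pattern as above, compiled once.
PATTERN = re.compile(r"(?m)^(MY_PROJ.*)([\s\S]*?)(?=[\n\r]MY_PROJ|\Z)")

def parse_projects(text):
    """Return (header, [body lines]) pairs with whitespace stripped."""
    results = []
    for match in PATTERN.finditer(text):
        header = match.group(1).strip()
        # Drop blank lines and trim trailing spaces from each body line.
        body = [line.strip() for line in match.group(2).splitlines() if line.strip()]
        results.append((header, body))
    return results

# Assumed input: the question's sample, with trailing spaces on some lines.
sample = "MY_PROJ10  \n1st line  \n2nd line  \n\nMY_PROJ11  \n3rd line  \n4th line\n"
print(parse_projects(sample))
```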
user3483203
  • Thing of beauty, and it works, Chrisz. Now how do you learn these kinds of tricky ones? Any advice (other than posting on SO :) )? – hungry4code Apr 22 '18 at 02:21
  • This is a good starting point: https://stackoverflow.com/questions/22937618/reference-what-does-this-regex-mean – user3483203 Apr 22 '18 at 02:24
  • Also, for any regular expressions where you want to match *until* something, positive lookahead is your friend. – user3483203 Apr 22 '18 at 02:25
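
To illustrate the "match until" point from the last comment, here is a small sketch comparing a lookahead with a version that consumes the terminator. It assumes the question's sample input as one string, and the names `text`, `lookahead` and `consuming` are illustrative:

```python
import re

text = "MY_PROJ10\n1st line\n2nd line\n\nMY_PROJ11\n3rd line\n4th line\n"

# Lookahead: the next header is only checked, not consumed,
# so it is still available as the start of the following match.
lookahead = re.findall(r"(?m)^(MY_PROJ.*)([\s\S]*?)(?=[\n\r]MY_PROJ|\Z)", text)

# Non-capturing group instead of a lookahead: the "\nMY_PROJ" of the next
# header is consumed by the first match, so the second block is never found.
consuming = re.findall(r"(?m)^(MY_PROJ.*)([\s\S]*?)(?:[\n\r]MY_PROJ|\Z)", text)

print(len(lookahead))   # both blocks are matched
print(len(consuming))   # only the first block is matched
```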
You can try this:

(?s)(MY_PROJ\d+)[\s]*((?:(?!MY_PROJ\d+).)+)
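
As a rough sketch of how this pattern behaves in Python (assuming the `re` module and the question's sample as one string; the variable names are illustrative), each character of the body is accepted only if a new `MY_PROJ<digits>` header does not start at that position:

```python
import re

text = "MY_PROJ10\n1st line\n2nd line\n\nMY_PROJ11\n3rd line\n4th line\n"

# (?s) lets "." match newlines; the negative lookahead (?!MY_PROJ\d+)
# stops the body right before the next header.
pattern = r"(?s)(MY_PROJ\d+)[\s]*((?:(?!MY_PROJ\d+).)+)"

matches = re.findall(pattern, text)
for header, body in matches:
    # The body keeps its trailing newlines, so strip it for display.
    print(header, repr(body.strip()))
```

Unlike the first answer, this pattern is not anchored to the start of a line, so it would also match `MY_PROJ` followed by digits in the middle of a line.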

Thm Lee